Design & Architecture
This page describes jamb’s internal architecture and the reasoning behind key design choices.
Package Overview
jamb is organized into focused packages, each with a single responsibility:
| Package | Role |
|---|---|
| core | Domain models |
| storage | Filesystem I/O: discovery, YAML reading/writing, graph building, validation |
| pytest_plugin | pytest hooks, marker extraction, test-outcome recording |
| matrix | Traceability-matrix generation in multiple formats |
| publish | Human-readable document rendering (HTML, Markdown, DOCX) |
| cli | Click-based command-line interface |
Shared modules at the package root include config (configuration loading from pyproject.toml) and yaml_io (YAML utilities).
graph TD
CLI[cli] --> Storage[storage]
CLI --> Matrix[matrix]
CLI --> Publish[publish]
Plugin[pytest_plugin] --> Storage
Plugin --> Matrix
Plugin --> Core[core]
Matrix --> Core
Publish --> Core
Storage --> Core
Storage --> Config[config]
Storage --> YamlIO[yaml_io]
Design rationale. Domain models in core/ have zero I/O dependencies, making them easy to test in isolation. All filesystem interaction is confined to storage/, so the rest of the codebase works with pure in-memory data structures.
Domain Models
The core domain lives in core/models.py.
Item represents a single requirement, informational note, or heading. Its key fields are uid, text, document_prefix, active, type, links, header, reviewed, derived, and testable.
LinkedTest models a test-to-requirement link. It records test_nodeid, item_uid, and test_outcome, plus optional notes, test_actions, expected_results, and actual_results captured via the jamb_log fixture.
ItemCoverage pairs an Item with its LinkedTest list and exposes is_covered / all_tests_passed properties.
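As a rough illustration of how the coverage properties might behave, here is a minimal sketch. The simplified Item and LinkedTest classes, the field names item and linked_tests, and the exact semantics of both properties are assumptions made for this example; the real classes in core/models.py carry more fields and may define coverage differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """Simplified stand-in for the real Item (only two of its fields)."""
    uid: str
    text: str


@dataclass
class LinkedTest:
    """Simplified stand-in for the real LinkedTest."""
    test_nodeid: str
    item_uid: str
    test_outcome: Optional[str] = None  # e.g. "passed", "failed", "skipped"


@dataclass
class ItemCoverage:
    """Pairs an item with the tests linked to it (field names are assumed)."""
    item: Item
    linked_tests: List[LinkedTest] = field(default_factory=list)

    @property
    def is_covered(self) -> bool:
        # Assumption: an item counts as covered once any test links to it.
        return bool(self.linked_tests)

    @property
    def all_tests_passed(self) -> bool:
        # Assumption: an item with no linked tests is not considered passed.
        return self.is_covered and all(
            t.test_outcome == "passed" for t in self.linked_tests
        )


coverage = ItemCoverage(
    item=Item(uid="SRS001", text="The system shall log every login."),
    linked_tests=[LinkedTest("tests/test_auth.py::test_login", "SRS001", "passed")],
)
```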
TraceabilityGraph is the complete bidirectional graph. It stores items by UID and maintains two adjacency lists (item_parents, item_children) plus a document-level DAG (document_parents). It provides BFS-based get_ancestors() and get_descendants() traversals.
TraceabilityGraph uses adjacency lists rather than a matrix because the graph is sparse – most items link to only one or two parents.
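The following sketch shows how two adjacency lists can support BFS traversal in both directions. MiniGraph, add_link, and the private _bfs helper are hypothetical names for this example; only item_parents, item_children, get_ancestors(), and get_descendants() come from the description above, and their real signatures may differ.

```python
from collections import defaultdict, deque


class MiniGraph:
    """Hypothetical stand-in for TraceabilityGraph (illustration only)."""

    def __init__(self):
        # Two adjacency lists keyed by item UID.
        self.item_parents = defaultdict(list)
        self.item_children = defaultdict(list)

    def add_link(self, child_uid, parent_uid):
        """Record one child -> parent link in both directions."""
        self.item_parents[child_uid].append(parent_uid)
        self.item_children[parent_uid].append(child_uid)

    def _bfs(self, start, adjacency):
        """Breadth-first walk over one adjacency list, excluding the start UID."""
        seen = set()
        order = []
        queue = deque([start])
        while queue:
            uid = queue.popleft()
            for neighbour in adjacency.get(uid, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    order.append(neighbour)
                    queue.append(neighbour)
        return order

    def get_ancestors(self, uid):
        return self._bfs(uid, self.item_parents)

    def get_descendants(self, uid):
        return self._bfs(uid, self.item_children)


graph = MiniGraph()
graph.add_link("SRS001", "SYS001")
graph.add_link("SRS001", "RC001")
graph.add_link("SYS001", "UN001")
print(graph.get_ancestors("SRS001"))    # ['SYS001', 'RC001', 'UN001']
print(graph.get_descendants("UN001"))   # ['SYS001', 'SRS001']
```

Because each item stores only its own one or two links, memory grows with the number of links rather than with the square of the number of items.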
The Document DAG
Documents form a directed acyclic graph (DAG), not a simple tree, because a child document can trace to multiple parents. For example, the Software Requirements Specification (SRS) traces to both System Requirements (SYS) and Risk Controls (RC).
The DocumentDAG class (storage/document_dag.py) manages document relationships and provides two critical operations:
Topological sort (Kahn’s algorithm) – Orders documents so that parents are always loaded before children. This ensures that when an item’s links are resolved, the target items already exist in memory.
Cycle detection – After Kahn’s algorithm finishes, any documents it never emitted belong to a cycle or descend from one. The validator reports the specific prefixes involved.
graph TD
PRJ[PRJ<br/>Project Plan] --> UN[UN<br/>User Needs]
PRJ --> HAZ[HAZ<br/>Hazard Analysis]
UN --> SYS[SYS<br/>System Requirements]
SYS --> SRS[SRS<br/>Software Requirements]
HAZ --> RC[RC<br/>Risk Controls]
RC --> SRS
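The two operations can be sketched together, since Kahn's algorithm yields cycle detection as a by-product. This is a minimal illustration, not the DocumentDAG implementation: the function name topological_order and the plain dict input (prefix mapped to parent prefixes) are assumptions.

```python
from collections import deque


def topological_order(parents):
    """Return (ordered_prefixes, leftover_prefixes) using Kahn's algorithm.

    `parents` maps each document prefix to the prefixes of its parent documents.
    Leftover prefixes never reached in-degree zero, so they sit on a cycle
    or depend on a document that does.
    """
    nodes = set(parents)
    for parent_list in parents.values():
        nodes.update(parent_list)

    indegree = {n: 0 for n in nodes}      # number of parents not yet emitted
    children = {n: [] for n in nodes}
    for child, parent_list in parents.items():
        for parent in parent_list:
            indegree[child] += 1
            children[parent].append(child)

    # Start from documents with no parents; sorting keeps the output stable.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(children[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    leftover = sorted(nodes - set(order))
    return order, leftover


docs = {
    "PRJ": [],
    "UN": ["PRJ"],
    "HAZ": ["PRJ"],
    "SYS": ["UN"],
    "RC": ["HAZ"],
    "SRS": ["SYS", "RC"],
}
order, leftover = topological_order(docs)
print(order)     # ['PRJ', 'HAZ', 'UN', 'RC', 'SYS', 'SRS']
print(leftover)  # []
```

With the hierarchy from the diagram, SRS is emitted only after both SYS and RC, which is the ordering guarantee the loader relies on.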
Data Flow
The discovery-to-graph pipeline runs in three phases:
Discovery (storage/discovery.py) – Walks the filesystem from the project root, finding all .jamb.yml config files via Path.rglob(). Each config defines a document prefix and its parent documents. The result is a DocumentDAG.
Graph building (storage/graph_builder.py) – Iterates documents in topological order, reading YAML item files via storage/items.py. Each file becomes an Item added to the TraceabilityGraph.
Item reading (storage/items.py) – Parses individual YAML files, normalizing two link formats (plain - UID and hashed - UID: hash_value) into a consistent structure.
flowchart LR
FS["Filesystem<br/>.jamb.yml files"] --> Disc["discovery.py<br/>discover_documents()"]
Disc --> DAG["DocumentDAG"]
DAG --> GB["graph_builder.py<br/>build_traceability_graph()"]
GB --> Items["items.py<br/>read_document_items()"]
Items --> TG["TraceabilityGraph"]
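The link normalization in the item-reading phase could look roughly like the sketch below. It works on an already-parsed links list, where a plain entry is a string and a hashed entry is a one-key mapping. The function name normalize_links and the returned (uids, link_hashes) pair are assumptions for illustration; the actual structure produced by storage/items.py may differ.

```python
def normalize_links(raw_links):
    """Turn a parsed YAML `links` list into (uids, link_hashes).

    Plain form:   - SYS001           -> parsed as the string "SYS001"
    Hashed form:  - SYS001: abc123   -> parsed as the mapping {"SYS001": "abc123"}
    """
    uids = []
    link_hashes = {}
    for entry in raw_links or []:
        if isinstance(entry, dict):
            # Hashed form: remember the stored hash alongside the UID.
            for uid, hash_value in entry.items():
                uids.append(str(uid))
                if hash_value is not None:
                    link_hashes[str(uid)] = str(hash_value)
        else:
            # Plain form: a bare UID with no stored hash.
            uids.append(str(entry))
    return uids, link_hashes


uids, link_hashes = normalize_links(["SYS001", {"SYS002": "abc123"}])
print(uids)         # ['SYS001', 'SYS002']
print(link_hashes)  # {'SYS002': 'abc123'}
```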
pytest Integration Lifecycle
The pytest plugin (pytest_plugin/plugin.py and pytest_plugin/collector.py) integrates with pytest through a chain of hooks:
sequenceDiagram
participant U as User
participant P as pytest
participant Pl as plugin.py
participant C as RequirementCollector
participant M as markers.py
participant G as TraceabilityGraph
participant Mx as matrix/generator.py
U->>P: pytest --jamb
P->>Pl: pytest_addoption()
P->>Pl: pytest_configure()
Pl->>C: create collector
C->>G: discover & build graph
P->>C: pytest_collection_modifyitems()
C->>M: get_requirement_markers()
M-->>C: list of UIDs
C->>C: create LinkedTest entries
loop each test
P->>C: pytest_runtest_makereport()
C->>C: record outcome, notes, actions
end
P->>Pl: pytest_sessionfinish()
Pl->>Mx: generate_matrix() (if --jamb-trace-matrix or --jamb-test-matrix)
P->>Pl: pytest_terminal_summary()
Pl->>U: coverage report
Key hooks in order:
pytest_addoption – Registers --jamb, --jamb-test-matrix, --jamb-trace-matrix, --jamb-fail-uncovered, --jamb-documents, --jamb-tester-id, --jamb-software-version, --trace-from, and --include-ancestors.
pytest_configure – Creates a RequirementCollector that discovers documents and builds the graph.
pytest_collection_modifyitems – Scans collected tests for @pytest.mark.requirement markers and creates LinkedTest entries.
pytest_runtest_makereport – After each test’s call phase, records the outcome and any data from jamb_log.
pytest_sessionfinish – Generates the traceability matrix if requested; sets the exit code if uncovered items exist and --jamb-fail-uncovered is set.
pytest_terminal_summary – Prints coverage statistics, uncovered items, and unknown UIDs.
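To make the first and third hooks concrete, here is a minimal sketch of a plugin that registers one flag and extracts requirement UIDs from markers. It is not jamb's plugin code: the helper get_requirement_uids, the module-level LINKS dict, and the assumption that UIDs are passed as positional marker arguments (for example @pytest.mark.requirement("SRS001")) are all illustrative. The hook names and the parser.getgroup, config.getoption, and item.iter_markers calls are standard pytest API.

```python
# Hypothetical storage for the sketch: test node ID -> requirement UIDs.
LINKS = {}


def pytest_addoption(parser):
    """Register a command-line flag (only --jamb is shown here)."""
    group = parser.getgroup("jamb")
    group.addoption(
        "--jamb",
        action="store_true",
        default=False,
        help="Enable requirement traceability",
    )


def get_requirement_uids(item):
    """Collect UIDs from every `requirement` marker attached to a test item.

    Assumption: UIDs are given as positional marker arguments.
    """
    uids = []
    for marker in item.iter_markers(name="requirement"):
        uids.extend(str(arg) for arg in marker.args)
    return uids


def pytest_collection_modifyitems(config, items):
    """Record which collected tests claim to cover which requirements."""
    if not config.getoption("--jamb"):
        return
    for item in items:
        uids = get_requirement_uids(item)
        if uids:
            LINKS[item.nodeid] = uids
```

Placed in a conftest.py, these hooks would run only when pytest is invoked with --jamb, mirroring the opt-in behaviour shown in the sequence diagram.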
Suspect Link Detection
When an item’s content changes, any items that link to it may become outdated. jamb detects this through content hashing.
Hash computation (storage/items.py: compute_content_hash()): the function concatenates text, header, sorted links, and type with | as delimiter, computes a SHA-256 digest, and encodes it as URL-safe base64 with padding stripped.
Why hashing over timestamps? Hashes are deterministic across git clones – every developer and CI runner computes the same hash for the same content. Timestamps would break on fresh clones or rebases.
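The hashing recipe described above can be sketched as follows. The function name content_hash, the field order, the comma used to join sorted links, and the handling of a missing header are assumptions; compute_content_hash() in storage/items.py may serialize the fields differently, so this sketch will not necessarily reproduce jamb's stored hashes.

```python
import base64
import hashlib


def content_hash(text, header, links, item_type):
    """Hash an item's content: SHA-256, URL-safe base64, padding stripped."""
    parts = [
        text,
        header or "",              # assumption: a missing header hashes as ""
        ",".join(sorted(links)),   # sorting makes link order irrelevant
        item_type,
    ]
    payload = "|".join(parts).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


h1 = content_hash("The system shall log every login.", "", ["SYS002", "SYS001"], "requirement")
h2 = content_hash("The system shall log every login.", "", ["SYS001", "SYS002"], "requirement")
h3 = content_hash("The system shall log every logout.", "", ["SYS001", "SYS002"], "requirement")
print(h1 == h2)  # True: same content, different link order
print(h1 == h3)  # False: the text changed
```

A SHA-256 digest is 32 bytes, so the stripped base64 string is always 43 characters long.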
Detection flow (storage/validation.py: _check_suspect_links()):
flowchart TD
Start["validate()"] --> Load["Load all active items"]
Load --> Loop{"For each item<br/>with stored link hashes"}
Loop --> Read["Read raw YAML to<br/>get link_hashes dict"]
Read --> Hash["Compute current hash<br/>of linked item"]
Hash --> Compare{"Stored hash ==<br/>current hash?"}
Compare -- Yes --> Next["Link is clean"]
Compare -- No --> Flag["Flag as suspect link<br/>(warning)"]
Next --> Loop
Flag --> Loop
Loop -- "No stored hash" --> Warn["Warn: link not verified"]
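The comparison step in this flow reduces to a small classification per link. The sketch below assumes the stored hashes are available as a link_hashes dict and the current hashes as a second dict; the function name classify_links and its return shape are hypothetical and do not reflect the signature of _check_suspect_links().

```python
def classify_links(links, link_hashes, current_hashes):
    """Split an item's links into (suspect, unverified) UID lists.

    links:          UIDs this item links to
    link_hashes:    UID -> hash stored when the link was last reviewed
    current_hashes: UID -> hash of the linked item's current content
    """
    suspect = []
    unverified = []
    for uid in links:
        stored = link_hashes.get(uid)
        if stored is None:
            # No stored hash: the link was never verified.
            unverified.append(uid)
        elif stored != current_hashes.get(uid):
            # The linked item changed since the hash was stored.
            suspect.append(uid)
    return suspect, unverified


suspect, unverified = classify_links(
    links=["SYS001", "SYS002", "RC001"],
    link_hashes={"SYS001": "h1", "SYS002": "old"},
    current_hashes={"SYS001": "h1", "SYS002": "new", "RC001": "h3"},
)
print(suspect)     # ['SYS002']
print(unverified)  # ['RC001']
```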
Matrix & Publishing
The matrix generator (matrix/generator.py) dispatches to format-specific renderers based on a simple if/elif chain. Five formats are supported: HTML, Markdown, JSON, CSV, and XLSX.
Format modules are imported lazily inside the dispatch function. This avoids requiring optional dependencies (like openpyxl for XLSX) when they aren’t used.
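The lazy-import pattern can be illustrated with a small dispatcher. This is a sketch under assumptions: render_matrix and its row format are hypothetical, only three formats are shown, and the standard-library json and csv modules stand in for jamb's own format modules. The XLSX branch only demonstrates where an optional dependency such as openpyxl would be imported.

```python
import io


def render_matrix(rows, fmt):
    """Dispatch to a renderer, importing each format's module on demand."""
    if fmt == "json":
        import json  # loaded only when JSON output is requested
        return json.dumps(rows, indent=2)
    elif fmt == "csv":
        import csv
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0]) if rows else [])
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    elif fmt == "xlsx":
        try:
            import openpyxl  # noqa: F401  optional dependency, checked only here
        except ImportError as exc:
            raise RuntimeError("XLSX output requires openpyxl") from exc
        raise NotImplementedError("XLSX rendering is omitted from this sketch")
    else:
        raise ValueError(f"Unsupported format: {fmt}")


rows = [{"uid": "SRS001", "status": "passed"}]
print(render_matrix(rows, "csv"))
```

A user who never asks for XLSX never triggers the openpyxl import, so a missing optional package cannot break the other formats.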
The publish package renders human-readable requirement documents in HTML, Markdown, and DOCX formats, with internal hyperlinks between items and document-order grouping. HTML and DOCX renderers live in publish/formats/; Markdown rendering is handled in the CLI layer.
Validation Architecture
The validate() function (storage/validation.py) is the single entry point for all validation. It runs nine independent checks across three categories. Eight are controlled by keyword flags (all defaulting to True); DAG acyclicity always runs:
Structural checks:
DAG acyclicity – Detects cycles in the document hierarchy.
Item link cycles – DFS with three-color marking (white/gray/black) to find cycles in item-to-item links.
Link validity and conformance – A single _check_links function catches self-links, links to non-existent items, links to inactive items, and non-normative items that have links, and verifies that each link points to an item in a valid parent document per the DAG.
Content checks:
Suspect links – Content-hash comparison (see above).
Review status – Checks that normative items have a reviewed hash matching current content.
Empty text – Flags items with blank or whitespace-only text.
Completeness checks:
Child links – Non-leaf-document items should have children linking to them.
Unlinked items – Child-document items (non-derived) should link to a parent document.
Empty documents – Flags documents containing no items.
Each check returns a list of ValidationIssue objects with a level (error, warning, or info), the relevant uid or prefix, and a descriptive message.
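The three-color DFS used for the item link cycle check can be sketched as below. White nodes are unvisited, gray nodes are on the current DFS path, and black nodes are fully explored; reaching a gray node means the path has looped back on itself. The function name find_link_cycles and the dict input (UID mapped to linked UIDs) are assumptions, and the recursive form is chosen for brevity rather than taken from storage/validation.py.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def find_link_cycles(links):
    """Return each cycle found as a list of UIDs, closed on its first element.

    `links` maps an item UID to the UIDs it links to.
    """
    color = {uid: WHITE for uid in links}
    path = []      # UIDs on the current DFS path, in visiting order
    cycles = []

    def visit(uid):
        color[uid] = GRAY
        path.append(uid)
        for target in links.get(uid, []):
            state = color.get(target, WHITE)
            if state == GRAY:
                # Back edge: target is already on the current path.
                cycles.append(path[path.index(target):] + [target])
            elif state == WHITE:
                visit(target)
        path.pop()
        color[uid] = BLACK

    for uid in links:
        if color[uid] == WHITE:
            visit(uid)
    return cycles


print(find_link_cycles({"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}))
# [['A', 'B', 'C', 'A']]
```

Item D links into the cycle but is not part of it, so it is not reported; a black node is never revisited, which keeps the walk linear in the number of items and links.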
Design Decisions Summary
| Decision | Alternative | Rationale |
|---|---|---|
| DAG over tree | Tree hierarchy | Multi-parent tracing (e.g., software requirements linked to both system requirements and risk controls) |
| Content hashing (SHA-256) | Timestamps | Deterministic across git clones; no breakage on fresh checkout or rebase |
| YAML per item | Single file per document | Better git diffs, fewer merge conflicts when multiple authors edit the same document |
| In-memory graph | Database | Data fits in memory; no need for a query language or external process |
| pytest hooks | Custom test runner | Leverages the existing pytest ecosystem, markers, and reporting |
| Lazy format imports | Plugin registry | Simple and sufficient for a small, stable set of output formats |