
📊 Question Bank Import Placeholder

STATUS: Awaiting question bank data dump

Expected Format

The question bank import will support multiple formats:

Option 1: Markdown with Metadata

## Question: Q001

**Text**: Does your organization enforce multi-factor authentication for all
administrative access?
**Category**: Access Control
**Frameworks**: Essential Eight, ISO27001
**Answer Type**: Boolean
**Required Evidence**: MFA enrollment report

---

## Question: Q002

**Text**: How often are security patches applied to critical systems?
...
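A minimal parsing sketch for this layout, assuming each question starts with a `## Question: <id>` heading, each field sits on a `**Label**: value` line, and a value may wrap onto following lines. The function name `parse_markdown_questions` and the snake_case keys derived from the labels (for example `answer_type`) are illustrative choices, not an agreed schema.

```python
import re

SAMPLE = """\
## Question: Q001

**Text**: Does your organization enforce multi-factor authentication for all
administrative access?
**Category**: Access Control
**Frameworks**: Essential Eight, ISO27001
**Answer Type**: Boolean
**Required Evidence**: MFA enrollment report

---

## Question: Q002

**Text**: How often are security patches applied to critical systems?
"""

HEADING = re.compile(r"^## Question:\s*(\S+)\s*$")
FIELD = re.compile(r"^\*\*(.+?)\*\*:\s*(.*)$")


def parse_markdown_questions(text):
    """Return one dict per '## Question:' block (hypothetical helper)."""
    questions = []
    current = None
    last_key = None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = {"id": heading.group(1)}
            questions.append(current)
            last_key = None
            continue
        if current is None:
            continue  # ignore anything before the first heading
        stripped = line.strip()
        if not stripped or stripped == "---":
            last_key = None  # blank lines and separators end a field
            continue
        field = FIELD.match(line)
        if field:
            # "Answer Type" -> "answer_type" (assumed key convention)
            last_key = field.group(1).strip().lower().replace(" ", "_")
            current[last_key] = field.group(2).strip()
        elif last_key is not None:
            # Continuation of a wrapped field value
            current[last_key] += " " + stripped
    for question in questions:
        if "frameworks" in question:
            question["frameworks"] = [
                name.strip()
                for name in question["frameworks"].split(",")
                if name.strip()
            ]
    return questions


parsed = parse_markdown_questions(SAMPLE)
```

Here `parsed` holds two records; the first carries the joined question text and a two-item `frameworks` list.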

Option 2: CSV Format

id,question_text,category,frameworks,answer_type,evidence_required,source,persona
Q001,"Does your organization enforce MFA?","Access Control","E8,ISO27001",boolean,true,"Allianz-2024","IT Manager"
Q002,"How often are patches applied?","Patch Management","E8",frequency,true,"ACSC","IT Manager"
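A short loading sketch using Python's standard `csv` module, assuming one record per physical line and the eight header columns shown above. Splitting `frameworks` on commas and converting `evidence_required` to a boolean are assumptions about how the import might normalize values; `load_csv_questions` is a hypothetical name.

```python
import csv
import io

SAMPLE_CSV = (
    "id,question_text,category,frameworks,answer_type,"
    "evidence_required,source,persona\n"
    'Q001,"Does your organization enforce MFA?","Access Control",'
    '"E8,ISO27001",boolean,true,"Allianz-2024","IT Manager"\n'
    'Q002,"How often are patches applied?","Patch Management",'
    '"E8",frequency,true,"ACSC","IT Manager"\n'
)


def load_csv_questions(stream):
    """Read question rows from a text stream (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(stream):
        # The quoted "E8,ISO27001" cell becomes a list of framework names.
        row["frameworks"] = [
            name.strip() for name in row["frameworks"].split(",") if name.strip()
        ]
        # Assumed convention: the literal string "true" means evidence is needed.
        row["evidence_required"] = row["evidence_required"].strip().lower() == "true"
        rows.append(row)
    return rows


questions = load_csv_questions(io.StringIO(SAMPLE_CSV))
```

For a real file, the same function could be called with `open(path, newline="", encoding="utf-8")`, which is the mode the `csv` module documentation recommends.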

Expected Metadata Fields

| Field | Description | Example |
|---|---|---|
| id | Unique identifier | Q001, ALZ-2024-3.2 |
| question_text | The actual question | "Do you have MFA?" |
| category | Domain category | Access Control |
| frameworks | Related compliance | E8, ISO27001, NIST |
| answer_type | Expected response | boolean, text, number |
| evidence_required | Needs proof | true/false |
| source | Where it came from | Allianz-2024, ACSC |
| persona | Who answers | IT Manager, Executive |
| risk_patterns | Related risks | no_mfa_admin |
| premium_impact | Insurance relevance | high/medium/low |
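One way the "Parse & Validate" step could check a record against these fields is sketched below. Which fields are mandatory is not specified here, so treating only `id` and `question_text` as required is an assumption, as is the helper name `validate_record`. `answer_type` is deliberately left open-ended because the CSV sample uses `frequency`, which is not among the table's examples.

```python
KNOWN_FIELDS = (
    "id", "question_text", "category", "frameworks", "answer_type",
    "evidence_required", "source", "persona", "risk_patterns", "premium_impact",
)
REQUIRED_FIELDS = ("id", "question_text")  # assumption, not a confirmed rule
PREMIUM_IMPACT_LEVELS = {"high", "medium", "low"}


def validate_record(record):
    """Return a list of human-readable problems; empty means the record passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            problems.append(f"missing required field: {name}")
    for name in record:
        if name not in KNOWN_FIELDS:
            problems.append(f"unknown field: {name}")
    impact = record.get("premium_impact")
    if impact is not None and impact not in PREMIUM_IMPACT_LEVELS:
        problems.append(f"premium_impact must be high, medium or low, got {impact!r}")
    evidence = record.get("evidence_required")
    if evidence is not None and not isinstance(evidence, bool):
        problems.append("evidence_required must be a boolean")
    return problems


ok = validate_record(
    {"id": "Q001", "question_text": "Do you have MFA?", "premium_impact": "high"}
)
bad = validate_record({"id": "Q002", "premium_impact": "severe"})
```

Returning a list of problems instead of raising on the first one lets an import run report every issue in a dump at once.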

Import Process

graph LR
    A[Question Dump] --> B[Parse & Validate]
    B --> C[Deduplicate]
    C --> D[Enrich Metadata]
    D --> E[Unified Question Bank]
    E --> F[Tag Relationships]

Deduplication Strategy

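Because questions will come from multiple sources, the same question is likely to appear more than once in different wording. The planned approach:

  1. Fuzzy matching on question text
  2. Identify variations of the same question
  3. Create a canonical question with aliases
  4. Preserve source tracking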
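The four steps above can be sketched with the standard library's `difflib.SequenceMatcher`. This is only one possible fuzzy-matching technique; the 0.85 threshold, the normalization rule, the "first seen becomes canonical" policy, and the sample records (including the `ACSC-12` id) are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and drop punctuation so trivial differences do not matter."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized question texts."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def deduplicate(questions, threshold=0.85):
    """Group near-identical questions under the first one seen."""
    canonical = []
    for q in questions:
        match = next(
            (c for c in canonical
             if similarity(c["question_text"], q["question_text"]) >= threshold),
            None,
        )
        if match is None:
            canonical.append({
                "id": q["id"],
                "question_text": q["question_text"],
                "aliases": [],
                "sources": [q["source"]],
            })
            continue
        # Keep the variant wording as an alias and remember where it came from.
        text = q["question_text"]
        if text != match["question_text"] and text not in match["aliases"]:
            match["aliases"].append(text)
        if q["source"] not in match["sources"]:
            match["sources"].append(q["source"])
    return canonical


incoming = [
    {"id": "Q001", "question_text": "Does your organization enforce MFA?",
     "source": "Allianz-2024"},
    {"id": "ACSC-12", "question_text": "Does your organisation enforce MFA?",
     "source": "ACSC"},
    {"id": "Q002", "question_text": "How often are patches applied?",
     "source": "Allianz-2024"},
]
bank = deduplicate(incoming)
```

In this sample the two MFA questions differ by a single letter, so they collapse into one canonical entry with one alias and two sources. Each incoming question is compared against every canonical entry, which is fine for a few thousand questions but would need blocking or indexing for much larger dumps.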

Integration Points

  • Policy variable extraction (questions about company info)
  • Insurance form questions (premium-impacting)
  • Audit questionnaires (compliance scoring)
  • Onboarding flow (progressive disclosure)

Storage Structure

questions/
├── by-source/
│   ├── insurance/
│   │   ├── allianz-2024.yaml
│   │   └── chubb-2024.yaml
│   ├── compliance/
│   │   ├── essential-eight.yaml
│   │   └── iso27001.yaml
│   └── internal/
│       └── onboarding.yaml
├── by-category/
│   ├── access-control.yaml
│   ├── incident-response.yaml
│   └── data-protection.yaml
└── master-bank.yaml # Deduplicated canonical questions

Next Steps

  1. Receive question dump (MD or CSV format)
  2. Parse and analyze metadata completeness
  3. Identify missing metadata to enrich
  4. Build import scripts
  5. Create question relationship mappings
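Steps 2 and 3 could start from a simple per-field fill rate, as in the sketch below. It assumes records have already been parsed into dicts keyed by the metadata field names; `completeness_report` is a hypothetical helper, and treating `None`, an empty string, and an empty list as "missing" is an assumption.

```python
from collections import Counter

EXPECTED_FIELDS = (
    "id", "question_text", "category", "frameworks", "answer_type",
    "evidence_required", "source", "persona", "risk_patterns", "premium_impact",
)


def completeness_report(records):
    """Return the share of records (0.0 to 1.0) that fill each expected field."""
    filled = Counter()
    for record in records:
        for name in EXPECTED_FIELDS:
            value = record.get(name)
            # False is a real answer for evidence_required, so it counts as filled.
            if value not in (None, "", []):
                filled[name] += 1
    total = len(records)
    return {
        name: (filled[name] / total if total else 0.0)
        for name in EXPECTED_FIELDS
    }


records = [
    {"id": "Q001", "question_text": "Do you have MFA?",
     "category": "Access Control", "evidence_required": False},
    {"id": "Q002", "question_text": "How often are patches applied?",
     "category": ""},
]
report = completeness_report(records)
# Fields that at least one record leaves empty are candidates for enrichment.
to_enrich = [name for name, share in report.items() if share < 1.0]
```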

Note: The more metadata provided upfront, the better the question intelligence will be. Even partial metadata is helpful, because missing fields can be enriched programmatically.

See /docs-internal/docs/00-documentation-meta/outstanding-work-tracker.md for status