# Data Model Specification

This document defines the core data schemas for Specs and Issues in the sudocode system. Agent, Artifact, and Execution entities will be defined incrementally as the design evolves.

## Design Principles

1. **Dual Representation**: Each entity has both a human-editable format (Markdown + YAML frontmatter) and a machine-optimized format (JSONL + SQLite)
2. **Bidirectional Links**: Relationships are tracked in both directions for efficient querying
3. **Immutable IDs**: Once assigned, IDs never change (even on renaming)
4. **Audit Trail**: All changes tracked with timestamps and actors
5. **Git-Friendly**: Primary storage (JSONL + Markdown) optimized for version control
6. **Flexible Content**: Spec and issue markdown content is free-form to adapt to user needs

---

## Core Entity Definitions

### 1. Spec (Specification)

**Purpose**: Captures user intent, requirements, and design decisions at various levels of detail.

#### Markdown File Format

**Location**: `.sudocode/specs/{name}.md`

**Structure**:

```markdown
---
id: spec-001
title: Authentication System Design
type: architecture
status: draft
priority: 1
created_at: 2025-10-16T10:00:00Z
updated_at: 2025-10-16T15:30:00Z
created_by: alice
updated_by: alice
parent: spec-000
blocks: [spec-002]
related: [spec-010, spec-015]
tags: [auth, security, backend]
---

# Authentication System Design

The content below is flexible markdown. Users can structure it however they want.

## Example Section

Content here...

## Requirements

1. Support OAuth 2.0 [[@issue-001]]
2. Multi-factor authentication [[@issue-002]]

## Cross-References

See also [[spec-010]] for API design patterns.
```

**Frontmatter Schema**:

```yaml
id: string              # Unique identifier (spec-NNN)
title: string           # Human-readable title (max 500 chars)
type: enum              # architecture | api | database | feature | research
status: enum            # draft | review | approved | deprecated
priority: int           # 0-4 (0=highest, 2=default)
created_at: timestamp   # ISO 8601 format
updated_at: timestamp   # ISO 8601 format
created_by: string      # Username or agent ID
updated_by: string      # Username or agent ID
parent: string?         # Optional parent spec ID
blocks: [string]        # Array of spec IDs this blocks
related: [string]       # Array of related spec IDs
tags: [string]          # Free-form tags for organization
```

**Content Guidelines**:

- Markdown content is **completely flexible** - no enforced structure
- Users can organize sections however they want
- System extracts references but doesn't enforce format
- Issue references: `[[@issue-001]]` - Links to specific issue
- Spec references: `[[spec-002]]` - Links to another spec
- Backlinks automatically tracked in relationship graph

#### JSONL Format

**Location**: `.sudocode/specs/specs.jsonl`

**Structure** (one JSON object per line):

```json
{
  "id": "spec-001",
  "title": "Authentication System Design",
  "file_path": ".sudocode/specs/auth-system.md",
  "content": "# Authentication System Design\n\n...",
  "type": "architecture",
  "status": "draft",
  "priority": 1,
  "created_at": "2025-10-16T10:00:00Z",
  "updated_at": "2025-10-16T15:30:00Z",
  "created_by": "alice",
  "updated_by": "alice",
  "parent": "spec-000",
  "relationships": [
    {"from": "spec-001", "to": "spec-002", "type": "blocks"},
    {"from": "spec-001", "to": "spec-010", "type": "related"}
  ],
  "issue_refs": ["issue-001", "issue-002", "issue-003"],
  "tags": ["auth", "security", "backend"]
}
```

**Field Definitions**:

- `id`: Immutable unique identifier
- `title`: Display name (editable)
- `file_path`: Relative path to markdown file
- `content`: Full markdown content (without frontmatter)
- `type`: Category of spec
- `status`: Current state in lifecycle
- `priority`: Urgency (0=critical, 4=low)
- `relationships`: Embedded relationship array
- `issue_refs`: Extracted from `[[@issue-NNN]]` in content
- `tags`: Free-form tags for filtering and search

---

### 2. Issue

**Purpose**: Captures actionable work items derived from specs, assigned to agents or humans.

#### Markdown File Format

**Location**: `.sudocode/issues/{id}.md`

**Structure**:

```markdown
---
id: issue-001
title: Implement OAuth 2.0 token endpoint
description: Create REST endpoint for OAuth token exchange
status: open
priority: 1
issue_type: task
assignee: agent-backend-dev
estimated_minutes: 120
created_at: 2025-10-16T10:00:00Z
updated_at: 2025-10-16T15:30:00Z
closed_at: null
created_by: agent-planner
spec_refs: [spec-001]
parent: null
blocks: [issue-002]
blocked_by: []
related: [issue-010]
tags: [auth, backend, api]
---

# Implement OAuth 2.0 token endpoint

Content here is flexible markdown. Common sections might include:

## Description

Create REST endpoint for OAuth token exchange following RFC 6749.

## Design Notes

- Endpoint: POST /oauth/token
- Support grant types: authorization_code, refresh_token
- Return JWT tokens with 1hr expiry

## Acceptance Criteria

- [ ] Endpoint accepts valid authorization codes
- [ ] Returns valid JWT tokens
- [ ] Handles invalid requests with proper error codes
- [ ] Unit tests with >90% coverage

## Notes

Links back to [[spec-001]] requirements section.
```

**Frontmatter Schema**:

```yaml
id: string               # Unique identifier (issue-NNN)
title: string            # Short description (max 500 chars)
description: string      # Detailed problem statement
status: enum             # open | in_progress | blocked | needs_review | closed
priority: int            # 0-4 (0=highest, 2=default)
issue_type: enum         # bug | feature | task | epic | chore
assignee: string?        # Agent ID or username
estimated_minutes: int?  # Estimated effort
created_at: timestamp    # ISO 8601 format
updated_at: timestamp    # ISO 8601 format
closed_at: timestamp?    # When closed (null if open)
created_by: string       # Who created (user or agent)
spec_refs: [string]      # Specs this issue relates to
parent: string?          # Parent issue (for epics)
blocks: [string]         # Issues this blocks
blocked_by: [string]     # Issues blocking this (computed)
related: [string]        # Related issues
tags: [string]           # Free-form labels for organization
```

**Status Lifecycle**:

- `open` → `in_progress` → `closed`
- `open` → `blocked` → `in_progress` → `closed`
- Can reopen: `closed` → `open`

**Content Guidelines**:

- Markdown content is **flexible** - users/agents can structure as needed
- Common sections (Description, Design, Acceptance Criteria, Notes) are conventions, not requirements
- Issue templates may be added later, but not enforced

#### JSONL Format

**Location**: `.sudocode/issues/issues.jsonl`

**Structure**:

```json
{
  "id": "issue-001",
  "title": "Implement OAuth 2.0 token endpoint",
  "description": "Create REST endpoint for OAuth token exchange",
  "content": "Full markdown content here...",
  "status": "open",
  "priority": 1,
  "issue_type": "task",
  "assignee": "agent-backend-dev",
  "estimated_minutes": 120,
  "created_at": "2025-10-16T10:00:00Z",
  "updated_at": "2025-10-16T15:30:00Z",
  "closed_at": null,
  "created_by": "agent-planner",
  "spec_refs": ["spec-001"],
  "relationships": [
    {"from": "issue-001", "to": "issue-002", "type": "blocks"},
    {"from": "issue-001", "to": "issue-010", "type": "related"}
  ],
  "tags": ["auth", "backend", "api"]
}
```

**Field Definitions**:

- `content`: Full markdown content (without frontmatter) - may include design, acceptance criteria, notes
- `spec_refs`: Bidirectional links to specs
- `blocked_by`: Computed from relationship graph (not stored directly in JSONL)

---

## Relationship Structure

**Purpose**: Captures edges in the dependency graph between specs and issues.

**Location**: Relationships can be stored in two ways:

1. **Embedded in entity JSONL** (as shown above in `relationships` array)
2. **Separate relationships file** (optional, for easier graph operations)

**Relationship Types**:

| Type | Description | Valid Pairs |
|------|-------------|-------------|
| `blocks` | Hard dependency blocker | issue→issue, spec→spec |
| `related` | Soft contextual link | any→any |
| `parent-child` | Hierarchical relationship | spec→spec, issue→issue |
| `discovered-from` | Found during execution | issue→issue |
| `implements` | Implementation link | issue→spec |

**Note**: More detailed relationship schema will be defined when designing the storage layer.

---

## ID Assignment Strategy

### Format

- Specs: `spec-NNN` (e.g., `spec-001`, `spec-042`)
- Issues: `issue-NNN` (e.g., `issue-001`, `issue-123`)

### ID Generation

- Sequential numbering per type
- IDs never reused
- IDs assigned at creation, immutable
- Counter stored in `.sudocode/meta.json`:

```json
{
  "next_spec_id": 43,
  "next_issue_id": 157
}
```

### Collision Handling

On import/merge conflicts:

1. Detect ID collision (same ID, different content)
2. Score by reference count (how many places reference this ID)
3. Renumber entity with fewer references
4. Update all references in text fields and relationships
5. Record mapping in conflict log

---

## Example: Complete Data Flow

### Scenario: User creates spec, plans issues

**1. User creates spec**

```bash
sudocode spec create auth-system
```

Creates:

- `.sudocode/specs/auth-system.md` (with frontmatter)
- Entry in `.sudocode/specs/specs.jsonl`
- Row in SQLite `specs` table

**2. User invokes planning**

```bash
sudocode plan spec-001
```

Creates:

- `issue-001`, `issue-002`, `issue-003` (markdown + JSONL + SQLite)
- Relationships: `issue-001 implements spec-001`
- Relationships: `issue-002 blocks issue-003`
- Updates `spec-001.md` with issue references: `[[@issue-001]]`

**3. User or agent updates issue status**

```bash
sudocode issue update issue-001 --status in_progress
```

Updates:

- Frontmatter in `.sudocode/issues/issue-001.md`
- Entry in `.sudocode/issues/issues.jsonl`
- Row in SQLite `issues` table

---

## Future Entity Definitions (TODO)

The following entities will be defined as the design evolves:

### 3. Agent (TODO)

**Purpose**: Defines agent configurations, capabilities, and execution parameters.

**Placeholder**: Will include agent type (claude-code, etc.), capabilities, config (MCP servers, hooks, plugins), and scheduling parameters.

### 4. Artifact (TODO)

**Purpose**: Represents outputs from agent executions (code changes, reports, documentation).

**Placeholder**: Will track execution ID, issue ID, artifact type, file path, status (pending-review, approved, applied), and metadata.

### 5. Execution (TODO)

**Purpose**: Tracks individual agent runs against issues.

**Placeholder**: Will include issue ID, agent ID, start/end timestamps, exit code, log path, produced artifacts, and feedback (discovered issues, spec updates).

---

## Next Steps

After validating the Spec and Issue schemas:

1. Design SQLite database schema for specs and issues (storage.md)
2. Define JSONL ↔ SQLite sync mechanism
3. Implement ID generation and collision resolution
4. Define CLI commands for spec and issue CRUD operations
5. Prototype planning agent workflow (spec → issues)
6. Incrementally add Agent/Artifact/Execution schemas as needed
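
As a possible starting point for step 3, the sequential ID format and the `[[@issue-NNN]]` / `[[spec-NNN]]` reference syntax described earlier could be sketched roughly as follows. This is an illustrative sketch, not the sudocode implementation: the function names `next_id` and `extract_refs`, the in-memory `meta` dict standing in for `.sudocode/meta.json`, and the three-digit zero padding are all assumptions.

```python
import re


def next_id(meta: dict, kind: str) -> str:
    """Allocate the next sequential ID for `kind` ("spec" or "issue").

    `meta` is a hypothetical in-memory stand-in for .sudocode/meta.json.
    """
    key = f"next_{kind}_id"
    number = meta[key]
    # The counter only moves forward, so an ID is never handed out twice.
    meta[key] = number + 1
    # Assumption: IDs are zero-padded to at least three digits (spec-001).
    return f"{kind}-{number:03d}"


# [[@issue-001]] links to an issue; [[spec-010]] links to a spec.
ISSUE_REF = re.compile(r"\[\[@(issue-\d+)\]\]")
SPEC_REF = re.compile(r"\[\[(spec-\d+)\]\]")


def extract_refs(content: str) -> dict:
    """Collect referenced IDs from markdown content, deduplicated in first-seen order."""
    return {
        "issue_refs": list(dict.fromkeys(ISSUE_REF.findall(content))),
        "spec_refs": list(dict.fromkeys(SPEC_REF.findall(content))),
    }


meta = {"next_spec_id": 43, "next_issue_id": 157}
new_spec = next_id(meta, "spec")
refs = extract_refs(
    "1. Support OAuth 2.0 [[@issue-001]]\nSee also [[spec-010]] for API design patterns."
)
```

With the counter values from the `meta.json` example, `new_spec` is `spec-043` and the counter advances to 44. A real implementation would also need to write the counter back to disk safely and handle the collision renumbering steps, both of which are left out here.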