Introduction

trueblocks-dalle is a Go library (module github.com/TrueBlocks/trueblocks-dalle/v6) that deterministically generates AI art. It converts seed strings (typically Ethereum addresses) into structured semantic attributes, builds layered natural-language prompts, optionally enhances those prompts through OpenAI Chat, generates images through OpenAI's DALL·E API, annotates the images with captions, and can optionally narrate prompts via text-to-speech.

This is not a generic wrapper around OpenAI. It is a deterministic prompt orchestration and artifact pipeline designed for reproducible creative output.

Core Properties

  • Deterministic Attribute Derivation: Seeds are sliced into 6-hex-character windows that map to indexed rows across curated databases (adjectives, nouns, emotions, art styles, etc.)
  • Layered Prompt System: Multiple template formats including data, title, terse, full prompt, and optional enhancement via OpenAI Chat
  • Series-Based Filtering: Optional JSON-backed filter lists that constrain which database entries are available for each attribute type
  • Context Management: LRU + TTL cache of loaded contexts to handle multiple series without unbounded memory growth
  • Complete Artifact Pipeline: Persistent output directory structure storing prompts, images (generated and annotated), JSON metadata, and optional audio
  • Progress Tracking: Fine-grained phase tracking with ETA estimation, exponential moving averages, and optional run archival
  • Image Annotation: Dynamic palette-based background generation with contrast-aware text rendering
  • Text-to-Speech: Optional prompt narration via OpenAI TTS

Architecture Overview

The library is organized into several key packages:

  • Root package: Public API, context management, series CRUD, main generation orchestration
  • pkg/model: Core data structures (DalleDress, attributes, types)
  • pkg/prompt: Template definitions, attribute derivation, OpenAI enhancement
  • pkg/image: Image generation, download, processing coordination
  • pkg/annotate: Image annotation with dynamic backgrounds and text
  • pkg/progress: Phase-based progress tracking with metrics
  • pkg/storage: Data directory management, database caching, file operations
  • pkg/utils: Utility functions for various operations

Generation Flow

  1. Context Resolution: Get or create a cached Context for the specified series
  2. Attribute Derivation: Slice the seed string and map chunks to database entries, respecting series filters
  3. Prompt Construction: Execute multiple templates (data, title, terse, full) using selected attributes
  4. Optional Enhancement: Use OpenAI Chat to rewrite the prompt (if enabled and API key present)
  5. Image Generation: POST to OpenAI Images API, handle download or base64 decoding
  6. Image Annotation: Add terse caption with palette-based background and contrast-safe text
  7. Artifact Persistence: Save all outputs (prompts, images, JSON, optional audio) to organized directory structure
  8. Progress Updates: Track timing through all phases for metrics and ETA calculation

Key Data Structures

  • Context: Contains templates, database slices, in-memory DalleDress cache, and series configuration
  • DalleDress: Complete snapshot of generation state including all prompts, paths, attributes, and metadata
  • Series: JSON-backed configuration with attribute filters and metadata
  • Attribute: Individual semantic unit derived from seed slice and database lookup
  • ProgressReport: Real-time generation phase tracking with percentages and ETA

Determinism & Reproducibility

Given the same seed string and series configuration, the library produces identical results through the image generation step. The only non-deterministic component is optional prompt enhancement via OpenAI Chat, which can be disabled with TB_DALLE_NO_ENHANCE=1.

All artifacts are persisted with predictable file paths, enabling caching, auditing, and external processing.

When to Use

  • Need reproducible AI image generation from deterministic seeds
  • Want structured attribute-driven prompt construction
  • Require complete artifact trails for auditing or caching
  • Building applications that generate visual identities from addresses or tokens
  • Need progress tracking for long-running generation processes

When Not to Use

  • Need batch generation of multiple images per prompt
  • Require offline execution (depends on OpenAI APIs unless stubbed)
  • Want completely free-form prompt construction outside the template system
  • Need real-time streaming generation

Next Steps

Jump to the Quick Start for immediate usage examples, or continue to Architecture Overview for deeper system understanding.

Quick Start

This walkthrough shows how to use the main public API functions with minimal code and explains where artifacts are stored.

Prerequisites

Environment Setup

Set your OpenAI API key (required for image generation, enhancement, and text-to-speech):

export OPENAI_API_KEY="sk-..."

Optionally configure a custom data directory (defaults to platform-specific location):

export TB_DALLE_DATA_DIR="/path/to/your/dalle-data"

Optional: disable prompt enhancement for faster/deterministic runs:

export TB_DALLE_NO_ENHANCE=1

Installation

go get github.com/TrueBlocks/trueblocks-dalle/v6@latest

Basic Usage

Simple Image Generation

package main

import (
    "fmt"
    "log"
    "time"
    
    dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)

func main() {
    series := "demo"
    address := "0x1234abcd5678ef901234abcd5678ef901234abcd"
    
    // Generate annotated image (full pipeline)
    imagePath, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Generated annotated image: %s\n", imagePath)
    
    // Optional: Generate speech narration
    audioPath, err := dalle.GenerateSpeech(series, address, 5*time.Minute)
    if err != nil {
        log.Printf("Speech generation failed: %v", err)
    } else if audioPath != "" {
        fmt.Printf("Generated speech: %s\n", audioPath)
    }
}

Progress Tracking

package main

import (
    "fmt"
    "time"
    
    dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)

func main() {
    series := "demo"
    address := "0xabcdef1234567890abcdef1234567890abcdef12"
    
    // Start generation in a goroutine
    go func() {
        _, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
        if err != nil {
            fmt.Printf("Generation failed: %v\n", err)
        }
    }()
    
    // Monitor progress
    for {
        progress := dalle.GetProgress(series, address)
        if progress == nil {
            fmt.Println("No active progress")
            break
        }
        
        fmt.Printf("Phase: %s, Progress: %.1f%%, ETA: %ds\n", 
            progress.Current, progress.Percent, progress.ETASeconds)
            
        if progress.Done {
            fmt.Println("Generation completed!")
            break
        }
        
        time.Sleep(1 * time.Second)
    }
}

Series Management

package main

import (
    "fmt"
    
    dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)

func main() {
    // List all available series
    series := dalle.ListSeries()
    fmt.Printf("Available series: %v\n", series)
    
    // Clean up artifacts for a specific series/address
    dalle.Clean("demo", "0x1234...")
    
    // Get context count (for monitoring cache usage)
    count := dalle.ContextCount()
    fmt.Printf("Cached contexts: %d\n", count)
}

Generated Artifacts

Running the examples above creates the following directory structure under your data directory:

$TB_DALLE_DATA_DIR/
└── output/
    └── <series>/
        ├── data/
        │   └── <address>.txt          # Raw attribute data
        ├── title/
        │   └── <address>.txt          # Human-readable title
        ├── terse/
        │   └── <address>.txt          # Short caption
        ├── prompt/
        │   └── <address>.txt          # Full structured prompt
        ├── enhanced/
        │   └── <address>.txt          # OpenAI-enhanced prompt (if enabled)
        ├── generated/
        │   └── <address>.png          # Raw generated image
        ├── annotated/
        │   └── <address>.png          # Image with caption overlay
        ├── selector/
        │   └── <address>.json         # Complete DalleDress metadata
        └── audio/
            └── <address>.mp3          # Text-to-speech audio (if generated)

Caching Behavior

  • Cache hits: If an annotated image already exists, GenerateAnnotatedImage returns immediately
  • Incremental generation: Individual artifacts are cached, so partial runs can resume
  • Context caching: Series configurations are cached in memory with LRU eviction
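The cache-hit shortcut can be reproduced from outside the library with a simple existence check. The helper below is a hypothetical sketch that assumes the artifact layout shown above; it is not part of the public API:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// annotatedPath builds the expected location of the annotated image for a
// series/address pair under the documented output layout. The helper name
// is illustrative; the library resolves these paths internally.
func annotatedPath(dataDir, series, address string) string {
	return filepath.Join(dataDir, "output", series, "annotated", address+".png")
}

func main() {
	p := annotatedPath("/tmp/dalle-data", "demo", "0x1234abcd")
	if _, err := os.Stat(p); err == nil {
		fmt.Println("cache hit:", p) // GenerateAnnotatedImage would return immediately
	} else {
		fmt.Println("cache miss: full pipeline would run")
	}
}
```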

Error Handling

imagePath, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
if err != nil {
    switch {
    case strings.Contains(err.Error(), "API key"):
        log.Fatal("OpenAI API key required")
    case strings.Contains(err.Error(), "address required"):
        log.Fatal("Valid address string required")
    default:
        log.Fatalf("Generation failed: %v", err)
    }
}

Next Steps

Architecture Overview

The trueblocks-dalle library generates AI art deterministically from seed strings (like Ethereum addresses).

How It Works

The library implements a deterministic AI art generation pipeline that converts seed strings into structured semantic attributes, builds layered prompts, generates images via OpenAI APIs, and produces complete artifact sets with progress tracking.

Seed String → Select Attributes → Build Prompts → Generate Image → Add Caption → Save Files

Key Concepts

  • Deterministic: Same seed always produces same output
  • Attribute-Driven: Seed chunks map to curated word lists (adjectives, styles, etc.)
  • Template-Based: Multiple prompt formats for different purposes
  • Complete Pipeline: Handles everything from prompt to final annotated image

Package Structure

The library is organized into focused packages:

Root Package (github.com/TrueBlocks/trueblocks-dalle/v6)

  • context.go: Context struct, database loading, prompt generation orchestration
  • manager.go: Context lifecycle management, LRU cache, public API functions
  • series.go: Series struct definition and core methods
  • series_crud.go: Series persistence, filtering, and management operations
  • text2speech.go: OpenAI TTS integration and audio generation

Core Packages

  • pkg/model: Data structures and types (dalledress.go, types.go)
  • pkg/prompt: Template system and attribute derivation (prompt.go, attribute.go)
  • pkg/image: Image generation and processing (image.go)
  • pkg/annotate: Image annotation with text overlays (annotate.go)
  • pkg/progress: Phase tracking and metrics (progress.go)
  • pkg/storage: Data directory and cache management (datadir.go, cache.go, database.go)
  • pkg/utils: Utility functions (various files)

Core Components

1. Context Management

Purpose: Manages loaded series configurations, template compilation, and database caching.

type Context struct {
    Series         Series
    Databases      map[string][]string
    DalleCache     map[string]*model.DalleDress
    CacheMutex     sync.Mutex
    promptTemplate *template.Template
    // ... additional templates
}

Key Operations:

  • Series loading and filter application
  • Database slicing based on series constraints
  • DalleDress creation and caching
  • Template execution for multiple prompt formats

2. Series System

Purpose: Provides configurable filtering for attribute databases, enabling customized generation behavior.

type Series struct {
    Suffix       string   `json:"suffix"`
    Purpose      string   `json:"purpose,omitempty"`
    Deleted      bool     `json:"deleted,omitempty"`
    Adjectives   []string `json:"adjectives"`
    Nouns        []string `json:"nouns"`
    Emotions     []string `json:"emotions"`
    // ... additional attribute filters
}

Features:

  • JSON-backed persistence
  • Optional filtering for each attribute type
  • Soft deletion with recovery
  • Hierarchical organization

3. Prompt Generation Pipeline

Purpose: Converts deterministic attributes into multiple prompt formats using Go templates.

Template Types:

  • Data Template: Raw attribute listing
  • Title Template: Human-readable title generation
  • Terse Template: Short caption text
  • Prompt Template: Full structured prompt for image generation
  • Author Template: Attribution information

Enhancement Flow:

Base Prompt → (Optional) OpenAI Chat Enhancement → Final Prompt

4. Attribute Derivation

Purpose: Deterministically maps seed chunks to database entries.

Process:

  1. Normalize seed string (remove 0x, lowercase, pad)
  2. Split into 6-character hex chunks
  3. Map each chunk to database index via modulo
  4. Apply series filters if present
  5. Return selected attribute records
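The steps above can be sketched in a few lines of Go. This is a simplified illustration, not the library's actual implementation; the window stride follows the 6-chars-every-8 slicing described in the Context & Manager chapter, and the padding step is omitted:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// normalizeSeed strips the 0x prefix and lowercases. (The library also pads
// short seeds by appending the reversed string; omitted here for brevity.)
func normalizeSeed(seed string) string {
	return strings.ToLower(strings.TrimPrefix(seed, "0x"))
}

// chunkIndexes reads a 6-hex-char window every 8 characters and maps each
// window to a database row index via modulo over that database's length.
func chunkIndexes(seed string, dbLen int) []int {
	var idxs []int
	for i := 0; i+6 <= len(seed); i += 8 {
		n, err := strconv.ParseUint(seed[i:i+6], 16, 64)
		if err != nil {
			continue // skip windows that are not valid hex
		}
		idxs = append(idxs, int(n)%dbLen)
	}
	return idxs
}

func main() {
	seed := normalizeSeed("0x1234abcd5678ef901234abcd5678ef901234abcd")
	fmt.Println(chunkIndexes(seed, 100)) // same seed, same indexes, every run
}
```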

5. Image Generation

Purpose: Coordinates OpenAI DALL·E API calls with automatic retry, size detection, and download handling.

Features:

  • Orientation detection (portrait/landscape/square)
  • Size optimization based on prompt length
  • Base64 and URL download support
  • Retry logic with exponential backoff
  • Progress tracking integration

6. Image Annotation

Purpose: Adds caption overlays with dynamic background generation and contrast optimization.

Process:

  1. Analyze image palette for dominant colors
  2. Generate contrasting background banner
  3. Calculate optimal font size and positioning
  4. Render text with anti-aliasing
  5. Composite final annotated image

7. Progress Tracking

Purpose: Provides real-time generation monitoring with phase timing and ETA calculation.

Phases:

const (
    PhaseSetup         Phase = "setup"
    PhaseBasePrompts   Phase = "base_prompts"
    PhaseEnhance       Phase = "enhance_prompt"
    PhaseImagePrep     Phase = "image_prep"
    PhaseImageWait     Phase = "image_wait"
    PhaseImageDownload Phase = "image_download"
    PhaseAnnotate      Phase = "annotate"
    PhaseCompleted     Phase = "completed"
    PhaseFailed        Phase = "failed"
)

Features:

  • Exponential moving averages for ETA calculation
  • Optional run archival for historical analysis
  • Concurrent access safety
  • Cache hit detection

Data Flow Architecture

1. Context Resolution

Input: (series, address)
↓
Manager checks LRU cache
↓
If miss: Create new context, load series, filter databases
↓
Return cached context

2. DalleDress Creation

Context + Address
↓
Check DalleDress cache
↓
If miss: Derive attributes, execute templates, persist
↓
Return DalleDress with all prompt variants

3. Image Generation

DalleDress + API Key
↓
Determine orientation and size
↓
POST to OpenAI Images API
↓
Download/decode image
↓
Save to generated/ directory

4. Annotation

Generated Image + Terse Caption
↓
Analyze image palette
↓
Generate contrasting background
↓
Render text overlay
↓
Save to annotated/ directory

Storage Architecture

Directory Structure

$DATA_DIR/
├── output/
│   └── <series>/
│       ├── data/         # Attribute dumps
│       ├── title/        # Human titles
│       ├── terse/        # Captions
│       ├── prompt/       # Base prompts
│       ├── enhanced/     # Enhanced prompts
│       ├── generated/    # Raw images
│       ├── annotated/    # Captioned images
│       ├── selector/     # DalleDress JSON
│       └── audio/        # TTS audio
├── cache/
│   ├── databases.cache   # Binary database cache
│   └── temp/            # Temporary files
├── series/              # Series configurations
└── metrics/             # Progress metrics

Caching Strategy

  • Context Cache: LRU with TTL eviction prevents unbounded memory growth
  • Database Cache: Binary serialization of processed CSV databases
  • Artifact Cache: File existence checks enable fast cache hits
  • Progress Cache: In-memory tracking with optional persistence

Integration Points

OpenAI APIs

  1. Chat Completions (optional): Prompt enhancement
  2. Images (required): DALL·E 3 generation
  3. Audio/Speech (optional): TTS narration

External Dependencies

  • github.com/TrueBlocks/trueblocks-core: Logging and file utilities
  • github.com/TrueBlocks/trueblocks-sdk: SDK integration
  • git.sr.ht/~sbinet/gg: Graphics rendering for annotation
  • github.com/lucasb-eyer/go-colorful: Color analysis

Error Handling Strategy

Network Resilience

  • Exponential backoff for API retries
  • Timeout configuration per operation type
  • Graceful degradation when services unavailable

Data Integrity

  • Atomic file operations to prevent corruption
  • Checksum validation for caches
  • Path traversal prevention

Recovery Mechanisms

  • Automatic cache rebuilding on corruption
  • Partial pipeline resumption via artifact caching
  • Context recreation on management errors

Extensibility Points

Custom Providers

  • Replace image.RequestImage for alternative generation services
  • Implement custom annotation renderers
  • Add new attribute databases

Template System

  • Add new prompt templates for different formats
  • Customize enhancement prompts for specific use cases
  • Extend attribute derivation logic

Progress Integration

  • Custom progress reporters for external monitoring
  • Metric exporters for observability systems
  • Archive processors for historical analysis

This architecture ensures scalable, reliable, and maintainable AI art generation while preserving deterministic behavior and comprehensive auditability.

Each phase completion updates a moving average (unless the run was a cache hit). Percent and ETA are computed as (sum of elapsed time for completed phases, or the historical average for pending phases) divided by (sum of all phase averages).
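That formula can be sketched as follows (the smoothing factor and the way the first sample seeds the average are assumptions; the library's constants may differ):

```go
package main

import "fmt"

// ema updates an exponential moving average; alpha weights the newest sample.
func ema(avg, sample, alpha float64) float64 {
	if avg == 0 {
		return sample // first observation seeds the average
	}
	return alpha*sample + (1-alpha)*avg
}

// percentDone follows the formula above: completed phases contribute their
// actual elapsed time, pending phases their historical average.
func percentDone(elapsed, averages []float64) float64 {
	var num, den float64
	for i, avg := range averages {
		den += avg
		if i < len(elapsed) {
			num += elapsed[i] // finished (or in-flight) phase: actual elapsed
		}
		// pending phases contribute only their average to the denominator
	}
	if den == 0 {
		return 0
	}
	return 100 * num / den
}

func main() {
	avgs := []float64{1.0, 4.0, 5.0} // per-phase moving averages (seconds)
	done := []float64{1.2, 3.8}      // elapsed times for finished phases
	fmt.Printf("%.1f%%\n", percentDone(done, avgs)) // prints 50.0%
}
```

ETA then falls out of the same sums: the remaining time estimate is the denominator minus the numerator.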

Error Strategies

  • Network errors are wrapped into typed errors where practical (OpenAIAPIError).
  • A missing API key yields placeholder output or skips the enhancement/image steps without failing the pipeline.
  • File path traversal is prevented via cleaned absolute path prefix checks.

Extending

Replace image.RequestImage for alternate providers; add new databases + methods on DalleDress for extra semantic dimensions; or decorate progress manager for custom telemetry.

Next: Context & Manager.

Context & Manager

This chapter drills into context construction, caching, and lifecycle management.

Context Responsibilities

context.go defines Context which bundles:

  • Templates (prompt, data, title, terse, author)
  • In-memory Databases (map: database name -> slice of CSV row strings)
  • Series metadata (filters & suffix)
  • DalleCache (address -> *DalleDress)

The context owns pure prompt state; it does not perform network calls (image generation and enhancement are separate functions using the context’s outputs).

Building a Context

NewContext():

  1. Loads cache manager (storage.GetCacheManager().LoadOrBuild())
  2. Initializes template pointers from prompt package variables
  3. Creates empty maps
  4. Calls ReloadDatabases("empty") to seed initial series

Database Loading

ReloadDatabases(filter string):

  • Loads Series via loadSeries(filter)
  • For each name in prompt.DatabaseNames tries cached binary index → falls back to CSV
  • Applies optional per-field filtering from Series.GetFilter(fieldName)
  • Ensures at least one row ("none") to avoid zero-length selection panics

Constructing a DalleDress

MakeDalleDress(address string):

  • Normalizes key (filename-safe) and returns cached instance if present
  • Builds a seed = original + reverse(original); enforces length >= 66; strips 0x
  • Iteratively slices 6 hex chars every 8 chars; maps them into attributes until databases exhausted
  • Builds prompt layers by executing templates; conditionally loads an enhanced prompt from disk if present
  • Stores under both original and normalized cache keys for future hits
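The seed construction can be illustrated with a minimal sketch (the exact ordering of the strip and length checks inside the library may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// reverseString returns s with its runes in reverse order.
func reverseString(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// buildSeed mirrors the documented construction: strip the 0x prefix, then
// append the reversed string, and enforce the minimum length of 66.
func buildSeed(address string) (string, error) {
	s := strings.TrimPrefix(address, "0x")
	seed := s + reverseString(s)
	if len(seed) < 66 {
		return "", fmt.Errorf("seed too short: %d chars", len(seed))
	}
	return seed, nil
}

func main() {
	seed, err := buildSeed("0x1234abcd5678ef901234abcd5678ef901234abcd")
	fmt.Println(len(seed), err) // a 40-char address doubles to an 80-char seed
}
```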

Thread Safety

CacheMutex protects DalleCache. Additional saveMutex guards concurrent file writes in reportOn.

Manager Layer

manager.go adds an LRU+TTL around contexts so each series has at most one resident context. Key pieces:

  • managedContext struct holds context + lastUsed timestamp
  • Global contextManager map + order slice
  • ManagerOptions (MaxContexts, ContextTTL) adjustable via ConfigureManager
  • Eviction: contexts older than TTL are dropped; if still above capacity, least-recently-used removed

Generation Entry Point

GenerateAnnotatedImage(series, address, skipImage, lockTTL):

  1. Early return if annotated image already exists (synthetic cache hit progress run created)
  2. Acquire per-(series,address) lock with TTL to avoid duplicate concurrent generations
  3. Build / fetch context and DalleDress
  4. Start and transition progress phases (base prompts → enhance → image...) unless skipImage
  5. Delegate to Context.GenerateImageWithBaseURL for image pipeline
  6. Mark completion, update metrics

skipImage=true still produces prompt artifacts but bypasses network phases.

Locks

A map of requestLocks with TTL prevents burst duplicate work. Expired locks are cleaned opportunistically.

Cache Hit Shortcut

If annotated/<address>.png exists the system:

  • Builds DalleDress (ensures consistent metadata)
  • Starts a progress run (if one doesn’t already exist)
  • Marks cacheHit + completed without regenerating

Cleaning Artifacts

Clean(series, address) removes the generated set: annotated png, raw image, selector JSON, audio, and prompt text files across all prompt subdirectories.

When to Add a New Context Field

Add new fields only if they reflect deterministic state or necessary caches. Side-effectful network concerns belong outside.

Extension Strategies

  • Alternate Persistence: wrap reportOn or post-process after GenerateAnnotatedImage.
  • Custom Prompt Layers: execute additional templates with DalleDress.FromTemplate.
  • Series Variants: manage multiple series suffixes and rely on manager eviction for memory control.

Next: Series & Attribute Databases

Series & Attribute Databases

Purpose

A Series constrains or themes generations by restricting which rows from each logical database may be selected during attribute derivation. It also names the output namespace (folder suffix).

Database Order

Defined in prompt.DatabaseNames (order is significant for deterministic mapping):

adverbs, adjectives, nouns, emotions, occupations, actions,
artstyles, artstyles, litstyles,
colors, colors, colors,
orientations, gazes, backstyles

Duplicated entries (artstyles, colors) allow multiple independent selections without custom logic.

Raw Rows

Each database loads as a slice of strings (CSV lines, version prefixes stripped). Rows are treated as opaque until later parsed by accessor methods in DalleDress (splitting on commas, trimming pieces, etc.).

Series JSON Schema (excerpt)

{
  "suffix": "demo",
  "adverbs": ["swiftly", "boldly"],
  "adjectives": [],
  "nouns": [],
  "emotions": ["joy"],
  "deleted": false
}

Only non-empty slices act as filters. If a slice is empty, no filtering occurs for that category.

Filtering Logic

For each database:

  1. Load full slice (cache index → fallback CSV)
  2. If the corresponding Series slice is non-empty, retain rows containing any filter substring
  3. If the resulting slice is empty, insert a sentinel "none" to avoid selection panics

Substring containment (not exact match) enables flexible partial filters but may admit unintended rows; prefer distinctive tokens.
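A minimal sketch of this filtering, assuming plain substring containment as described:

```go
package main

import (
	"fmt"
	"strings"
)

// filterRows keeps rows containing any of the filter substrings; an empty
// filter keeps everything, and an empty result is replaced by the "none"
// sentinel to avoid zero-length selection panics.
func filterRows(rows, filters []string) []string {
	if len(filters) == 0 {
		return rows
	}
	var out []string
	for _, row := range rows {
		for _, f := range filters {
			if strings.Contains(row, f) {
				out = append(out, row)
				break
			}
		}
	}
	if len(out) == 0 {
		return []string{"none"}
	}
	return out
}

func main() {
	rows := []string{"ultramarine,blue", "amber,orange", "sage green,green"}
	fmt.Println(filterRows(rows, []string{"amber"})) // [amber,orange]
	fmt.Println(filterRows(rows, []string{"puce"}))  // [none]
}
```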

Attribute Construction Recap

prompt.NewAttribute(dbs, index, bytes):

  • Interprets 6 hex chars as number → factor in [0,1)
  • Scales to database length to pick selector index
  • Captures value string; accessor methods later format for prompt templates
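The index math can be sketched directly. This is a hypothetical helper; the library's NewAttribute also records the selected value and other bookkeeping:

```go
package main

import (
	"fmt"
	"strconv"
)

// selectorIndex reproduces the documented math: 6 hex chars become a number,
// the number is scaled to a factor in [0,1), and the factor picks a row index.
func selectorIndex(hex6 string, dbLen int) (int, error) {
	n, err := strconv.ParseUint(hex6, 16, 64)
	if err != nil {
		return 0, err
	}
	factor := float64(n) / float64(1<<24) // 16^6 possible 6-hex-char values
	return int(factor * float64(dbLen)), nil
}

func main() {
	i, _ := selectorIndex("ffffff", 100)
	fmt.Println(i) // the top of the range maps to the last row, 99
}
```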

Extending with a New Attribute

  1. Add a database file & loader logic (mirroring existing ones)
  2. Append names to DatabaseNames and attributeNames in the same positional slot
  3. Add a slice field to Series (exported, plural) for potential filtering
  4. Create accessor on DalleDress (e.g. func (d *DalleDress) Weather(short bool) string)
  5. Update templates (promptTemplateStr, etc.) to include the new semantic
  6. Regenerate docs

Changing the order of DatabaseNames is a breaking change to deterministic mapping and should be avoided after release.

Pitfalls

  • Over-filtering (e.g. selecting a single emotion) reduces variety and can cause visually repetitive outputs.
  • Adding a new attribute without updating templates yields unused entropy.
  • Removing an attribute breaks existing cached serialized DalleDress JSON consumers expecting that field.

Example Filter Use Case

To produce a cohesive color-themed series, populate the colors slice in the series JSON with a shortlist (e.g. "ultramarine", "amber"). Those rows will dominate selection while other attributes still vary.

Next: Prompt Generation Pipeline

Storage Architecture & Data Directories

The trueblocks-dalle library organizes all data using a structured directory hierarchy managed by the pkg/storage package. This chapter explains the storage architecture, data directory resolution, and file organization patterns.

Data Directory Resolution

Default Location

The library automatically determines an appropriate data directory based on the platform:

func DataDir() string {
    if dir := os.Getenv("TB_DALLE_DATA_DIR"); dir != "" {
        return dir
    }
    // Falls back to platform-specific defaults
}

Platform defaults:

  • macOS: ~/Library/Application Support/TrueBlocks
  • Linux: ~/.local/share/TrueBlocks
  • Windows: %APPDATA%/TrueBlocks
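A sketch of that fallback logic, assuming the defaults listed above (the library's actual resolution may differ in detail):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// defaultDataDir maps an OS name to the documented platform default.
func defaultDataDir(goos, home string) string {
	switch goos {
	case "darwin":
		return filepath.Join(home, "Library", "Application Support", "TrueBlocks")
	case "windows":
		return filepath.Join(os.Getenv("APPDATA"), "TrueBlocks")
	default: // linux and other unix-like systems
		return filepath.Join(home, ".local", "share", "TrueBlocks")
	}
}

func main() {
	home, _ := os.UserHomeDir()
	fmt.Println(defaultDataDir(runtime.GOOS, home))
}
```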

Environment Override

Set TB_DALLE_DATA_DIR to use a custom location:

export TB_DALLE_DATA_DIR="/custom/path/to/dalle-data"

Directory Structure

The data directory contains several key subdirectories:

$TB_DALLE_DATA_DIR/
├── output/              # Generated artifacts (images, prompts, audio)
├── cache/               # Database and context caches
├── series/              # Series configuration files
└── metrics/             # Progress timing data

Output Directory

Generated artifacts are organized by series under output/:

output/
└── <series-name>/
    ├── data/            # Raw attribute data dumps
    ├── title/           # Human-readable titles
    ├── terse/           # Short captions
    ├── prompt/          # Full structured prompts
    ├── enhanced/        # OpenAI-enhanced prompts
    ├── generated/       # Raw DALL·E generated images
    ├── annotated/       # Images with caption overlays
    ├── selector/        # Complete DalleDress JSON metadata
    └── audio/           # Text-to-speech MP3 files

Each subdirectory contains files named <address>.ext where:

  • address is the input seed string (typically Ethereum address)
  • ext is the appropriate file extension (.txt, .png, .json, .mp3)

Cache Directory

The cache directory stores processed database indexes and temporary files:

cache/
├── databases.cache      # Binary database cache file
├── series.cache         # Series configuration cache
└── temp/               # Temporary files during processing

Series Directory

Series configurations are stored as JSON files:

series/
├── default.json         # Default series configuration
├── custom-series.json   # Custom series with filters
└── deleted/            # Soft-deleted series
    └── old-series.json

File Path Utilities

The storage package provides utilities for constructing paths:

Core Functions

// Base directories
func DataDir() string                    // Main data directory
func OutputDir() string                  // output/ subdirectory
func SeriesDir() string                  // series/ subdirectory
func CacheDir() string                   // cache/ subdirectory

// Path construction
func EnsureDir(path string) error        // Create directory if needed
func CleanPath(path string) string       // Sanitize file paths

Path Security

All file operations include security checks to prevent directory traversal:

// Example from annotate.go
cleanName := filepath.Clean(fileName)
if !strings.Contains(cleanName, string(os.PathSeparator)+"generated"+string(os.PathSeparator)) {
    return "", fmt.Errorf("invalid image path: %s", fileName)
}

Artifact Lifecycle

Creation Flow

  1. Directory Creation: Output directories are created as needed during generation
  2. Incremental Writing: Artifacts are written as they're generated (prompts → image → annotation)
  3. Atomic Operations: Files are written atomically to prevent corruption
  4. Metadata Updates: JSON metadata is updated throughout the process

Caching Strategy

  • Existence Checks: If an annotated image exists, the pipeline returns immediately (cache hit)
  • Incremental Processing: Individual artifacts are cached, allowing partial resume
  • Selective Regeneration: Only missing or outdated artifacts are regenerated

Cleanup Operations

The Clean function removes all artifacts for a series/address pair:

func Clean(series, address string) {
    // Removes files from all output subdirectories
    // Clears cached DalleDress entries
    // Updates progress tracking
}

Database Storage

Embedded Databases

Attribute databases are embedded in the binary as compressed tar.gz archives:

pkg/storage/databases.tar.gz     # Compressed attribute databases

Cache Format

Processed databases are cached in binary format for fast loading:

type DatabaseCache struct {
    Version    string                   // Cache version
    Timestamp  int64                    // Creation time
    Databases  map[string]DatabaseIndex // Processed indexes
    Checksum   string                   // Validation checksum
    SourceHash string                   // Source data hash
}

Cache Validation

The cache system validates integrity on load:

  1. Checksum Verification: Ensures cache file hasn't been corrupted
  2. Source Hash Check: Detects if embedded databases have changed
  3. Version Compatibility: Handles cache format changes
  4. Automatic Rebuild: Rebuilds cache if validation fails
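Steps 1 and 2 amount to a standard checksum comparison, sketched below (SHA-256 here is an assumption for illustration; the library's concrete hash choice is internal):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checksum returns the hex SHA-256 digest of raw bytes.
func checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// validate compares a stored checksum against the current payload; a
// mismatch signals corruption (or changed source data) and triggers a rebuild.
func validate(stored string, payload []byte) bool {
	return stored == checksum(payload)
}

func main() {
	payload := []byte("database rows...")
	stored := checksum(payload)
	fmt.Println(validate(stored, payload))            // true: cache is intact
	fmt.Println(validate(stored, []byte("tampered"))) // false: rebuild needed
}
```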

Performance Considerations

Directory Operations

  • Lazy Creation: Directories are created only when needed
  • Path Caching: Resolved paths are cached to avoid repeated filesystem calls
  • Batch Operations: Multiple files in the same directory are processed efficiently

Storage Optimization

  • Binary Caching: Database indexes use efficient binary serialization
  • Compression: Embedded databases are compressed to reduce binary size
  • Selective Loading: Only required database sections are loaded into memory

Cleanup Strategies

  • Automatic Cleanup: Temporary files are cleaned up on completion or failure
  • LRU Eviction: Context cache uses LRU eviction to prevent unbounded growth
  • Configurable Retention: TTL settings control how long contexts remain cached

Error Handling

Common Storage Errors

// Permission issues
if os.IsPermission(err) {
    // Handle insufficient filesystem permissions
}

// Disk space issues
if strings.Contains(err.Error(), "no space left") {
    // Handle disk space exhaustion
}

// Path traversal attempts
if strings.Contains(err.Error(), "invalid path") {
    // Handle security violations
}

Recovery Strategies

  • Graceful Degradation: Continue operation when non-critical files can't be written
  • Cache Rebuilding: Automatically rebuild corrupted caches
  • Alternative Paths: Fall back to temporary directories if primary locations fail

Integration Points

With Context Management

  • Series configurations are loaded from the series directory
  • Context cache uses storage utilities for persistence
  • Database loading integrates with the cache management system

With Progress Tracking

  • Progress metrics are persisted to the data directory
  • Temporary run state is stored in cache directory
  • Completed runs can optionally archive detailed timing data

With Generation Pipeline

  • Each generation phase writes artifacts to appropriate subdirectories
  • File existence checks drive caching decisions
  • Path resolution ensures consistent artifact locations

This storage architecture provides a robust foundation for reproducible, auditable, and efficient artifact management throughout the generation pipeline.

Database Caching & Management

The trueblocks-dalle library uses a sophisticated caching system to manage attribute databases efficiently. This chapter explains how databases are loaded, cached, and accessed during generation.

Database Architecture

Embedded Databases

The library includes curated databases of semantic attributes embedded as compressed archives:

pkg/storage/databases.tar.gz    # Compressed CSV databases
pkg/storage/series.tar.gz       # Default series configurations

Database Types

The system includes these semantic databases:

| Database | Purpose | Example Entries |
|----------|---------|-----------------|
| adjectives | Descriptive attributes | "mysterious", "elegant", "ancient" |
| adverbs | Manner modifiers | "gracefully", "boldly", "subtly" |
| nouns | Core subjects | "warrior", "scholar", "merchant" |
| emotions | Emotional states | "contemplative", "joyful", "melancholic" |
| occupations | Professional roles | "architect", "botanist", "craftsperson" |
| actions | Physical activities | "meditating", "dancing", "reading" |
| artstyles | Artistic movements | "impressionist", "art nouveau", "bauhaus" |
| litstyles | Literary styles | "romantic", "gothic", "minimalist" |
| colors | Color palettes | "cerulean", "burnt sienna", "sage green" |
| viewpoints | Camera angles | "bird's eye view", "close-up", "wide shot" |
| gazes | Eye directions | "looking away", "direct gaze", "upward" |
| backstyles | Background types | "cosmic void", "forest clearing", "urban" |
| compositions | Layout patterns | "rule of thirds", "centered", "asymmetric" |


Cache System

Cache Manager

The CacheManager provides centralized access to processed databases:

type CacheManager struct {
    mu       sync.RWMutex
    cacheDir string
    dbCache  *DatabaseCache
    loaded   bool
}

func GetCacheManager() *CacheManager {
    // Returns singleton instance
}

Cache Loading

func (cm *CacheManager) LoadOrBuild() error {
    // 1. Try to load existing cache
    // 2. Validate cache integrity
    // 3. Rebuild if invalid or missing
    // 4. Update loaded flag
}

Cache Structure

type DatabaseCache struct {
    Version    string                   `json:"version"`
    Timestamp  int64                    `json:"timestamp"`
    Databases  map[string]DatabaseIndex `json:"databases"`
    Checksum   string                   `json:"checksum"`
    SourceHash string                   `json:"sourceHash"`
}

type DatabaseIndex struct {
    Name    string           `json:"name"`
    Version string           `json:"version"`
    Records []DatabaseRecord `json:"records"`
    Lookup  map[string]int   `json:"lookup"`
}

type DatabaseRecord struct {
    Key    string   `json:"key"`
    Values []string `json:"values"`
}

Database Loading Process

1. Cache Validation

func (cm *CacheManager) validateCache() bool {
    // Check file existence
    // Verify checksum integrity
    // Compare source hash with embedded data
    // Validate version compatibility
}

2. Database Extraction

If cache is invalid, databases are extracted from embedded archives:

func extractDatabases() error {
    // Extract databases.tar.gz
    // Parse CSV files
    // Build lookup indexes
    // Generate checksums
}

3. Index Building

Each database CSV is processed into an efficient lookup structure:

CSV Format:
key,value1,value2,version
warrior,brave fighter,medieval soldier,1.0
scholar,learned person,academic researcher,1.0

Index Structure:
{
  "Name": "nouns",
  "Version": "1.0",
  "Records": [
    {"Key": "warrior", "Values": ["brave fighter", "medieval soldier", "1.0"]},
    {"Key": "scholar", "Values": ["learned person", "academic researcher", "1.0"]}
  ],
  "Lookup": {"warrior": 0, "scholar": 1}
}

4. Binary Serialization

Processed indexes are serialized using Go's gob encoding for fast loading:

func saveCacheLocked(cm *CacheManager) error {
    file, err := os.Create(cm.cacheFile())
    if err != nil {
        return err
    }
    defer file.Close()
    
    encoder := gob.NewEncoder(file)
    return encoder.Encode(cm.dbCache)
}
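
The matching load path is symmetric: open the cache file and gob-decode it back into the in-memory structure. A minimal round-trip sketch (simplified struct; not the library's exact code):

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// DatabaseCache mirrors (in simplified form) the structure shown earlier.
type DatabaseCache struct {
	Version  string
	Checksum string
}

// encodeCache serializes the cache with gob, as the save path does.
func encodeCache(c *DatabaseCache) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(c); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeCache is the symmetric load path: gob-decode back into a struct.
func decodeCache(b []byte) (*DatabaseCache, error) {
	var c DatabaseCache
	if err := gob.NewDecoder(bytes.NewReader(b)).Decode(&c); err != nil {
		return nil, err
	}
	return &c, nil
}

func main() {
	b, _ := encodeCache(&DatabaseCache{Version: "1.0", Checksum: "abc"})
	c, _ := decodeCache(b)
	fmt.Println(c.Version) // 1.0
}
```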

Attribute Selection

Deterministic Lookup

Attributes are selected deterministically using seed-based indexing:

func selectAttribute(database []DatabaseRecord, seedChunk string) DatabaseRecord {
    // Convert hex chunk to number
    // Use modulo to get valid index
    // Return corresponding record
    index := hexToNumber(seedChunk) % len(database)
    return database[index]
}
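
The hexToNumber helper referenced above is not shown in the source; one plausible implementation (an assumption, not the library's code) parses the 6-character chunk as base-16 and falls back to 0 on invalid input:

```go
package main

import (
	"fmt"
	"strconv"
)

// hexToNumber parses a 6-character hex chunk into an int.
// Invalid input falls back to 0 so selection never panics.
func hexToNumber(chunk string) int {
	n, err := strconv.ParseInt(chunk, 16, 64)
	if err != nil {
		return 0
	}
	return int(n)
}

func main() {
	fmt.Println(hexToNumber("0000ff") % 10) // 255 % 10 = 5
}
```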

Seed Processing

The input seed is processed into consistent chunks:

func processSeed(address string) []string {
    // Normalize to lowercase hex
    // Remove 0x prefix if present
    // Pad to minimum length
    // Split into 6-character chunks
    // Return ordered chunks
}
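
A concrete version of those steps might look like this (a sketch; the exact padding length the library uses is an assumption):

```go
package main

import (
	"fmt"
	"strings"
)

// processSeed turns an address into ordered 6-character hex chunks,
// following the steps outlined above: normalize, strip 0x, pad, split.
func processSeed(address string) []string {
	s := strings.ToLower(strings.TrimPrefix(address, "0x"))
	for len(s)%6 != 0 { // pad so the string splits evenly
		s += "0"
	}
	chunks := make([]string, 0, len(s)/6)
	for i := 0; i+6 <= len(s); i += 6 {
		chunks = append(chunks, s[i:i+6])
	}
	return chunks
}

func main() {
	fmt.Println(processSeed("0xABCDEF123456")) // [abcdef 123456]
}
```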

Series Filtering

When a series specifies filters, only matching records are eligible:

func applySeriesFilter(records []DatabaseRecord, filter []string) []DatabaseRecord {
    if len(filter) == 0 {
        return records // No filter = all records
    }
    
    var filtered []DatabaseRecord
    for _, record := range records {
        if contains(filter, record.Key) {
            filtered = append(filtered, record)
        }
    }
    return filtered
}

Performance Optimizations

Memory Management

  • Lazy Loading: Databases are loaded only when first accessed
  • Shared Instances: Multiple contexts share the same cache manager
  • Efficient Indexes: O(1) lookup using hash maps

Cache Efficiency

// Binary cache loading is significantly faster than CSV parsing
func BenchmarkCacheLoading(b *testing.B) {
    // Binary cache: ~1ms
    // CSV parsing: ~50ms
    // Speedup: 50x
}

Concurrent Access

The cache manager uses read-write locks for thread safety:

func (cm *CacheManager) GetDatabase(name string) DatabaseIndex {
    cm.mu.RLock()
    defer cm.mu.RUnlock()
    return cm.dbCache.Databases[name]
}

Cache Invalidation

Automatic Rebuilding

The cache is automatically rebuilt when:

  1. Missing Cache File: No cache exists on disk
  2. Checksum Mismatch: Cache file is corrupted
  3. Source Hash Change: Embedded databases have been updated
  4. Version Incompatibility: Cache format has changed

Manual Cache Management

// Force cache rebuild
func (cm *CacheManager) Rebuild() error {
    cm.mu.Lock()
    defer cm.mu.Unlock()
    return cm.buildCacheLocked()
}

// Clear cache
func (cm *CacheManager) Clear() error {
    return os.Remove(cm.cacheFile())
}

Integration with Context

Database Loading in Context

func (ctx *Context) ReloadDatabases(series string) error {
    cm := storage.GetCacheManager()
    
    // Ensure cache is loaded
    if err := cm.LoadOrBuild(); err != nil {
        return err
    }
    
    // Apply series filters to each database
    for dbName := range ctx.Databases {
        filtered := cm.GetFilteredDatabase(dbName, series)
        ctx.Databases[dbName] = filtered
    }
    
    return nil
}

Filter Application

Series filters are applied when loading databases into a context:

func (cm *CacheManager) GetFilteredDatabase(dbName, series string) []string {
    // Load series configuration
    seriesConfig := loadSeries(series)
    
    // Get database records
    dbIndex := cm.dbCache.Databases[dbName]
    
    // Apply series filter if present
    filter := seriesConfig.GetFilter(dbName)
    if len(filter) > 0 {
        return applyFilter(dbIndex.Records, filter)
    }
    
    // Return all records if no filter
    return extractKeys(dbIndex.Records)
}

Error Handling

Cache Loading Errors

func handleCacheError(err error) {
    switch {
    case os.IsNotExist(err):
        logger.Info("Cache file not found, will rebuild")
    case strings.Contains(err.Error(), "checksum"):
        logger.Warn("Cache checksum mismatch, rebuilding")
    case strings.Contains(err.Error(), "version"):
        logger.Info("Cache version changed, rebuilding")
    default:
        logger.Error("Unexpected cache error:", err)
    }
}

Fallback Strategies

  • Memory Fallback: If cache can't be written, keep in memory only
  • Rebuild on Error: Automatically rebuild corrupted caches
  • Graceful Degradation: Continue with available databases if some fail

Monitoring and Debugging

Cache Statistics

func (cm *CacheManager) Stats() CacheStats {
    return CacheStats{
        LoadTime:      cm.loadTime,
        DatabaseCount: len(cm.dbCache.Databases),
        TotalRecords:  cm.countRecords(),
        CacheSize:     cm.cacheFileSize(),
        LastUpdated:   time.Unix(cm.dbCache.Timestamp, 0),
    }
}

Debug Information

func (cm *CacheManager) DebugInfo() map[string]interface{} {
    return map[string]interface{}{
        "loaded":      cm.loaded,
        "cache_file":  cm.cacheFile(),
        "version":     cm.dbCache.Version,
        "checksum":    cm.dbCache.Checksum,
        "databases":   maps.Keys(cm.dbCache.Databases),
    }
}

This caching system ensures fast, reliable access to semantic databases while maintaining data integrity and supporting efficient filtering through the series system.

Prompt Generation Pipeline

Overview

Prompt layers provide multiple projections of the same attribute set for different downstream uses: captioning, logging, enhancement, and model instruction.

Layers

| Directory | Purpose |
|-----------|---------|
| data/ | Raw attribute dump for auditing. |
| title/ | Composite title (emotion + adverb + adjective + occupation + noun). |
| terse/ | Short caption placed on annotated image. |
| prompt/ | Structured base instruction combining attributes into narrative. |
| enhanced/ | Optional ChatGPT-refined version of base prompt. |

Templates

Defined in pkg/prompt/prompt.go as Go text/template instances. Template methods invoked on DalleDress (e.g. {{.Noun true}}) control short/long formatting.

Example Snippet (Base Prompt)

Draw a {{.Adverb false}} {{.Adjective false}} {{.Noun true}} ...

Literary Style

If a literary style attribute is present (not none), extra author persona context (AuthorTemplate) precedes enhancement.

Enhancement

EnhancePrompt(prompt, authorType) calls OpenAI Chat with model gpt-4. Bypass rules:

  • Environment TB_DALLE_NO_ENHANCE=1
  • Missing OPENAI_API_KEY

Output is wrapped with guard text: DO NOT PUT TEXT IN THE IMAGE (prepended and appended) to discourage textual artifacts inside generated images.

Accessor Semantics

Each accessor on DalleDress returns either a short token or expanded annotated string depending on a boolean flag (short). Some (Orientation, BackStyle) embed other attribute values.

Adding A New Layer

  1. Create template constant + compiled variable in prompt.go.
  2. Add pointer in Context if persisted similarly.
  3. Execute in MakeDalleDress and persist like existing layers.
  4. Optionally expose accessor or include in API docs.

Failure Modes

  • Missing attribute keys (extending without accessor) → template execution error.
  • Enhancement HTTP failure → returns typed OpenAIAPIError; generation may still proceed with base prompt.

Next: Image Request & Annotation

Image Request & Annotation

Request Flow

image.RequestImage steps:

  1. Build prompt.Request with model (currently dall-e-3).
  2. Infer size from orientation keywords (landscape/horizontal/vertical) else square.
  3. Early placeholder file if OPENAI_API_KEY missing.
  4. POST to OpenAI images endpoint (override via baseURL parameter upstream).
  5. Parse response: URL path OR base64 fallback (b64_json).
  6. Download or decode → write generated/<file>.png.
  7. Annotate with terse prompt via annotate.Annotate → write annotated/<file>.png.
  8. Update progress phases (wait → download → annotate) and DalleDress fields (ImageURL, DownloadMode, paths).
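
Step 2's size inference can be sketched as a keyword match over the orientation string. The size strings below are DALL·E 3's supported resolutions, but the exact mapping logic here is an assumption, not the library's code:

```go
package main

import (
	"fmt"
	"strings"
)

// inferSize maps orientation keywords to a DALL·E 3 size string,
// defaulting to square as described in step 2 above.
func inferSize(orientation string) string {
	o := strings.ToLower(orientation)
	switch {
	case strings.Contains(o, "landscape"), strings.Contains(o, "horizontal"):
		return "1792x1024"
	case strings.Contains(o, "vertical"):
		return "1024x1792"
	default:
		return "1024x1024"
	}
}

func main() {
	fmt.Println(inferSize("landscape wide")) // 1792x1024
}
```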

Download Modes

  • url direct HTTP GET
  • b64 inline base64 decode (when no URL provided)

Annotation Mechanics

annotate.Annotate:

  • Validates source path contains /generated/
  • Reads image, computes dominant colors (top 3 frequency) → average background
  • Chooses contrasting text (white/black) via lightness threshold
  • Draws separator line + wraps caption text
  • Writes a sibling file, swapping generated → annotated in the path

Error Handling

  • Network or decode failure returns error; upstream marks progress failure.
  • Annotation failure aborts after raw image write (annotated artifact missing).
  • Missing API key: annotated placeholder is empty file (still allows cache semantics).

Customization Points

  • Change model: patch selection logic near modelName variable.
  • Bypass annotation: replace annotateFunc var in tests or fork logic.
  • Add watermark: extend Annotate to composite additional graphics.

Next: Progress Tracking

Progress Tracking

Purpose

Real-time insight & metrics for the generation lifecycle with ETA estimates grounded in moving average phase durations.

Phases

Ordered list (progress.OrderedPhases):

setup → base_prompts → enhance_prompt → image_prep → image_wait → image_download → annotate → failed/completed

image_prep currently acts as a transitional placeholder (timing can be extended in future).

Data Structures

  • ProgressManager singleton keyed by series:address
  • progressRun internal mutable state
  • ProgressReport externally returned snapshot
  • Exponential moving average per phase (alpha=0.2)

Percent & ETA Calculation

  1. Sum average durations of non-terminal phases → total
  2. Accumulate averages of completed phases + capped elapsed of current phase → done
  3. percent = done/total * 100; ETA = (total - done)

Cache hits skip average updates and are immediately marked completed.
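
The EMA update and the percent/ETA math can be sketched as follows. Alpha and the capping behavior come from the description above; the function names are hypothetical:

```go
package main

import "fmt"

const alpha = 0.2 // EMA smoothing factor per the description above

// updateEMA folds a new phase duration (seconds) into the running average.
func updateEMA(avg, sample float64) float64 {
	if avg == 0 {
		return sample // first observation seeds the average
	}
	return alpha*sample + (1-alpha)*avg
}

// percentAndETA follows the three steps above: sum averages, accumulate
// done time (current phase capped at its average), derive percent and ETA.
func percentAndETA(avgs []float64, completed int, elapsedCurrent float64) (pct, eta float64) {
	var total, done float64
	for _, a := range avgs {
		total += a
	}
	for i := 0; i < completed; i++ {
		done += avgs[i]
	}
	if completed < len(avgs) {
		cur := elapsedCurrent
		if cur > avgs[completed] {
			cur = avgs[completed] // cap elapsed at the phase average
		}
		done += cur
	}
	if total == 0 {
		return 0, 0
	}
	return done / total * 100, total - done
}

func main() {
	pct, eta := percentAndETA([]float64{2, 4, 4}, 1, 1)
	fmt.Printf("%.0f%% eta=%.0fs\n", pct, eta) // 30% eta=7s
}
```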

Archival & Metrics Persistence

Phase averages stored in <DataDir>/metrics/progress_phase_stats.json. Set TB_DALLE_ARCHIVE_RUNS=1 to serialize per-run snapshots under metrics/runs/.

Public Functions

  • GetProgress(series,address) returns (and prunes when completed)
  • ActiveProgressReports() returns active snapshots

Failure Path

On error: current phase ends, run transitions to failed, summary log emitted, and (if archival enabled) snapshot saved.

Extending

Add a new phase by appending to OrderedPhases, initializing timing in StartRun, and inserting transitions in generation code. Consider metrics implications (initial averages start undefined until one run completes that phase).

Next: Text-to-Speech

Text-to-Speech

Overview

Optional conversion of the enhanced (or base) prompt into an mp3 using OpenAI tts-1 model.

Entry Points

  • GenerateSpeech(series, address, lockTTL) ensures the mp3 exists (respects a per-address lock)
  • Speak(series, address) generates the mp3 if missing, then returns its path
  • ReadToMe(series, address) alias with the same semantics (ensure mp3, return path)

Conditions

  • Skips silently if OPENAI_API_KEY unset
  • Voice defaults to alloy
  • 1 minute context timeout; simple retry loop until success or timeout

Storage

<DataDir>/output/<series>/audio/<address>.mp3

Implementation Notes

  • Minimal JSON body manually constructed (lighter than defining structs)
  • Escapes quotes/newlines with marshalEscaped
  • Retries on non-200 logging attempt, status, and any error
  • Uses 0600 file permissions
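
The escaping described above can be achieved with encoding/json, which handles quotes and newlines in one call. The real marshalEscaped may differ; this is a plausible sketch:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalEscaped returns text as a JSON string literal, escaping
// quotes and newlines so it can be spliced into a hand-built body.
func marshalEscaped(text string) string {
	b, _ := json.Marshal(text) // marshaling a string cannot fail
	return string(b)
}

func main() {
	body := fmt.Sprintf(`{"model":"tts-1","voice":"alloy","input":%s}`,
		marshalEscaped("line one\n\"quoted\""))
	fmt.Println(body)
}
```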

Customization

Wrap TextToSpeech to swap provider or implement local TTS; keep same output path for integration with existing tooling.

Next: Public API Reference

Public API Reference

Complete documentation of exported functions and types in the trueblocks-dalle package.

Primary Generation Functions

Image Generation

GenerateAnnotatedImage

func GenerateAnnotatedImage(series, address string, skipImage bool, lockTTL time.Duration) (string, error)

Generates a complete annotated image through the full pipeline. Returns the path to the annotated PNG file.

Parameters:

  • series: Series name for filtering attributes and organizing output
  • address: Seed string (typically Ethereum address) for deterministic generation
  • skipImage: If true, skips image generation (useful for prompt-only operations)
  • lockTTL: Maximum time to hold generation lock (prevents concurrent runs)

Returns: Path to annotated image file

GenerateAnnotatedImageWithBaseURL

func GenerateAnnotatedImageWithBaseURL(series, address string, skipImage bool, lockTTL time.Duration, baseURL string) (string, error)

Same as GenerateAnnotatedImage but allows overriding the OpenAI API base URL.

Additional Parameter:

  • baseURL: Custom OpenAI API endpoint (empty string uses default)

Speech Generation

GenerateSpeech

func GenerateSpeech(series, address string, lockTTL time.Duration) (string, error)

Generates text-to-speech audio for the enhanced prompt. Returns path to MP3 file.

Parameters:

  • series: Series name for file organization
  • address: Seed string (creates DalleDress to get prompt text)
  • lockTTL: Lock timeout duration (0 uses default 2 minutes)

Returns: Path to generated MP3 file (empty string if no API key)

Example:

audioPath, err := dalle.GenerateSpeech("demo", "0x1234...", 5*time.Minute)
if err != nil {
    log.Fatal(err)
}
if audioPath != "" {
    fmt.Printf("Audio saved to: %s", audioPath)
    // Output: Audio saved to: output/demo/audio/0x1234....mp3
}

Speak

func Speak(series, address string) (string, error)

Convenience function that generates speech if not already present, then returns the path.

Example:

audioPath, err := dalle.Speak("demo", "0x1234...")
if err != nil {
    log.Fatal(err)
}
// Uses default lockTTL, generates if missing

ReadToMe

func ReadToMe(series, address string) (string, error)

Alias for Speak with semantic naming. Same functionality as Speak.

Example:

audioPath, err := dalle.ReadToMe("demo", "0x1234...")
// Identical to dalle.Speak("demo", "0x1234...")

TextToSpeech

func TextToSpeech(text string, voice string, series string, address string) (string, error)

Low-level text-to-speech function for custom text.

Parameters:

  • text: Text content to convert to speech
  • voice: OpenAI voice name ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
  • series: Series for output organization
  • address: Address for file naming

Example:

audioPath, err := dalle.TextToSpeech("Hello, this is a test message", "alloy", "demo", "test")
if err != nil {
    log.Fatal(err)
}
// Creates: output/demo/audio/test.mp3

Available Voices:

  • alloy: Neutral, balanced tone
  • echo: Clear, crisp delivery
  • fable: Warm, expressive reading
  • onyx: Deep, rich voice
  • nova: Bright, energetic tone
  • shimmer: Soft, gentle delivery

Context Management

Context Access

NewContext

func NewContext() *Context

Creates a new Context with initialized templates, databases, and cache. Loads the "empty" series by default.

func (ctx *Context) MakeDalleDress(address string) (*model.DalleDress, error)

Builds or retrieves a DalleDress from cache for the given address.

func (ctx *Context) GetPrompt(address string) string
func (ctx *Context) GetEnhanced(address string) string

Retrieve base or enhanced prompt text for an address. Returns error message as string if address lookup fails.

func (ctx *Context) GenerateImage(address string) (string, error)
func (ctx *Context) GenerateImageWithBaseURL(address, baseURL string) (string, error)

Generate image for an address (requires existing DalleDress in cache). Returns path to generated image.

func (ctx *Context) GenerateEnhanced(address string) (string, error)

Generates a literarily-enhanced prompt for the given address using OpenAI Chat API.

func (ctx *Context) Save(address string) bool

Generates and saves prompt data for the given address. Returns true on success.

func (ctx *Context) ReloadDatabases(filter string) error

Reload attribute databases with series-specific filters.

Context Manager

ConfigureManager

func ConfigureManager(opts ManagerOptions)

Configure context cache behavior.

type ManagerOptions struct {
    MaxContexts int           // Maximum cached contexts (default: 20)
    ContextTTL  time.Duration // Context expiration time (default: 30 minutes)
}

ContextCount

func ContextCount() int

Returns the number of currently cached contexts.

IsValidSeries

func IsValidSeries(series string, list []string) bool

Determines whether a requested series is valid given an optional list. If list is empty, returns true for any series.

Parameters:

  • series: Series name to validate
  • list: Optional list of valid series names

Returns: True if series is valid, false otherwise
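
Given that description, the behavior amounts to a simple membership check (a sketch, not the library's exact code):

```go
package main

import "fmt"

// isValidSeries mirrors the documented behavior: an empty list
// accepts any series; otherwise membership in the list is required.
func isValidSeries(series string, list []string) bool {
	if len(list) == 0 {
		return true
	}
	for _, s := range list {
		if s == series {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isValidSeries("demo", nil))              // true
	fmt.Println(isValidSeries("demo", []string{"test"})) // false
}
```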

Series Management

ListSeries

func ListSeries() []string

Returns list of all available series names.

Example:

series := dalle.ListSeries()
fmt.Printf("Available series: %v", series)
// Output: Available series: [demo test custom]

Series CRUD Operations

func LoadSeriesModels(dir string) ([]Series, error)
func LoadActiveSeriesModels(dir string) ([]Series, error)
func LoadDeletedSeriesModels(dir string) ([]Series, error)

Load series configurations from directory.

func DeleteSeries(dir, suffix string) error
func UndeleteSeries(dir, suffix string) error
func RemoveSeries(dir, suffix string) error

Manage series lifecycle (mark deleted, restore, or permanently remove).

Progress Tracking

GetProgress

func GetProgress(series, addr string) *ProgressReport

Get current progress snapshot for a generation run (nil if not active).

ActiveProgressReports

func ActiveProgressReports() []*ProgressReport

Get all currently active progress reports. Returns snapshots for all non-completed runs.

Progress Testing Helpers

func ForceMetricsSave()
func ResetMetricsForTest()

Testing utilities for forcing metrics persistence and clearing metrics state.

Utility Functions

Clean

func Clean(series, address string)

Remove all generated artifacts for a specific series/address combination.

Example:

// Clean up all files for a specific address
dalle.Clean("demo", "0x1234567890abcdef1234567890abcdef12345678")

// This removes:
// - output/demo/annotated/0x1234...png
// - output/demo/generated/0x1234...png  
// - output/demo/selector/0x1234...json
// - output/demo/audio/0x1234...mp3
// - All prompt text files (data, title, terse, etc.)

Test Helpers

func ResetContextManagerForTest()

Reset context manager state (testing only).

Core Data Types

DalleDress

type DalleDress struct {
    Original        string                    `json:"original"`
    OriginalName    string                    `json:"originalName"`
    FileName        string                    `json:"fileName"`
    FileSize        int64                     `json:"fileSize"`
    ModifiedAt      int64                     `json:"modifiedAt"`
    Seed            string                    `json:"seed"`
    Prompt          string                    `json:"prompt"`
    DataPrompt      string                    `json:"dataPrompt"`
    TitlePrompt     string                    `json:"titlePrompt"`
    TersePrompt     string                    `json:"tersePrompt"`
    EnhancedPrompt  string                    `json:"enhancedPrompt"`
    Attribs         []prompt.Attribute        `json:"attributes"`
    AttribMap       map[string]prompt.Attribute `json:"-"`
    SeedChunks      []string                  `json:"seedChunks"`
    SelectedTokens  []string                  `json:"selectedTokens"`
    SelectedRecords []string                  `json:"selectedRecords"`
    ImageURL        string                    `json:"imageUrl"`
    GeneratedPath   string                    `json:"generatedPath"`
    AnnotatedPath   string                    `json:"annotatedPath"`
    DownloadMode    string                    `json:"downloadMode"`
    IPFSHash        string                    `json:"ipfsHash"`
    CacheHit        bool                      `json:"cacheHit"`
    Completed       bool                      `json:"completed"`
    Series          string                    `json:"series"`
}

Series

type Series struct {
    Last         int      `json:"last,omitempty"`
    Suffix       string   `json:"suffix"`
    Purpose      string   `json:"purpose,omitempty"`
    Deleted      bool     `json:"deleted,omitempty"`
    Adverbs      []string `json:"adverbs"`
    Adjectives   []string `json:"adjectives"`
    Nouns        []string `json:"nouns"`
    Emotions     []string `json:"emotions"`
    Occupations  []string `json:"occupations"`
    Actions      []string `json:"actions"`
    Artstyles    []string `json:"artstyles"`
    Litstyles    []string `json:"litstyles"`
    Colors       []string `json:"colors"`
    Viewpoints   []string `json:"viewpoints"`
    Gazes        []string `json:"gazes"`
    Backstyles   []string `json:"backstyles"`
    Compositions []string `json:"compositions"`
    ModifiedAt   string   `json:"modifiedAt,omitempty"`
}

Attribute

type Attribute struct {
    Database string   `json:"database"`
    Name     string   `json:"name"`
    Bytes    string   `json:"bytes"`
    Number   int      `json:"number"`
    Factor   float64  `json:"factor"`
    Selector string   `json:"selector"`
    Value    string   `json:"value"`
}

ProgressReport

type ProgressReport struct {
    Series        string                  `json:"series"`
    Address       string                  `json:"address"`
    Current       Phase                   `json:"currentPhase"`
    StartedNs     int64                   `json:"startedNs"`
    Percent       float64                 `json:"percent"`
    ETASeconds    float64                 `json:"etaSeconds"`
    Done          bool                    `json:"done"`
    Error         string                  `json:"error"`
    CacheHit      bool                    `json:"cacheHit"`
    Phases        []*PhaseTiming          `json:"phases"`
    DalleDress    *model.DalleDress       `json:"dalleDress"`
    PhaseAverages map[Phase]time.Duration `json:"phaseAverages"`
}

Phases

type Phase string

const (
    PhaseSetup         Phase = "setup"
    PhaseBasePrompts   Phase = "base_prompts"
    PhaseEnhance       Phase = "enhance_prompt"
    PhaseImagePrep     Phase = "image_prep"
    PhaseImageWait     Phase = "image_wait"
    PhaseImageDownload Phase = "image_download"
    PhaseAnnotate      Phase = "annotate"
    PhaseFailed        Phase = "failed"
    PhaseCompleted     Phase = "completed"
)

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| OPENAI_API_KEY | OpenAI API key for image generation, enhancement, and TTS | Required |
| TB_DALLE_DATA_DIR | Custom data directory path | Platform default |
| TB_DALLE_NO_ENHANCE | Set to "1" to disable prompt enhancement | Enhancement enabled |
| TB_DALLE_ARCHIVE_RUNS | Set to "1" to save progress snapshots to JSON files | Disabled |
| TB_CMD_LINE | Set to "true" to auto-open images on macOS | Disabled |

Examples:

export OPENAI_API_KEY="sk-..."
export TB_DALLE_DATA_DIR="/custom/dalle/data" 
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_ARCHIVE_RUNS=1

Error Types

Primary Error Types

  • prompt.OpenAIAPIError – Structured error from OpenAI API calls
    • Fields: Message (string), StatusCode (int), RequestID (string), Code (string)
    • Method: IsRetryable() bool – Determines if error should be retried

Common Error Patterns

import "github.com/TrueBlocks/trueblocks-dalle/v6/pkg/prompt"

// Check for OpenAI API errors
if apiErr, ok := err.(*prompt.OpenAIAPIError); ok {
    switch apiErr.StatusCode {
    case 401:
        // Invalid API key
    case 429:
        // Rate limited, retry with backoff
    case 500, 502, 503:
        // Server errors, safe to retry
    }
}

// Missing API key
if strings.Contains(err.Error(), "API key") {
    // Handle missing OPENAI_API_KEY
}

// Invalid inputs
if strings.Contains(err.Error(), "address required") {
    // Handle empty address
}
if strings.Contains(err.Error(), "series not found") {
    // Handle invalid series name
}



Common Issues & Solutions

"address required" Error

Cause: Empty or nil address parameter passed to generation functions.
Solution: Ensure address string is not empty before calling GenerateAnnotatedImage or related functions.

Silent Generation Failures

Cause: Missing OPENAI_API_KEY environment variable.
Solution: Set the environment variable: export OPENAI_API_KEY="sk-..."
Note: Functions return empty paths or skip silently when no API key is present.

"seed length is less than 66" Error

Cause: Address string too short for seed generation.
Solution: Ensure address is a valid Ethereum address (42 characters with 0x prefix) or longer seed string.

Image Generation Timeouts

Cause: OpenAI API delays or network issues.
Solution: Increase lockTTL parameter in generation functions or check network connectivity.

Context Cache Issues

Cause: Memory pressure or too many cached contexts.
Solution: Use ConfigureManager to adjust MaxContexts and ContextTTL settings.

Progress Reports Return Nil

Cause: Generation has not started, or the run completed or failed and was pruned after its first report.
Solution: Check return value and handle nil case. Use ActiveProgressReports() for ongoing monitoring.

Patterns

Typical flow:

path, err := dalle.GenerateAnnotatedImage(series, address, false, 0)
if err != nil { /* handle */ }
if audio, _ := dalle.GenerateSpeech(series, address, 0); audio != "" { /* use mp3 */ }

Next: Advanced Usage & Extensibility

Advanced Usage & Extensibility

Custom Image Backend

Replace OpenAI generation by wrapping GenerateAnnotatedImage:

  1. Call GenerateAnnotatedImage(series,address,true,0) (skipImage=true) to produce prompts without network.
  2. Read enhanced or base prompt from disk.
  3. Use external backend → write PNG to generated/<filename>.png.
  4. Call annotate.Annotate to create annotated version.

Offline Mode

Without OPENAI_API_KEY you still obtain deterministic prompt artifacts; image and enhancement phases become no-ops producing placeholders.

Adding An Attribute

  • Add database file, extend prompt.DatabaseNames / attributeNames.
  • Add accessor on DalleDress.
  • Update templates to reference new method.
  • Regenerate documentation.

Concurrency Patterns

Generate multiple distinct (series,address) pairs concurrently; per-pair lock prevents duplication. For global throttling, introduce an external worker pool.
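
A bounded worker pool for that global throttling can be sketched with a buffered-channel semaphore. Here generateOne stands in for dalle.GenerateAnnotatedImage; everything else is generic Go:

```go
package main

import (
	"fmt"
	"sync"
)

// generateOne is a stand-in for dalle.GenerateAnnotatedImage.
func generateOne(series, address string) string {
	return series + "/" + address // placeholder result
}

// generateAll runs the pairs concurrently but never more than
// `workers` at once, using a buffered channel as a semaphore.
func generateAll(pairs [][2]string, workers int) []string {
	sem := make(chan struct{}, workers)
	results := make([]string, len(pairs)) // distinct indices: safe concurrent writes
	var wg sync.WaitGroup
	for i, p := range pairs {
		wg.Add(1)
		go func(i int, series, address string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[i] = generateOne(series, address)
		}(i, p[0], p[1])
	}
	wg.Wait()
	return results
}

func main() {
	out := generateAll([][2]string{{"demo", "0x01"}, {"demo", "0x02"}}, 2)
	fmt.Println(len(out)) // 2
}
```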

Progress Integration

Poll GetProgress asynchronously. Each ProgressReport returns a pointer to live DalleDress (treat read-only).

Metrics & Observability

Phase averages are basic; integrate with OpenTelemetry by wrapping generation calls and annotating spans with phase transitions.

Error Injection Testing

  • Invalid API key → expect OpenAIAPIError.
  • Simulate network timeout with firewall/latency tool; observe image.post.timeout log.

Determinism Considerations

Disable enhancement (TB_DALLE_NO_ENHANCE=1) for reproducible regression tests.

Swapping Templates at Runtime

Use DalleDress.FromTemplate(customTemplateString) to experiment. Persist results to a custom directory to avoid interfering with canonical artifacts.

Security Notes

  • Path cleaning already enforced; retain checks if adding new file outputs.
  • Keep the API key only in the environment; avoid embedding it in logs (the current debug curl output relies on the environment for masking; sanitize further if you extend logging).

Next: Testing & Contributing

Testing & Contributing

This chapter covers the testing architecture, contribution guidelines, and development workflows for the trueblocks-dalle library.

Test Architecture

The library includes comprehensive test coverage across multiple layers:

Unit Tests

Core Components:

  • Context creation and management (context_test.go)
  • Series operations and filtering (series_test.go)
  • DalleDress creation and templating (pkg/model/dalledress_test.go)
  • Progress tracking and metrics (pkg/progress/progress_test.go)
  • Storage operations (pkg/storage/*_test.go)
  • Image annotation (pkg/annotate/annotate_test.go)

Key Test Areas:

  • Attribute derivation determinism
  • Template execution correctness
  • Series filtering logic
  • Progress phase transitions
  • Cache management and validation
  • Error handling and recovery

Integration Tests

API Integration:

  • Text-to-speech functionality (text2speech_test.go)
  • Image generation pipeline (requires API key)
  • End-to-end generation workflow

Storage Integration:

  • File system operations
  • Directory structure creation
  • Artifact persistence and retrieval

Running Tests

Basic Test Execution

# Run all tests
go test ./...

# Run with verbose output
go test -v ./...

# Run specific package tests
go test ./pkg/progress
go test ./pkg/model

Test Categories

Offline Tests (no API required):

# Ensure no API key is set for deterministic results
unset OPENAI_API_KEY
go test ./pkg/... -short

Integration Tests (API required):

export OPENAI_API_KEY="sk-..."
go test ./... -run Integration

Benchmarks:

go test -bench=. ./pkg/storage
go test -bench=BenchmarkAttribute ./pkg/prompt

Test Configuration

Environment Variables:

# Disable enhancement for deterministic tests
export TB_DALLE_NO_ENHANCE=1

# Use test data directory
export TB_DALLE_DATA_DIR="/tmp/dalle-test"

# Enable debug logging for tests
export TB_DALLE_LOG_LEVEL=debug

Test Utilities and Helpers

Context Management

// Reset context cache between tests
func TestExample(t *testing.T) {
    defer dalle.ResetContextManagerForTest()
    // Test logic here
}

Progress Testing

// Reset progress metrics for clean test state
func TestProgressFlow(t *testing.T) {
    defer progress.ResetMetricsForTest()
    // Progress testing logic
}

Mock Infrastructure

Time Mocking (progress tests):

type mockClock struct {
    current time.Time
}

func (m *mockClock) Now() time.Time {
    return m.current
}

func (m *mockClock) Advance(d time.Duration) {
    m.current = m.current.Add(d)
}

Test Data Generation:

func generateTestSeries(t *testing.T, suffix string) Series {
    return Series{
        Suffix:     suffix,
        Purpose:    "test series",
        Adjectives: []string{"test", "mock", "example"},
        Nouns:      []string{"warrior", "scholar"},
        // ... additional test attributes
    }
}

Testing Best Practices

Deterministic Testing

Seed-Based Tests:

func TestAttributeSelection(t *testing.T) {
    tests := []struct {
        seed     string
        expected map[string]string
    }{
        {
            seed: "0x1234567890abcdef",
            expected: map[string]string{
                "adjective": "expected_adjective",
                "noun":      "expected_noun",
            },
        },
    }
    
    for _, tt := range tests {
        // Test deterministic attribute selection
    }
}

Template Testing:

func TestPromptGeneration(t *testing.T) {
    dd := &model.DalleDress{
        // Initialize with known attributes
    }
    
    result, err := dd.ExecuteTemplate(template, filter)
    assert.NoError(t, err)
    assert.Contains(t, result, "expected content")
}

Error Handling Tests

func TestErrorScenarios(t *testing.T) {
    tests := []struct {
        name          string
        setupFunc     func()
        expectedError string
    }{
        {
            name: "missing API key",
            setupFunc: func() { os.Unsetenv("OPENAI_API_KEY") },
            expectedError: "API key required",
        },
        {
            name: "invalid series",
            setupFunc: func() { /* setup invalid series */ },
            expectedError: "series not found",
        },
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            tt.setupFunc()
            _, err := dalle.GenerateAnnotatedImage("test", "0x123", false, 0)
            assert.Error(t, err)
            assert.Contains(t, err.Error(), tt.expectedError)
        })
    }
}

Performance Testing

func BenchmarkDatabaseLookup(b *testing.B) {
    cm := storage.GetCacheManager()
    cm.LoadOrBuild()
    
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        db := cm.GetDatabase("adjectives")
        _ = db.Records[i%len(db.Records)]
    }
}

Contributing Guidelines

Development Workflow

  1. Issue Creation: Open an issue describing the change and rationale
  2. Fork & Branch: Create feature branch (feat/<topic>) or bugfix branch (fix/<topic>)
  3. Implementation: Write code with comprehensive tests
  4. Testing: Ensure all tests pass locally
  5. Documentation: Update relevant documentation
  6. Pull Request: Submit PR with clear description and code references

Branch Naming

feat/new-attribute-database    # New feature
fix/progress-tracking-bug      # Bug fix
docs/api-reference-update      # Documentation
refactor/storage-optimization  # Code improvement

Commit Guidelines

Format: type(scope): description

feat(prompt): add support for custom templates
fix(storage): handle permission errors gracefully
docs(api): update function signatures in reference
test(progress): add comprehensive phase testing

Code Quality Standards

Formatting:

go fmt ./...
go vet ./...
golint ./...

Testing Requirements:

  • All new code must include tests
  • Maintain or improve coverage percentage
  • Include both positive and negative test cases
  • Test error conditions and edge cases

Documentation:

  • Update book chapters for user-facing changes
  • Add inline documentation for exported functions
  • Include code examples for new features

Code Style Guidelines

General Principles:

  • Prefer early returns over deep nesting
  • Keep exported API surface minimal
  • Use structured logging with key/value pairs
  • Follow Go idioms and conventions

Error Handling:

// Good: Specific error types
if err != nil {
    return fmt.Errorf("failed to load series %s: %w", series, err)
}

// Good: Early return
if condition {
    return result, nil
}
// Continue with main logic

Logging:

// Good: Structured logging
logger.Info("generation.start", "series", series, "address", address)

// Good: Error context
logger.Error("database.load.failed", "database", dbName, "error", err)

Adding New Features

New Attribute Database

  1. Add CSV file to pkg/storage/databases/
  2. Update database extraction in cache management
  3. Add attribute method to DalleDress
  4. Update templates to use new attribute
  5. Add series filtering support
  6. Write comprehensive tests
  7. Update documentation

New Template Type

  1. Define template string in pkg/prompt/prompt.go
  2. Compile template in context initialization
  3. Add execution method to DalleDress
  4. Add output directory handling
  5. Update artifact pipeline
  6. Test template rendering

New Progress Phase

  1. Add phase constant to pkg/progress/progress.go
  2. Update OrderedPhases slice
  3. Add transition points in generation pipeline
  4. Update progress calculations
  5. Test phase ordering and timing

Performance Considerations

Optimization Guidelines

Context Caching:

  • Avoid forcing context reloads in hot paths
  • Use appropriate TTL settings for your use case
  • Monitor context cache hit rates

Database Operations:

  • Binary cache provides 50x speedup over CSV parsing
  • Validate cache integrity on startup
  • Rebuild cache automatically on corruption

Image Processing:

  • Batch operations when possible
  • Use appropriate image sizes for use case
  • Consider caching annotated images

Memory Management:

  • Context cache uses LRU eviction
  • DalleDress cache has per-context limits
  • Monitor memory usage in long-running applications

Benchmarking

# Database operations
go test -bench=BenchmarkDatabase ./pkg/storage

# Template execution
go test -bench=BenchmarkTemplate ./pkg/model

# Progress tracking
go test -bench=BenchmarkProgress ./pkg/progress

Debugging and Troubleshooting

Debug Configuration

# Enable debug logging
export TB_DALLE_LOG_LEVEL=debug

# Disable caching for testing
export TB_DALLE_NO_CACHE=1

# Use test mode (mocks API calls)
export TB_DALLE_TEST_MODE=1

Common Issues

Context Loading Failures:

  • Check series file permissions
  • Verify JSON format validity
  • Ensure data directory accessibility

Cache Problems:

  • Clear cache directory: rm -rf $TB_DALLE_DATA_DIR/cache
  • Check disk space and permissions
  • Verify embedded database integrity

API Integration Issues:

  • Validate API key format and permissions
  • Check network connectivity
  • Monitor rate limits and quotas

This comprehensive testing and contribution framework ensures the library maintains high quality while remaining accessible to new contributors.

FAQ

A living collection of concrete, code-grounded answers. If something here contradicts code, the code wins and this page should be fixed.


Does generation work without an OpenAI API key?

No. Image generation, and the optional enhancement and text‑to‑speech stages, all rely on OpenAI endpoints. If OPENAI_API_KEY is unset the pipeline short‑circuits early with an error. You can still exercise the deterministic local pieces (series filtering, attribute selection, template expansion) in tests that stub the network layer, but no image or audio artifacts will be produced.

What does “deterministic” actually guarantee?

Given the same (seed, series database ordering, enhancement disabled/enabled, template set, orientation flag) you will get identical prompts and therefore (modulo OpenAI’s model randomness) identical requests. Image pixels are not guaranteed because the upstream model is non‑deterministic. Everything up to the HTTP request body is deterministic. Enhancement injects model creativity, so disable it (TB_DALLE_NO_ENHANCE=1) for stricter reproducibility.

How are attributes chosen from the seed?

pkg/prompt/attribute.go slices the seed (converted to string, hashed, or both, depending on the call site) into windows that index into ordered lists of candidate values. Some categories (e.g. colors, art styles) may intentionally duplicate entries to weight their selection frequency. The order of the underlying database lists (loaded from storage/databases) is therefore part of the deterministic surface.
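
An illustrative reduction of the idea: parse a fixed hex window of the seed and index modulo the candidate list. The window size and offset here are made up; the real slicing rules live in pkg/prompt/attribute.go.

```go
package main

import (
	"fmt"
	"strconv"
)

// pickAttribute demonstrates deterministic selection: a fixed window of
// the hex seed is parsed as an integer and used as an index modulo the
// candidate list length.
func pickAttribute(seed string, offset int, candidates []string) (string, error) {
	window := seed[offset : offset+6] // six hex characters (illustrative width)
	n, err := strconv.ParseUint(window, 16, 64)
	if err != nil {
		return "", err
	}
	return candidates[n%uint64(len(candidates))], nil
}

func main() {
	adjectives := []string{"ancient", "luminous", "quiet"}
	seed := "f503017d7baf7fbc0fff7492b751025c6a78179b"
	a, _ := pickAttribute(seed, 0, adjectives)
	b, _ := pickAttribute(seed, 0, adjectives)
	fmt.Println(a == b) // same seed, same window, same attribute
}
```

Because the list order feeds the modulo, reordering or inserting database rows shifts every downstream selection, which is why list edits deserve diff review.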

My attribute changes are ignored after editing a database file—why?

The context keeps an in‑memory copy until you call ReloadDatabases() (directly or via a new context creation). If you modify the on‑disk JSON/CSV (depending on implementation) you must reload or create a fresh context (TTL eviction in the manager can do this automatically if you wait past expiry).

When are contexts evicted?

manager.go maintains an LRU with a time‑based staleness check. Access (read or generation) refreshes the entry. Once capacity or TTL is exceeded, the least recently used entries are dropped. Any subsequent request causes reconstruction from disk.

Why can prompt enhancement be slow?

Enhancement is a separate chat completion call that: (1) builds a system + user message set, (2) waits on OpenAI latency, (3) returns a refined string. This adds an entire round trip. For batch runs disable with TB_DALLE_NO_ENHANCE=1 to save time and remove external variability.

Do I lose anything by disabling enhancement?

Only the model‑augmented rewrite layer. Base prompt quality still leverages structured templates and seed‑derived attributes. Disable it for: tests, reproducible benchmarks, or rate limit conservation.

Is orientation always respected?

Orientation is a hint. The code applies simple logic (is a wider-than-tall image desired?) and chooses a matching size/aspect_ratio (depending on the OpenAI API version). OpenAI may still perform internal cropping; final pixels can differ subtly.

Where are intermediate prompts stored?

Under the resolved data directory (see pkg/storage/datadir.go) in subfolders: prompt/ (base), enhanced/ (post‑enhancement), terse/ (shortened titles), title/ (final titles), plus generated/ (raw images) and annotated/ (banner overlaid). The manager orchestrates creation; files are named with the request GUID.

How do I clean up disk usage?

If TB_DALLE_ARCHIVE_RUNS is unset, old run directories may accumulate. A periodic external cleanup (e.g. cron) deleting oldest GUID folders beyond retention is safe—artifacts are immutable after creation. Just avoid deleting partial runs that are still in progress (look for absence of a completed progress marker file).

What if enhancement fails but image generation would succeed?

Failure in enhancement logs an error and (depending on implementation details) either falls back to the base prompt or aborts. Current code prefers fail‑fast so you notice silent prompt mutation issues—consult EnhancePrompt call sites. You may wrap it to downgrade to a warning if desired.

How are progress percentages computed?

pkg/progress/progress.go tracks durations per phase using an exponential moving average (EMA). Each newly observed duration updates that phase's EMA. The sum of all phase EMAs forms the denominator; the active phase's elapsed time (measured against its EMA) plus the EMAs of completed phases forms the numerator, yielding a coarse percent. This self‑calibrates over multiple runs.
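
A toy version of the arithmetic, with an assumed smoothing factor; the real alpha and phase bookkeeping live in pkg/progress:

```go
package main

import "fmt"

// ema applies one exponential-moving-average update; alpha is the
// smoothing factor (the library's actual value is not assumed here).
func ema(prev, observed, alpha float64) float64 {
	return alpha*observed + (1-alpha)*prev
}

// percent estimates completion: completed phases contribute their full
// EMA, the active phase contributes min(elapsed, its EMA).
func percent(completedEMAs []float64, activeEMA, activeElapsed float64) float64 {
	total, done := activeEMA, 0.0
	for _, e := range completedEMAs {
		total += e
		done += e
	}
	if activeElapsed < activeEMA {
		done += activeElapsed
	} else {
		done += activeEMA // cap the active phase at its historical average
	}
	return 100 * done / total
}

func main() {
	avg := 10.0
	avg = ema(avg, 14.0, 0.3) // 0.3*14 + 0.7*10 = 11.2
	fmt.Printf("ema=%.1f pct=%.0f%%\n", avg, percent([]float64{5, 5}, avg, 5.6))
}
```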

Why does my percent jump backwards occasionally?

If a phase historically was short (small EMA) but a current run is slower, the instantaneous estimate can overshoot then normalize when the phase ends and the EMA updates. User‑facing UIs should treat percentage as approximate and can smooth with an additional client‑side moving average.

Can I add a new output artifact type?

Yes. Pick a directory name, produce the file (e.g. sketch/ for line art derivation), and add it to any archival or listing logic referencing known subfolders. Avoid breaking existing readers by leaving current names intact.

How do I ensure two parallel runs do not collide in filenames?

Each run uses a GUID namespace. Collisions would require a UUIDv4 duplication (practically impossible). Within a run deterministic names are fine because there is only one writer thread.

What concurrency guarantees does the manager provide?

Single flight per logical request (GUID or composite key) guarded by a lock map. Attribute selection and prompt construction occur inside the critical section; disk writes are sequential per run. Cross‑run concurrency is allowed as long as resources (API quota, disk IO) suffice.

How do I simulate OpenAI for tests without hitting the network?

Abstract the HTTP client (interfaces or small indirection). Provide a fake that returns canned JSON resembling DalleResponse1. Tests in this repository already stub some paths; extend them by injecting your fake into context or image generation code.

What if an image download partially fails (one of several variants)?

The current code treats image fetch as critical; a single failure can mark the run failed. To implement partial resilience, change the loop to skip failed variants, record an error in progress metadata, and continue annotation for successful images.

How do I change banner styling in annotations?

Modify pkg/annotate/annotate.go (e.g. font face, padding). Colors come from analyzing pixel luminance to choose contrasting text color. You can also precompute a palette and bypass analysis for speed.

Why is there no local model fallback?

Scope control: the package focuses on orchestration patterns, not reproduction of diffusion models. Adding a local backend would require a pluggable interface abstraction (which you can add; see the Advanced chapter). The code stays lean by delegating generative complexity.

Can I stream progress events instead of polling files?

Yes. Add a channel broadcast or WebSocket layer in the progress package where UpdatePhase is called. The current system writes snapshots to disk; streaming is an additive feature.

What licensing constraints exist on generated artifacts?

This code’s LICENSE governs the orchestration logic. Image/audio outputs follow OpenAI’s usage policies, which may evolve. Always review upstream terms; this repository does not override them.

How do I report or inspect OpenAI errors?

pkg/prompt/openai_api_error.go defines OpenAIAPIError. When the API returns structured error JSON, it populates this type. Log it or surface to clients verbatim to aid debugging (strip sensitive request IDs if needed).

Why might enhancement produce an empty string?

Strong safety filters or a mis-structured prompt can cause the model to return minimal content. Guard by validating length and falling back to the base prompt if below a threshold.
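
A small guard along those lines; the threshold value and helper name are illustrative, not library API:

```go
package main

import (
	"fmt"
	"strings"
)

// chooseEnhanced falls back to the base prompt when the enhanced text is
// empty or suspiciously short after trimming whitespace.
func chooseEnhanced(base, enhanced string, minLen int) string {
	if len(strings.TrimSpace(enhanced)) < minLen {
		return base
	}
	return enhanced
}

func main() {
	base := "a quiet scholar rendered in an ancient style"
	fmt.Println(chooseEnhanced(base, "", 20) == base)      // empty: fall back
	fmt.Println(chooseEnhanced(base, "  ok  ", 20) == base) // too short: fall back
}
```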

What is the fastest path to “just get an annotated image” in code?

Call GenerateAnnotatedImage(ctx, seed, seriesFilter, orientation) on a manager instance (after creating or reusing a context) and then read the annotated/ image file named with the returned GUID.


Next: References → 14-references.md

Environment Variables & Configuration

The trueblocks-dalle library supports various configuration options through environment variables, allowing customization of behavior without code changes.

Core Configuration

Required Variables

OPENAI_API_KEY

Required for image generation, prompt enhancement, and text-to-speech

export OPENAI_API_KEY="sk-proj-..."

The OpenAI API key is used for:

  • Image Generation: DALL·E 3 API calls
  • Prompt Enhancement: GPT-4 chat completions (optional)
  • Text-to-Speech: TTS-1 model audio generation (optional)

Behavior when missing:

  • Image generation fails with an error
  • Prompt enhancement is silently skipped
  • Text-to-speech returns an empty string (no error)

Data Directory

TB_DALLE_DATA_DIR

Optional: Custom data directory location

export TB_DALLE_DATA_DIR="/custom/path/to/dalle-data"

Default locations:

  • macOS: ~/Library/Application Support/TrueBlocks
  • Linux: ~/.local/share/TrueBlocks
  • Windows: %APPDATA%/TrueBlocks

The data directory contains all generated artifacts, caches, and series configurations.

Generation Control

Prompt Enhancement

TB_DALLE_NO_ENHANCE

Optional: Disable OpenAI prompt enhancement

export TB_DALLE_NO_ENHANCE=1

When set to any non-empty value:

  • Skips OpenAI Chat API calls for prompt enhancement
  • Uses only the base structured prompt
  • Ensures completely deterministic generation
  • Reduces API costs and latency

Use cases:

  • Testing and development
  • Reproducible builds
  • Rate limiting concerns
  • Cost optimization

Image Parameters

TB_DALLE_ORIENTATION

Optional: Force specific image orientation

export TB_DALLE_ORIENTATION="portrait"   # or "landscape" or "square"

Default behavior: Auto-detection based on prompt content

  • Long prompts → landscape (1792×1024)
  • Medium prompts → square (1024×1024)
  • Short prompts → portrait (1024×1792)
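
The mapping above can be sketched as a simple length switch; the character thresholds here are assumptions, not the library's actual cutoffs:

```go
package main

import "fmt"

// pickSize maps prompt length to a DALL·E 3 size string following the
// documented auto-detection: long prompts get landscape, short ones
// portrait. The 800/300 cutoffs are illustrative.
func pickSize(prompt string) string {
	switch n := len(prompt); {
	case n > 800:
		return "1792x1024" // landscape
	case n > 300:
		return "1024x1024" // square
	default:
		return "1024x1792" // portrait
	}
}

func main() {
	fmt.Println(pickSize("short prompt")) // 1024x1792
}
```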

TB_DALLE_SIZE

Optional: Override image size

export TB_DALLE_SIZE="1024x1024"

Supported DALL·E 3 sizes:

  • 1024x1024 (square)
  • 1024x1792 (portrait)
  • 1792x1024 (landscape)

Image Quality

TB_DALLE_QUALITY

Optional: Set image quality level

export TB_DALLE_QUALITY="hd"    # or "standard"

  • standard: Faster generation, lower cost
  • hd: Higher quality, more detail, higher cost

Default: standard

Debugging and Development

Logging Control

TB_DALLE_LOG_LEVEL

Optional: Control logging verbosity

export TB_DALLE_LOG_LEVEL="debug"   # or "info", "warn", "error"

Default: info

TB_DALLE_LOG_FORMAT

Optional: Log output format

export TB_DALLE_LOG_FORMAT="json"   # or "text"

Default: text

Cache Control

TB_DALLE_NO_CACHE

Optional: Disable database caching

export TB_DALLE_NO_CACHE=1

Forces rebuilding of database caches on every run. Useful for:

  • Development testing
  • Cache corruption troubleshooting
  • Ensuring fresh data

TB_DALLE_CACHE_TTL

Optional: Context cache TTL override

export TB_DALLE_CACHE_TTL="1h"   # or "30m", "2h30m", etc.

Default: 30 minutes

Controls how long series contexts remain cached in memory.

API Endpoint Configuration

OpenAI Base URL

OPENAI_BASE_URL

Optional: Custom OpenAI endpoint

export OPENAI_BASE_URL="https://api.openai.com/v1"   # default
export OPENAI_BASE_URL="https://custom-proxy.com/v1"  # custom

Useful for:

  • Corporate proxies
  • API gateways
  • Rate limiting proxies
  • Testing with mock servers

Request Timeouts

TB_DALLE_TIMEOUT

Optional: API request timeout

export TB_DALLE_TIMEOUT="300s"   # 5 minutes

Default: Varies by operation

  • Image generation: 5 minutes
  • Prompt enhancement: 30 seconds
  • Text-to-speech: 1 minute

Text-to-Speech Configuration

Voice Selection

TB_DALLE_TTS_VOICE

Optional: Default TTS voice

export TB_DALLE_TTS_VOICE="alloy"   # default

Available voices: alloy, echo, fable, onyx, nova, shimmer

TB_DALLE_TTS_MODEL

Optional: TTS model selection

export TB_DALLE_TTS_MODEL="tts-1"   # default, faster
export TB_DALLE_TTS_MODEL="tts-1-hd"   # higher quality

Progress and Metrics

Progress Archival

TB_DALLE_ARCHIVE_PROGRESS

Optional: Enable progress run archival

export TB_DALLE_ARCHIVE_PROGRESS=1

When enabled:

  • Saves detailed timing data for completed runs
  • Builds historical performance metrics
  • Enables trend analysis
  • Increases disk usage

TB_DALLE_METRICS_RETENTION

Optional: Metrics retention period

export TB_DALLE_METRICS_RETENTION="30d"   # or "7d", "90d", etc.

Default: 7 days

Security Configuration

Path Validation

TB_DALLE_STRICT_PATHS

Optional: Enable strict path validation

export TB_DALLE_STRICT_PATHS=1

Enables additional security checks on file paths to prevent directory traversal attacks.

API Key Rotation

TB_DALLE_KEY_ROTATION

Optional: Enable API key rotation

export TB_DALLE_KEY_ROTATION=1
export OPENAI_API_KEY_BACKUP="sk-backup-key..."

Automatically falls back to backup key if primary key fails.

Development Configuration

Test Mode

TB_DALLE_TEST_MODE

Development: Enable test mode

export TB_DALLE_TEST_MODE=1

When enabled:

  • Uses mock responses instead of real API calls
  • Faster execution for testing
  • Deterministic outputs
  • No API costs

TB_DALLE_MOCK_DELAY

Development: Simulate API latency

export TB_DALLE_MOCK_DELAY="2s"

Adds artificial delay to mock responses for testing timeout handling.

Configuration Validation

Runtime Checks

The library validates configuration at startup:

func validateConfig() error {
    // Check required variables
    if os.Getenv("OPENAI_API_KEY") == "" {
        return errors.New("OPENAI_API_KEY required")
    }
    
    // Validate enum values
    if orientation := os.Getenv("TB_DALLE_ORIENTATION"); orientation != "" {
        validOrientations := []string{"portrait", "landscape", "square"}
        if !contains(validOrientations, orientation) {
            return fmt.Errorf("invalid orientation: %s", orientation)
        }
    }
    
    // Parse duration values
    if ttl := os.Getenv("TB_DALLE_CACHE_TTL"); ttl != "" {
        if _, err := time.ParseDuration(ttl); err != nil {
            return fmt.Errorf("invalid cache TTL: %s", ttl)
        }
    }
    
    return nil
}

Configuration Examples

Production Setup

# Required
export OPENAI_API_KEY="sk-proj-production-key..."

# Performance optimization
export TB_DALLE_DATA_DIR="/fast-ssd/dalle-data"
export TB_DALLE_CACHE_TTL="2h"
export TB_DALLE_QUALITY="standard"

# Monitoring
export TB_DALLE_LOG_LEVEL="info"
export TB_DALLE_LOG_FORMAT="json"
export TB_DALLE_ARCHIVE_PROGRESS=1

Development Setup

# Required
export OPENAI_API_KEY="sk-proj-development-key..."

# Fast iteration
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_NO_CACHE=1
export TB_DALLE_LOG_LEVEL="debug"

# Testing
export TB_DALLE_TEST_MODE=1
export TB_DALLE_MOCK_DELAY="100ms"

Cost-Optimized Setup

# Required
export OPENAI_API_KEY="sk-proj-budget-key..."

# Minimize API costs
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_QUALITY="standard"
export TB_DALLE_SIZE="1024x1024"
export TB_DALLE_TTS_MODEL="tts-1"

Configuration Precedence

Configuration is applied in this order (highest to lowest precedence):

  1. Environment variables (highest)
  2. Code defaults (lowest)

This allows environment-specific overrides while maintaining sensible defaults.

Changelog / Design Notes

A narrative of why the system looks the way it does. Organized by themes instead of strict time order (the Git history is the authoritative chronological record).


Core Goals

  • Deterministic scaffold around inherently stochastic model calls.
  • Small, inspectable codebase (favor clarity over abstraction).
  • File‑system artifact transparency (every stage leaves a tangible trace).
  • Graceful degradation (disable enhancement, still useful; skip TTS, still complete).

Guiding Principles

  1. Code First Documentation – Book regenerated from code truths (no speculative claims).
  2. Single Responsibility – Each package holds a narrow concern (prompting, annotation, progress, etc.).
  3. Observable by Default – Prompts, titles, annotated outputs, progress snapshots all persisted.
  4. Conservative Concurrency – Simplicity beats micro‑optimizing parallel fetches at this scale.
  5. Extensibility via Composition – Add new artifact stages by composing functions, not subclassing.

Key Decisions & Trade‑Offs

Deterministic Attribute Selection

Pros: Reproducible prompt scaffold; enables cache hits and diffable changes. Cons: Less surprise/novelty without enhancement; requires curated attribute pools. Alternative rejected: Pure random selection every run (harder to test and reason about regressions).

Optional Enhancement Layer

Rationale: Keep base system self‑contained while still offering higher quality prompts when desired. Failure mode: Enhancement adds latency and an external dependency surface; disabled in CI / tests.

File System as Persistence & Log

Pros: Zero extra service dependencies; easy manual inspection; supports archival & reproducibility. Cons: No query semantics; potential disk churn. Mitigation: run pruning or archiving env var. Alternative: Database or object store abstraction (deferred until scale justifies).

EMA for Progress Estimation

Chosen over naive fixed weights: adapts to evolving performance characteristics (network latency, model changes). Trade‑off: Early runs yield noisy estimates until EMA stabilizes.

Single Image Pipeline (Sequential Variant Handling)

Instead of parallel HTTP requests: simpler error handling and deterministic progress order. Parallelization could be added later behind a flag if throughput becomes critical.

Annotation Banner Styling

Dynamic contrast analysis for legibility vs. fixed palette. Chosen to accommodate diverse image backgrounds without manual curation.

Minimal Interface Surface

Public API intentionally thin (manager functions + a few constructors). Internal refactors less likely to trigger downstream breakage.

Evolution Highlights

  • Initial scaffold: context + prompt templates + image request.
  • Added deterministic attribute engine (seed slicing) improving reproducibility.
  • Introduced progress phase tracking; later augmented with EMA statistics.
  • Added annotation stage for immediate visual branding / metadata embedding.
  • Integrated optional prompt enhancement (OpenAI chat) with opt‑out flag.
  • Added text‑to‑speech for richer multimodal artifact sets.
  • Documentation rewrite to align with real behavior (this book).

Potential Future Enhancements

Area | Idea | Rationale
Backend abstraction | Interface for image provider | Swap OpenAI with local or alternative APIs.
Streaming progress | WebSocket or SSE emitter | Real‑time UI updates without polling.
Partial resilience | Skip failed image variants | Improve success rate under transient network errors.
Caching layer | Hash → image reuse | Avoid regeneration for identical final prompts.
Local validation | Lint prompt templates | Detect missing template fields early.
Security hardening | Strict network client wrapper | Uniform retry / backoff / logging policy.
Benchmark suite | Performance regression tests | Track latency and resource trends.
Metrics export | Prometheus counters | Operational observability for long‑running service.

Anti‑Goals (for now)

  • Full local model replication (scope creep; focus stays orchestration layer).
  • Complex plugin system (YAGNI; composition suffices).
  • Database dependence (file artifacts adequate until scale pressure).

Risks & Mitigations

Risk | Impact | Mitigation
Upstream API contract changes | Breaks generation | Version pin + response struct adaptation tests.
Disk growth | Exhaust storage | Scheduled pruning / archival compression.
Attribute list drift | Unintended prompt shifts | Version control + diff review on list edits.
Enhancement latency spikes | Slower UX | Optional disable flag + timeout wrapping.
Progress misestimation | Poor UX feedback | EMA smoothing + UI disclaimers.

Testing Philosophy

  • Deterministic layers (attribute mapping, template expansion) thoroughly unit tested.
  • Network edges are minimized and stubbed in tests; external calls are kept out of critical logic for testability.
  • Visual artifact tests kept lightweight (e.g., hash banner size or presence rather than pixel perfection).

Refactoring Guidelines

  1. Preserve public function signatures unless strong justification.
  2. Introduce new feature flags / env vars for opt‑in behavior changes.
  3. Update documentation (this book) in the same commit as semantic changes.
  4. Maintain deterministic surfaces—if you introduce randomness, gate it behind an explicit parameter.

Changelog (High‑Level Summary)

(This section intentionally summarizes; consult Git log for exact commits.)

  • v0.1: Base context, prompt templates, image generation.
  • v0.2: Attribute seed mapping + series filtering.
  • v0.3: Progress tracking + EMA.
  • v0.4: Annotation banner & color analysis.
  • v0.5: Prompt enhancement opt‑in.
  • v0.6: Text‑to‑speech integration.
  • v0.7: Comprehensive documentation rewrite (current state).

Glossary

  • Dress / DalleDress – Structured concept derived from seed & attributes feeding templates.
  • Enhancement – Model‑driven rewrite of base prompt.
  • EMA – Exponential Moving Average used for phase duration estimation.
  • Run GUID – Unique identifier for one full generation pipeline execution.

How to Contribute Design Changes

Open an issue outlining: Problem → Proposed Change → Alternatives → Impact on determinism → Migration considerations. Link directly to affected code lines (permalinks). Update this file if rationale extends beyond a short commit message.


Next: (End) – You have reached the final chapter.