Introduction
trueblocks-dalle is a Go library (module github.com/TrueBlocks/trueblocks-dalle/v6) that deterministically generates AI art by converting seed strings (typically Ethereum addresses) into structured semantic attributes, building layered natural-language prompts, optionally enhancing those prompts through OpenAI Chat, generating images through OpenAI's DALL·E API, annotating images with captions, and providing optional text-to-speech narration.
This is not a generic wrapper around OpenAI. It is a deterministic prompt orchestration and artifact pipeline designed for reproducible creative output.
Core Properties
- Deterministic Attribute Derivation: Seeds are sliced into 6-hex-character windows that map to indexed rows across curated databases (adjectives, nouns, emotions, art styles, etc.)
- Layered Prompt System: Multiple template formats including data, title, terse, full prompt, and optional enhancement via OpenAI Chat
- Series-Based Filtering: Optional JSON-backed filter lists that constrain which database entries are available for each attribute type
- Context Management: LRU + TTL cache of loaded contexts to handle multiple series without unbounded memory growth
- Complete Artifact Pipeline: Persistent output directory structure storing prompts, images (generated and annotated), JSON metadata, and optional audio
- Progress Tracking: Fine-grained phase tracking with ETA estimation, exponential moving averages, and optional run archival
- Image Annotation: Dynamic palette-based background generation with contrast-aware text rendering
- Text-to-Speech: Optional prompt narration via OpenAI TTS
Architecture Overview
The library is organized into several key packages:
| Package | Purpose |
|---|---|
| Root package | Public API, context management, series CRUD, main generation orchestration |
| pkg/model | Core data structures (DalleDress, attributes, types) |
| pkg/prompt | Template definitions, attribute derivation, OpenAI enhancement |
| pkg/image | Image generation, download, processing coordination |
| pkg/annotate | Image annotation with dynamic backgrounds and text |
| pkg/progress | Phase-based progress tracking with metrics |
| pkg/storage | Data directory management, database caching, file operations |
| pkg/utils | Utility functions for various operations |
Generation Flow
- Context Resolution: Get or create a cached Context for the specified series
- Attribute Derivation: Slice the seed string and map chunks to database entries, respecting series filters
- Prompt Construction: Execute multiple templates (data, title, terse, full) using selected attributes
- Optional Enhancement: Use OpenAI Chat to rewrite the prompt (if enabled and API key present)
- Image Generation: POST to OpenAI Images API, handle download or base64 decoding
- Image Annotation: Add terse caption with palette-based background and contrast-safe text
- Artifact Persistence: Save all outputs (prompts, images, JSON, optional audio) to organized directory structure
- Progress Updates: Track timing through all phases for metrics and ETA calculation
Key Data Structures
- Context: Contains templates, database slices, in-memory DalleDress cache, and series configuration
- DalleDress: Complete snapshot of generation state including all prompts, paths, attributes, and metadata
- Series: JSON-backed configuration with attribute filters and metadata
- Attribute: Individual semantic unit derived from seed slice and database lookup
- ProgressReport: Real-time generation phase tracking with percentages and ETA
Determinism & Reproducibility
Given the same seed string and series configuration, the library produces identical results through the image generation step. The only non-deterministic component is optional prompt enhancement via OpenAI Chat, which can be disabled with TB_DALLE_NO_ENHANCE=1.
All artifacts are persisted with predictable file paths, enabling caching, auditing, and external processing.
When to Use
- Need reproducible AI image generation from deterministic seeds
- Want structured attribute-driven prompt construction
- Require complete artifact trails for auditing or caching
- Building applications that generate visual identities from addresses or tokens
- Need progress tracking for long-running generation processes
When Not to Use
- Need batch generation of multiple images per prompt
- Require offline execution (depends on OpenAI APIs unless stubbed)
- Want completely free-form prompt construction outside the template system
- Need real-time streaming generation
Next Steps
Jump to the Quick Start for immediate usage examples, or continue to Architecture Overview for deeper system understanding.
Quick Start
This walkthrough shows how to use the main public API functions with minimal code and explains where artifacts are stored.
Prerequisites
Environment Setup
Set your OpenAI API key (required for image generation, enhancement, and text-to-speech):
export OPENAI_API_KEY="sk-..."
Optionally configure a custom data directory (defaults to platform-specific location):
export TB_DALLE_DATA_DIR="/path/to/your/dalle-data"
Optional: disable prompt enhancement for faster/deterministic runs:
export TB_DALLE_NO_ENHANCE=1
Installation
go get github.com/TrueBlocks/trueblocks-dalle/v6@latest
Basic Usage
Simple Image Generation
package main
import (
"fmt"
"log"
"time"
dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)
func main() {
series := "demo"
address := "0x1234abcd5678ef901234abcd5678ef901234abcd"
// Generate annotated image (full pipeline)
imagePath, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Generated annotated image: %s\n", imagePath)
// Optional: Generate speech narration
audioPath, err := dalle.GenerateSpeech(series, address, 5*time.Minute)
if err != nil {
log.Printf("Speech generation failed: %v", err)
} else if audioPath != "" {
fmt.Printf("Generated speech: %s\n", audioPath)
}
}
Progress Tracking
package main
import (
"fmt"
"time"
dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)
func main() {
series := "demo"
address := "0xabcdef1234567890abcdef1234567890abcdef12"
// Start generation in a goroutine
go func() {
_, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
if err != nil {
fmt.Printf("Generation failed: %v\n", err)
}
}()
// Monitor progress
for {
progress := dalle.GetProgress(series, address)
if progress == nil {
fmt.Println("No active progress")
break
}
fmt.Printf("Phase: %s, Progress: %.1f%%, ETA: %ds\n",
progress.Current, progress.Percent, progress.ETASeconds)
if progress.Done {
fmt.Println("Generation completed!")
break
}
time.Sleep(1 * time.Second)
}
}
Series Management
package main
import (
"fmt"
dalle "github.com/TrueBlocks/trueblocks-dalle/v6"
)
func main() {
// List all available series
series := dalle.ListSeries()
fmt.Printf("Available series: %v\n", series)
// Clean up artifacts for a specific series/address
dalle.Clean("demo", "0x1234...")
// Get context count (for monitoring cache usage)
count := dalle.ContextCount()
fmt.Printf("Cached contexts: %d\n", count)
}
Generated Artifacts
Running the examples above creates the following directory structure under your data directory:
$TB_DALLE_DATA_DIR/
└── output/
└── <series>/
├── data/
│ └── <address>.txt # Raw attribute data
├── title/
│ └── <address>.txt # Human-readable title
├── terse/
│ └── <address>.txt # Short caption
├── prompt/
│ └── <address>.txt # Full structured prompt
├── enhanced/
│ └── <address>.txt # OpenAI-enhanced prompt (if enabled)
├── generated/
│ └── <address>.png # Raw generated image
├── annotated/
│ └── <address>.png # Image with caption overlay
├── selector/
│ └── <address>.json # Complete DalleDress metadata
└── audio/
└── <address>.mp3 # Text-to-speech audio (if generated)
Caching Behavior
- Cache hits: If an annotated image already exists, GenerateAnnotatedImage returns immediately
- Incremental generation: Individual artifacts are cached, so partial runs can resume
- Context caching: Series configurations are cached in memory with LRU eviction
Error Handling
imagePath, err := dalle.GenerateAnnotatedImage(series, address, false, 5*time.Minute)
if err != nil {
switch {
case strings.Contains(err.Error(), "API key"):
log.Fatal("OpenAI API key required")
case strings.Contains(err.Error(), "address required"):
log.Fatal("Valid address string required")
default:
log.Fatalf("Generation failed: %v", err)
}
}
Next Steps
- Architecture Overview - Understand the system design
- API Reference - Complete function documentation
- Series & Attributes - Learn about customization
Architecture Overview
The trueblocks-dalle library generates AI art deterministically from seed strings (like Ethereum addresses).
How It Works
The library implements a deterministic AI art generation pipeline that converts seed strings into structured semantic attributes, builds layered prompts, generates images via OpenAI APIs, and produces complete artifact sets with progress tracking.
Seed String → Select Attributes → Build Prompts → Generate Image → Add Caption → Save Files
Image Creation → Annotation → Artifact Persistence → Optional TTS
Key Concepts
- Deterministic: Same seed always produces same output
- Attribute-Driven: Seed chunks map to curated word lists (adjectives, styles, etc.)
- Template-Based: Multiple prompt formats for different purposes
- Complete Pipeline: Handles everything from prompt to final annotated image
Package Structure
The library is organized into focused packages:
Root Package (github.com/TrueBlocks/trueblocks-dalle/v6)
| File | Responsibility |
|---|---|
| context.go | Context struct, database loading, prompt generation orchestration |
| manager.go | Context lifecycle management, LRU cache, public API functions |
| series.go | Series struct definition and core methods |
| series_crud.go | Series persistence, filtering, and management operations |
| text2speech.go | OpenAI TTS integration and audio generation |
Core Packages
| Package | Purpose | Key Files |
|---|---|---|
| pkg/model | Data structures and types | dalledress.go, types.go |
| pkg/prompt | Template system and attribute derivation | prompt.go, attribute.go |
| pkg/image | Image generation and processing | image.go |
| pkg/annotate | Image annotation with text overlays | annotate.go |
| pkg/progress | Phase tracking and metrics | progress.go |
| pkg/storage | Data directory and cache management | datadir.go, cache.go, database.go |
| pkg/utils | Utility functions | Various utility files |
Core Components
1. Context Management
Purpose: Manages loaded series configurations, template compilation, and database caching.
type Context struct {
Series Series
Databases map[string][]string
DalleCache map[string]*model.DalleDress
CacheMutex sync.Mutex
promptTemplate *template.Template
// ... additional templates
}
Key Operations:
- Series loading and filter application
- Database slicing based on series constraints
- DalleDress creation and caching
- Template execution for multiple prompt formats
2. Series System
Purpose: Provides configurable filtering for attribute databases, enabling customized generation behavior.
type Series struct {
Suffix string `json:"suffix"`
Purpose string `json:"purpose,omitempty"`
Deleted bool `json:"deleted,omitempty"`
Adjectives []string `json:"adjectives"`
Nouns []string `json:"nouns"`
Emotions []string `json:"emotions"`
// ... additional attribute filters
}
Features:
- JSON-backed persistence
- Optional filtering for each attribute type
- Soft deletion with recovery
- Hierarchical organization
3. Prompt Generation Pipeline
Purpose: Converts deterministic attributes into multiple prompt formats using Go templates.
Template Types:
- Data Template: Raw attribute listing
- Title Template: Human-readable title generation
- Terse Template: Short caption text
- Prompt Template: Full structured prompt for image generation
- Author Template: Attribution information
Enhancement Flow:
Base Prompt → (Optional) OpenAI Chat Enhancement → Final Prompt
4. Attribute Derivation
Purpose: Deterministically maps seed chunks to database entries.
Process:
- Normalize seed string (remove 0x, lowercase, pad)
- Split into 6-character hex chunks
- Map each chunk to database index via modulo
- Apply series filters if present
- Return selected attribute records
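The derivation steps above can be sketched in Go. The helper names (normalizeSeed, deriveIndex) and the 100-row database length are illustrative, not the library's API:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// normalizeSeed strips the 0x prefix and lowercases, mirroring the
// normalization step described above (sketch only).
func normalizeSeed(seed string) string {
	return strings.ToLower(strings.TrimPrefix(seed, "0x"))
}

// deriveIndex parses a 6-hex-char chunk as an integer and maps it onto
// a database of length n via modulo.
func deriveIndex(chunk string, n int) int {
	v, _ := strconv.ParseUint(chunk, 16, 64)
	return int(v % uint64(n))
}

func main() {
	seed := normalizeSeed("0x1234ABCD5678EF901234ABCD5678EF901234ABCD")
	// Slice the first few 6-char chunks and map each into a
	// hypothetical 100-row database.
	for i := 0; i+6 <= len(seed) && i < 18; i += 6 {
		chunk := seed[i : i+6]
		fmt.Printf("%s -> %d\n", chunk, deriveIndex(chunk, 100))
	}
}
```

Because the mapping is pure arithmetic on the seed, the same address always selects the same rows.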
5. Image Generation
Purpose: Coordinates OpenAI DALL·E API calls with automatic retry, size detection, and download handling.
Features:
- Orientation detection (portrait/landscape/square)
- Size optimization based on prompt length
- Base64 and URL download support
- Retry logic with exponential backoff
- Progress tracking integration
6. Image Annotation
Purpose: Adds caption overlays with dynamic background generation and contrast optimization.
Process:
- Analyze image palette for dominant colors
- Generate contrasting background banner
- Calculate optimal font size and positioning
- Render text with anti-aliasing
- Composite final annotated image
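The contrast step can be illustrated with a standard relative-luminance calculation; the textColor helper and its 0.5 threshold are assumptions, not the library's actual renderer:

```go
package main

import "fmt"

// textColor picks black or white text based on the luminance of the
// banner background, using the Rec. 709 luma coefficients. The
// threshold (0.5) is an illustrative choice.
func textColor(r, g, b float64) string {
	lum := 0.2126*r + 0.7152*g + 0.0722*b // inputs in [0,1]
	if lum > 0.5 {
		return "black" // light background -> dark text
	}
	return "white" // dark background -> light text
}

func main() {
	fmt.Println(textColor(0.9, 0.9, 0.8)) // light banner
	fmt.Println(textColor(0.1, 0.1, 0.3)) // dark banner
}
```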
7. Progress Tracking
Purpose: Provides real-time generation monitoring with phase timing and ETA calculation.
Phases:
const (
PhaseSetup Phase = "setup"
PhaseBasePrompts Phase = "base_prompts"
PhaseEnhance Phase = "enhance_prompt"
PhaseImagePrep Phase = "image_prep"
PhaseImageWait Phase = "image_wait"
PhaseImageDownload Phase = "image_download"
PhaseAnnotate Phase = "annotate"
PhaseCompleted Phase = "completed"
PhaseFailed Phase = "failed"
)
Features:
- Exponential moving averages for ETA calculation
- Optional run archival for historical analysis
- Concurrent access safety
- Cache hit detection
Data Flow Architecture
1. Context Resolution
Input: (series, address)
↓
Manager checks LRU cache
↓
If miss: Create new context, load series, filter databases
↓
Return cached context
2. DalleDress Creation
Context + Address
↓
Check DalleDress cache
↓
If miss: Derive attributes, execute templates, persist
↓
Return DalleDress with all prompt variants
3. Image Generation
DalleDress + API Key
↓
Determine orientation and size
↓
POST to OpenAI Images API
↓
Download/decode image
↓
Save to generated/ directory
4. Annotation
Generated Image + Terse Caption
↓
Analyze image palette
↓
Generate contrasting background
↓
Render text overlay
↓
Save to annotated/ directory
Storage Architecture
Directory Structure
$DATA_DIR/
├── output/
│ └── <series>/
│ ├── data/ # Attribute dumps
│ ├── title/ # Human titles
│ ├── terse/ # Captions
│ ├── prompt/ # Base prompts
│ ├── enhanced/ # Enhanced prompts
│ ├── generated/ # Raw images
│ ├── annotated/ # Captioned images
│ ├── selector/ # DalleDress JSON
│ └── audio/ # TTS audio
├── cache/
│ ├── databases.cache # Binary database cache
│ └── temp/ # Temporary files
├── series/ # Series configurations
└── metrics/ # Progress metrics
Caching Strategy
- Context Cache: LRU with TTL eviction prevents unbounded memory growth
- Database Cache: Binary serialization of processed CSV databases
- Artifact Cache: File existence checks enable fast cache hits
- Progress Cache: In-memory tracking with optional persistence
Integration Points
OpenAI APIs
- Chat Completions (optional): Prompt enhancement
- Images (required): DALL·E 3 generation
- Audio/Speech (optional): TTS narration
External Dependencies
- github.com/TrueBlocks/trueblocks-core: Logging and file utilities
- github.com/TrueBlocks/trueblocks-sdk: SDK integration
- git.sr.ht/~sbinet/gg: Graphics rendering for annotation
- github.com/lucasb-eyer/go-colorful: Color analysis
Error Handling Strategy
Network Resilience
- Exponential backoff for API retries
- Timeout configuration per operation type
- Graceful degradation when services unavailable
Data Integrity
- Atomic file operations to prevent corruption
- Checksum validation for caches
- Path traversal prevention
Recovery Mechanisms
- Automatic cache rebuilding on corruption
- Partial pipeline resumption via artifact caching
- Context recreation on management errors
Extensibility Points
Custom Providers
- Replace image.RequestImage for alternative generation services
- Implement custom annotation renderers
- Add new attribute databases
- Add new attribute databases
Template System
- Add new prompt templates for different formats
- Customize enhancement prompts for specific use cases
- Extend attribute derivation logic
Progress Integration
- Custom progress reporters for external monitoring
- Metric exporters for observability systems
- Archive processors for historical analysis
This architecture ensures scalable, reliable, and maintainable AI art generation while preserving deterministic behavior and comprehensive auditability.
Each phase completion updates a moving average (unless the run was a cache hit). Percent and ETA are derived by comparing the sum of elapsed (or average) phase durations against the sum of all phase averages.
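A minimal sketch of this moving-average bookkeeping; the alpha value, phase names, and durations below are invented for illustration:

```go
package main

import "fmt"

// ema applies an exponential moving average update. A zero average is
// treated as "unseeded" and takes the sample directly.
func ema(avg, sample, alpha float64) float64 {
	if avg == 0 {
		return sample
	}
	return alpha*sample + (1-alpha)*avg
}

func main() {
	// Hypothetical per-phase average durations in seconds.
	avgs := map[string]float64{"base_prompts": 0.5, "image_wait": 20, "annotate": 1.5}
	total := 0.0
	for _, v := range avgs {
		total += v
	}
	// Finished base_prompts (0.5s) and 8s into image_wait.
	elapsed := 0.5 + 8.0
	fmt.Printf("percent: %.1f%%\n", 100*elapsed/total)
	fmt.Printf("eta: %.1fs\n", total-elapsed)
	// After image_wait completes in 18s, fold it into the average.
	fmt.Printf("new image_wait avg: %.2f\n", ema(avgs["image_wait"], 18, 0.3))
}
```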
Error Strategies
- Network errors wrap into typed errors where practical (OpenAIAPIError).
- Missing API key yields placeholder or skipped enhancement/image steps without failing the pipeline.
- File path traversal is prevented via cleaned absolute path prefix checks.
Extending
Replace image.RequestImage for alternate providers; add new databases + methods on DalleDress for extra semantic dimensions; or decorate progress manager for custom telemetry.
Next: Context & Manager.
Context & Manager
This chapter drills into context construction, caching, and lifecycle management.
Context Responsibilities
context.go defines Context which bundles:
- Templates (prompt, data, title, terse, author)
- In-memory Databases (map: database name -> slice of CSV row strings)
- Series metadata (filters & suffix)
- DalleCache (address -> *DalleDress)
The context owns pure prompt state; it does not perform network calls (image generation and enhancement are separate functions using the context’s outputs).
Building a Context
NewContext():
- Loads cache manager (storage.GetCacheManager().LoadOrBuild())
- Initializes template pointers from prompt package variables
- Creates empty maps
- Calls ReloadDatabases("empty") to seed initial series
Database Loading
ReloadDatabases(filter string):
- Loads Series via loadSeries(filter)
- For each name in prompt.DatabaseNames, tries cached binary index → falls back to CSV
- Applies optional per-field filtering from Series.GetFilter(fieldName)
- Ensures at least one row ("none") to avoid zero-length selection panics
Constructing a DalleDress
MakeDalleDress(address string):
- Normalizes key (filename-safe) and returns cached instance if present
- Builds a seed = original + reverse(original); enforces length >= 66; strips 0x
- Iteratively slices 6 hex chars every 8 chars; maps them into attributes until databases exhausted
- Builds prompt layers by executing templates; conditionally loads an enhanced prompt from disk if present
- Stores under both original and normalized cache keys for future hits
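The seed-construction step above can be sketched as follows; buildSeed is a hypothetical helper that omits the library's exact padding rules:

```go
package main

import (
	"fmt"
	"strings"
)

// reverse returns the string with its characters in reverse order.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// buildSeed sketches the construction described above:
// original + reverse(original), with the leading 0x stripped.
func buildSeed(address string) string {
	seed := address + reverse(address)
	return strings.TrimPrefix(seed, "0x")
}

func main() {
	seed := buildSeed("0x1234abcd5678ef901234abcd5678ef901234abcd")
	fmt.Println(len(seed)) // a 42-char address yields 82 chars, >= 66
	fmt.Println(seed[:12])
}
```

Doubling the address guarantees enough hex material for every database slot without extra entropy sources.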
Thread Safety
CacheMutex protects DalleCache. Additional saveMutex guards concurrent file writes in reportOn.
Manager Layer
manager.go adds an LRU+TTL around contexts so each series has at most one resident context. Key pieces:
- managedContext struct holds context + lastUsed timestamp
- Global contextManager map + order slice
- ManagerOptions (MaxContexts, ContextTTL) adjustable via ConfigureManager
- Eviction: contexts older than TTL are dropped; if still above capacity, the least-recently-used is removed
Generation Entry Point
GenerateAnnotatedImage(series, address, skipImage, lockTTL):
- Early return if annotated image already exists (synthetic cache hit progress run created)
- Acquire per-(series,address) lock with TTL to avoid duplicate concurrent generations
- Build / fetch context and DalleDress
- Start and transition progress phases (base prompts → enhance → image...) unless skipImage
- Delegate to Context.GenerateImageWithBaseURL for image pipeline
- Mark completion, update metrics
skipImage=true still produces prompt artifacts but bypasses network phases.
Locks
A map of requestLocks with TTL prevents burst duplicate work. Expired locks are cleaned opportunistically.
Cache Hit Shortcut
If annotated/<address>.png exists the system:
- Builds DalleDress (ensures consistent metadata)
- Starts a progress run (if one doesn’t already exist)
- Marks cacheHit + completed without regenerating
Cleaning Artifacts
Clean(series, address) removes the generated set: annotated png, raw image, selector JSON, audio, and prompt text files across all prompt subdirectories.
When to Add a New Context Field
Add new fields only if they reflect deterministic state or necessary caches. Side-effectful network concerns belong outside.
Extension Strategies
- Alternate Persistence: wrap reportOn or post-process after GenerateAnnotatedImage.
- Custom Prompt Layers: execute additional templates with DalleDress.FromTemplate.
- Series Variants: manage multiple series suffixes and rely on manager eviction for memory control.
Next: Series & Attribute Databases
Series & Attribute Databases
Purpose
A Series constrains or themes generations by restricting which rows from each logical database may be selected during attribute derivation. It also names the output namespace (folder suffix).
Database Order
Defined in prompt.DatabaseNames (order is significant for deterministic mapping):
adverbs, adjectives, nouns, emotions, occupations, actions,
artstyles, artstyles, litstyles,
colors, colors, colors,
orientations, gazes, backstyles
Duplicated entries (artstyles, colors) allow multiple independent selections without custom logic.
Raw Rows
Each database loads as a slice of strings (CSV lines, version prefixes stripped). Rows are treated as opaque until later parsed by accessor methods in DalleDress (splitting on commas, trimming pieces, etc.).
Series JSON Schema (excerpt)
{
"suffix": "demo",
"adverbs": ["swiftly", "boldly"],
"adjectives": [],
"nouns": [],
"emotions": ["joy"],
"deleted": false
}
Only non-empty slices act as filters. If a slice is empty, no filtering occurs for that category.
Filtering Logic
For each database:
- Load full slice (cache index → fallback CSV)
- If the corresponding Series slice is non-empty, retain rows containing any filter substring
- If the resulting slice is empty, insert a sentinel "none" to avoid selection panics
Substring containment (not exact match) enables flexible partial filters but may admit unintended rows; prefer distinctive tokens.
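A sketch of this filtering behavior, with filterRows as a hypothetical helper rather than the library's function:

```go
package main

import (
	"fmt"
	"strings"
)

// filterRows keeps rows containing any filter substring, inserting a
// "none" sentinel when everything is filtered out, as described above.
func filterRows(rows, filters []string) []string {
	if len(filters) == 0 {
		return rows // empty filter slice means no filtering
	}
	var out []string
	for _, row := range rows {
		for _, f := range filters {
			if strings.Contains(row, f) {
				out = append(out, row)
				break
			}
		}
	}
	if len(out) == 0 {
		out = []string{"none"} // sentinel avoids zero-length selection
	}
	return out
}

func main() {
	rows := []string{"ultramarine,blue", "amber,orange", "sage green,green"}
	fmt.Println(filterRows(rows, []string{"amber"}))   // substring match
	fmt.Println(filterRows(rows, nil))                 // unfiltered
	fmt.Println(filterRows(rows, []string{"crimson"})) // sentinel
}
```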
Attribute Construction Recap
prompt.NewAttribute(dbs, index, bytes):
- Interprets 6 hex chars as number → factor in [0,1)
- Scales to database length to pick selector index
- Captures value string; accessor methods later format for prompt templates
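The factor mapping can be sketched as follows; selectorIndex is an illustrative helper, not the library's function:

```go
package main

import (
	"fmt"
	"strconv"
)

// selectorIndex sketches the scaling step: 6 hex chars become a number
// in [0, 16^6), normalized to a factor in [0,1), then scaled by the
// database length to pick a row.
func selectorIndex(hex6 string, dbLen int) int {
	v, _ := strconv.ParseUint(hex6, 16, 64)
	factor := float64(v) / float64(1<<24) // 16^6 == 2^24
	return int(factor * float64(dbLen))
}

func main() {
	fmt.Println(selectorIndex("000000", 50)) // smallest chunk -> first row
	fmt.Println(selectorIndex("ffffff", 50)) // largest chunk -> last row
	fmt.Println(selectorIndex("800000", 50)) // midpoint -> middle row
}
```

Scaling by a factor (rather than raw modulo) keeps the selection uniform regardless of database length.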
Extending with a New Attribute
- Add a database file & loader logic (mirroring existing ones)
- Append names to DatabaseNames and attributeNames in the same positional slot
- Add a slice field to Series (exported, plural) for potential filtering
- Create an accessor on DalleDress (e.g. func (d *DalleDress) Weather(short bool) string)
- Update templates (promptTemplateStr, etc.) to include the new semantic
- Regenerate docs
Changing the order of DatabaseNames is a breaking change to deterministic mapping and should be avoided after release.
Pitfalls
- Over-filtering (e.g. selecting a single emotion) reduces variety and can cause visually repetitive outputs.
- Adding a new attribute without updating templates yields unused entropy.
- Removing an attribute breaks existing cached serialized DalleDress JSON consumers expecting that field.
Example Filter Use Case
To produce a cohesive color-themed series, populate the colors slice in the series JSON with a shortlist (e.g. "ultramarine", "amber"). Those rows will dominate selection while other attributes still vary.
Next: Prompt Generation Pipeline
Storage Architecture & Data Directories
The trueblocks-dalle library organizes all data using a structured directory hierarchy managed by the pkg/storage package. This chapter explains the storage architecture, data directory resolution, and file organization patterns.
Data Directory Resolution
Default Location
The library automatically determines an appropriate data directory based on the platform:
func DataDir() string {
if dir := os.Getenv("TB_DALLE_DATA_DIR"); dir != "" {
return dir
}
// Falls back to platform-specific defaults
}
Platform defaults:
- macOS: ~/Library/Application Support/TrueBlocks
- Linux: ~/.local/share/TrueBlocks
- Windows: %APPDATA%/TrueBlocks
Environment Override
Set TB_DALLE_DATA_DIR to use a custom location:
export TB_DALLE_DATA_DIR="/custom/path/to/dalle-data"
Directory Structure
The data directory contains several key subdirectories:
$TB_DALLE_DATA_DIR/
├── output/ # Generated artifacts (images, prompts, audio)
├── cache/ # Database and context caches
├── series/ # Series configuration files
└── metrics/ # Progress timing data
Output Directory
Generated artifacts are organized by series under output/:
output/
└── <series-name>/
├── data/ # Raw attribute data dumps
├── title/ # Human-readable titles
├── terse/ # Short captions
├── prompt/ # Full structured prompts
├── enhanced/ # OpenAI-enhanced prompts
├── generated/ # Raw DALL·E generated images
├── annotated/ # Images with caption overlays
├── selector/ # Complete DalleDress JSON metadata
└── audio/ # Text-to-speech MP3 files
Each subdirectory contains files named <address>.ext where:
- address is the input seed string (typically an Ethereum address)
- ext is the appropriate file extension (.txt, .png, .json, .mp3)
Cache Directory
The cache directory stores processed database indexes and temporary files:
cache/
├── databases.cache # Binary database cache file
├── series.cache # Series configuration cache
└── temp/ # Temporary files during processing
Series Directory
Series configurations are stored as JSON files:
series/
├── default.json # Default series configuration
├── custom-series.json # Custom series with filters
└── deleted/ # Soft-deleted series
└── old-series.json
File Path Utilities
The storage package provides utilities for constructing paths:
Core Functions
// Base directories
func DataDir() string // Main data directory
func OutputDir() string // output/ subdirectory
func SeriesDir() string // series/ subdirectory
func CacheDir() string // cache/ subdirectory
// Path construction
func EnsureDir(path string) error // Create directory if needed
func CleanPath(path string) string // Sanitize file paths
Path Security
All file operations include security checks to prevent directory traversal:
// Example from annotate.go
cleanName := filepath.Clean(fileName)
if !strings.Contains(cleanName, string(os.PathSeparator)+"generated"+string(os.PathSeparator)) {
return "", fmt.Errorf("invalid image path: %s", fileName)
}
Artifact Lifecycle
Creation Flow
- Directory Creation: Output directories are created as needed during generation
- Incremental Writing: Artifacts are written as they're generated (prompts → image → annotation)
- Atomic Operations: Files are written atomically to prevent corruption
- Metadata Updates: JSON metadata is updated throughout the process
Caching Strategy
- Existence Checks: If an annotated image exists, the pipeline returns immediately (cache hit)
- Incremental Processing: Individual artifacts are cached, allowing partial resume
- Selective Regeneration: Only missing or outdated artifacts are regenerated
Cleanup Operations
The Clean function removes all artifacts for a series/address pair:
func Clean(series, address string) {
// Removes files from all output subdirectories
// Clears cached DalleDress entries
// Updates progress tracking
}
Database Storage
Embedded Databases
Attribute databases are embedded in the binary as compressed tar.gz archives:
pkg/storage/databases.tar.gz # Compressed attribute databases
Cache Format
Processed databases are cached in binary format for fast loading:
type DatabaseCache struct {
Version string // Cache version
Timestamp int64 // Creation time
Databases map[string]DatabaseIndex // Processed indexes
Checksum string // Validation checksum
SourceHash string // Source data hash
}
Cache Validation
The cache system validates integrity on load:
- Checksum Verification: Ensures cache file hasn't been corrupted
- Source Hash Check: Detects if embedded databases have changed
- Version Compatibility: Handles cache format changes
- Automatic Rebuild: Rebuilds cache if validation fails
Performance Considerations
Directory Operations
- Lazy Creation: Directories are created only when needed
- Path Caching: Resolved paths are cached to avoid repeated filesystem calls
- Batch Operations: Multiple files in the same directory are processed efficiently
Storage Optimization
- Binary Caching: Database indexes use efficient binary serialization
- Compression: Embedded databases are compressed to reduce binary size
- Selective Loading: Only required database sections are loaded into memory
Cleanup Strategies
- Automatic Cleanup: Temporary files are cleaned up on completion or failure
- LRU Eviction: Context cache uses LRU eviction to prevent unbounded growth
- Configurable Retention: TTL settings control how long contexts remain cached
Error Handling
Common Storage Errors
// Permission issues
if os.IsPermission(err) {
// Handle insufficient filesystem permissions
}
// Disk space issues
if strings.Contains(err.Error(), "no space left") {
// Handle disk space exhaustion
}
// Path traversal attempts
if strings.Contains(err.Error(), "invalid path") {
// Handle security violations
}
Recovery Strategies
- Graceful Degradation: Continue operation when non-critical files can't be written
- Cache Rebuilding: Automatically rebuild corrupted caches
- Alternative Paths: Fall back to temporary directories if primary locations fail
Integration Points
With Context Management
- Series configurations are loaded from the series directory
- Context cache uses storage utilities for persistence
- Database loading integrates with the cache management system
With Progress Tracking
- Progress metrics are persisted to the data directory
- Temporary run state is stored in cache directory
- Completed runs can optionally archive detailed timing data
With Generation Pipeline
- Each generation phase writes artifacts to appropriate subdirectories
- File existence checks drive caching decisions
- Path resolution ensures consistent artifact locations
This storage architecture provides a robust foundation for reproducible, auditable, and efficient artifact management throughout the generation pipeline.
Database Caching & Management
The trueblocks-dalle library uses a sophisticated caching system to manage attribute databases efficiently. This chapter explains how databases are loaded, cached, and accessed during generation.
Database Architecture
Embedded Databases
The library includes curated databases of semantic attributes embedded as compressed archives:
pkg/storage/databases.tar.gz # Compressed CSV databases
pkg/storage/series.tar.gz # Default series configurations
Database Types
The system includes these semantic databases:
| Database | Purpose | Example Entries |
|---|---|---|
| adjectives | Descriptive attributes | "mysterious", "elegant", "ancient" |
| adverbs | Manner modifiers | "gracefully", "boldly", "subtly" |
| nouns | Core subjects | "warrior", "scholar", "merchant" |
| emotions | Emotional states | "contemplative", "joyful", "melancholic" |
| occupations | Professional roles | "architect", "botanist", "craftsperson" |
| actions | Physical activities | "meditating", "dancing", "reading" |
| artstyles | Artistic movements | "impressionist", "art nouveau", "bauhaus" |
| litstyles | Literary styles | "romantic", "gothic", "minimalist" |
| colors | Color palettes | "cerulean", "burnt sienna", "sage green" |
| viewpoints | Camera angles | "bird's eye view", "close-up", "wide shot" |
| gazes | Eye directions | "looking away", "direct gaze", "upward" |
| backstyles | Background types | "cosmic void", "forest clearing", "urban" |
| compositions | Layout patterns | "rule of thirds", "centered", "asymmetric" |
Cache System
Cache Manager
The CacheManager provides centralized access to processed databases:
type CacheManager struct {
mu sync.RWMutex
cacheDir string
dbCache *DatabaseCache
loaded bool
}
func GetCacheManager() *CacheManager {
// Returns singleton instance
}
Cache Loading
func (cm *CacheManager) LoadOrBuild() error {
// 1. Try to load existing cache
// 2. Validate cache integrity
// 3. Rebuild if invalid or missing
// 4. Update loaded flag
}
Cache Structure
type DatabaseCache struct {
Version string `json:"version"`
Timestamp int64 `json:"timestamp"`
Databases map[string]DatabaseIndex `json:"databases"`
Checksum string `json:"checksum"`
SourceHash string `json:"sourceHash"`
}
type DatabaseIndex struct {
Name string `json:"name"`
Version string `json:"version"`
Records []DatabaseRecord `json:"records"`
Lookup map[string]int `json:"lookup"`
}
type DatabaseRecord struct {
Key string `json:"key"`
Values []string `json:"values"`
}
Database Loading Process
1. Cache Validation
func (cm *CacheManager) validateCache() bool {
// Check file existence
// Verify checksum integrity
// Compare source hash with embedded data
// Validate version compatibility
}
2. Database Extraction
If cache is invalid, databases are extracted from embedded archives:
func extractDatabases() error {
// Extract databases.tar.gz
// Parse CSV files
// Build lookup indexes
// Generate checksums
}
3. Index Building
Each database CSV is processed into an efficient lookup structure:
CSV Format:
key,value1,value2,version
warrior,brave fighter,medieval soldier,1.0
scholar,learned person,academic researcher,1.0
Index Structure:
{
"Name": "nouns",
"Version": "1.0",
"Records": [
{"Key": "warrior", "Values": ["brave fighter", "medieval soldier", "1.0"]},
{"Key": "scholar", "Values": ["learned person", "academic researcher", "1.0"]}
],
"Lookup": {"warrior": 0, "scholar": 1}
}
4. Binary Serialization
Processed indexes are serialized using Go's gob encoding for fast loading:
func saveCacheLocked(cm *CacheManager) error {
file, err := os.Create(cm.cacheFile())
if err != nil {
return err
}
defer file.Close()
encoder := gob.NewEncoder(file)
return encoder.Encode(cm.dbCache)
}
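Loading is the mirror image of saving: a gob.Decoder over the same file restores the full index in one call. A minimal round-trip sketch (using a trimmed-down DatabaseCache, not the library's full type, and a buffer instead of a file):

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// DatabaseCache mirrors the structure shown above, trimmed for the sketch.
type DatabaseCache struct {
	Version  string
	Checksum string
}

// roundTrip encodes the cache with gob and decodes it back, the same
// pattern saveCacheLocked and its loading counterpart would use on disk.
func roundTrip(in DatabaseCache) (DatabaseCache, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return DatabaseCache{}, err
	}
	var out DatabaseCache
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(DatabaseCache{Version: "1.0", Checksum: "abc"})
	fmt.Println(out.Version, err)
}
```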
Attribute Selection
Deterministic Lookup
Attributes are selected deterministically using seed-based indexing:
func selectAttribute(database []DatabaseRecord, seedChunk string) DatabaseRecord {
// Convert hex chunk to number
// Use modulo to get valid index
// Return corresponding record
index := hexToNumber(seedChunk) % len(database)
return database[index]
}
Seed Processing
The input seed is processed into consistent chunks:
func processSeed(address string) []string {
// Normalize to lowercase hex
// Remove 0x prefix if present
// Pad to minimum length
// Split into 6-character chunks
// Return ordered chunks
}
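The chunking and modulo selection above can be sketched as a standalone program. The padding rules and the hexToNumber conversion here are illustrative assumptions, not the library's exact implementation:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// chunkSeed normalizes an address, strips the 0x prefix, and slices the
// remainder into 6-hex-character windows (trailing partial chunks dropped).
func chunkSeed(address string) []string {
	seed := strings.ToLower(strings.TrimPrefix(address, "0x"))
	var chunks []string
	for i := 0; i+6 <= len(seed); i += 6 {
		chunks = append(chunks, seed[i:i+6])
	}
	return chunks
}

// pickIndex maps one hex chunk onto a database row index via modulo.
func pickIndex(chunk string, dbLen int) int {
	n, err := strconv.ParseUint(chunk, 16, 64)
	if err != nil || dbLen == 0 {
		return 0
	}
	return int(n % uint64(dbLen))
}

func main() {
	chunks := chunkSeed("0x1234567890abcdef1234567890abcdef12345678")
	fmt.Println(chunks[0], pickIndex(chunks[0], 100))
}
```

The same seed always produces the same chunks, and the same chunk always selects the same row, which is the whole basis of the library's determinism.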
Series Filtering
When a series specifies filters, only matching records are eligible:
func applySeriesFilter(records []DatabaseRecord, filter []string) []DatabaseRecord {
if len(filter) == 0 {
return records // No filter = all records
}
var filtered []DatabaseRecord
for _, record := range records {
if contains(filter, record.Key) {
filtered = append(filtered, record)
}
}
return filtered
}
Performance Optimizations
Memory Management
- Lazy Loading: Databases are loaded only when first accessed
- Shared Instances: Multiple contexts share the same cache manager
- Efficient Indexes: O(1) lookup using hash maps
Cache Efficiency
// Binary cache loading is significantly faster than CSV parsing
func BenchmarkCacheLoading(b *testing.B) {
// Binary cache: ~1ms
// CSV parsing: ~50ms
// Speedup: 50x
}
Concurrent Access
The cache manager uses read-write locks for thread safety:
func (cm *CacheManager) GetDatabase(name string) DatabaseIndex {
cm.mu.RLock()
defer cm.mu.RUnlock()
return cm.dbCache.Databases[name]
}
Cache Invalidation
Automatic Rebuilding
The cache is automatically rebuilt when:
- Missing Cache File: No cache exists on disk
- Checksum Mismatch: Cache file is corrupted
- Source Hash Change: Embedded databases have been updated
- Version Incompatibility: Cache format has changed
Manual Cache Management
// Force cache rebuild
func (cm *CacheManager) Rebuild() error {
cm.mu.Lock()
defer cm.mu.Unlock()
return cm.buildCacheLocked()
}
// Clear cache
func (cm *CacheManager) Clear() error {
return os.Remove(cm.cacheFile())
}
Integration with Context
Database Loading in Context
func (ctx *Context) ReloadDatabases(series string) error {
cm := storage.GetCacheManager()
// Ensure cache is loaded
if err := cm.LoadOrBuild(); err != nil {
return err
}
// Apply series filters to each database
for dbName := range ctx.Databases {
filtered := cm.GetFilteredDatabase(dbName, series)
ctx.Databases[dbName] = filtered
}
return nil
}
Filter Application
Series filters are applied when loading databases into a context:
func (cm *CacheManager) GetFilteredDatabase(dbName, series string) []string {
// Load series configuration
seriesConfig := loadSeries(series)
// Get database records
dbIndex := cm.dbCache.Databases[dbName]
// Apply series filter if present
filter := seriesConfig.GetFilter(dbName)
if len(filter) > 0 {
return applyFilter(dbIndex.Records, filter)
}
// Return all records if no filter
return extractKeys(dbIndex.Records)
}
Error Handling
Cache Loading Errors
func handleCacheError(err error) {
switch {
case os.IsNotExist(err):
logger.Info("Cache file not found, will rebuild")
case strings.Contains(err.Error(), "checksum"):
logger.Warn("Cache checksum mismatch, rebuilding")
case strings.Contains(err.Error(), "version"):
logger.Info("Cache version changed, rebuilding")
default:
logger.Error("Unexpected cache error:", err)
}
}
Fallback Strategies
- Memory Fallback: If cache can't be written, keep in memory only
- Rebuild on Error: Automatically rebuild corrupted caches
- Graceful Degradation: Continue with available databases if some fail
Monitoring and Debugging
Cache Statistics
func (cm *CacheManager) Stats() CacheStats {
return CacheStats{
LoadTime: cm.loadTime,
DatabaseCount: len(cm.dbCache.Databases),
TotalRecords: cm.countRecords(),
CacheSize: cm.cacheFileSize(),
LastUpdated: time.Unix(cm.dbCache.Timestamp, 0),
}
}
Debug Information
func (cm *CacheManager) DebugInfo() map[string]interface{} {
return map[string]interface{}{
"loaded": cm.loaded,
"cache_file": cm.cacheFile(),
"version": cm.dbCache.Version,
"checksum": cm.dbCache.Checksum,
"databases": maps.Keys(cm.dbCache.Databases),
}
}
This caching system ensures fast, reliable access to semantic databases while maintaining data integrity and supporting efficient filtering through the series system.
Prompt Generation Pipeline
Overview
Prompt layers provide multiple projections of the same attribute set for different downstream uses: captioning, logging, enhancement, and model instruction.
Layers
| Directory | Purpose |
|---|---|
data/ | Raw attribute dump for auditing. |
title/ | Composite title (emotion + adverb + adjective + occupation + noun). |
terse/ | Short caption placed on annotated image. |
prompt/ | Structured base instruction combining attributes into narrative. |
enhanced/ | Optional ChatGPT-refined version of base prompt. |
Templates
Defined in pkg/prompt/prompt.go as Go text/template instances. Template methods invoked on DalleDress (e.g. {{.Noun true}}) control short/long formatting.
Example Snippet (Base Prompt)
Draw a {{.Adverb false}} {{.Adjective false}} {{.Noun true}} ...
Literary Style
If a literary style attribute is present (not none), extra author persona context (AuthorTemplate) precedes enhancement.
Enhancement
EnhancePrompt(prompt, authorType) calls OpenAI Chat with model gpt-4. Bypass rules:
- Environment variable TB_DALLE_NO_ENHANCE=1
- Missing OPENAI_API_KEY
Output is wrapped with guard text: DO NOT PUT TEXT IN THE IMAGE (added both sides) to discourage textual artifacts inside generated images.
Accessor Semantics
Each accessor on DalleDress returns either a short token or expanded annotated string depending on a boolean flag (short). Some (Orientation, BackStyle) embed other attribute values.
Adding A New Layer
- Create a template constant + compiled variable in prompt.go.
- Add a pointer in Context if persisted similarly.
- Execute in MakeDalleDress and persist like existing layers.
- Optionally expose an accessor or include it in the API docs.
Failure Modes
- Missing attribute keys (extending without accessor) → template execution error.
- Enhancement HTTP failure → returns typed OpenAIAPIError; generation may still proceed with the base prompt.
Next: Image Request & Annotation
Image Request & Annotation
Request Flow
image.RequestImage steps:
- Build prompt.Request with model (currently dall-e-3).
- Infer size from orientation keywords (landscape/horizontal/vertical), else square.
- Write an early placeholder file if OPENAI_API_KEY is missing.
- POST to the OpenAI images endpoint (override via baseURL parameter upstream).
- Parse the response: URL path OR base64 fallback (b64_json).
- Download or decode → write generated/<file>.png.
- Annotate with the terse prompt via annotate.Annotate → write annotated/<file>.png.
- Update progress phases (wait → download → annotate) and DalleDress fields (ImageURL, DownloadMode, paths).
Download Modes
- url: direct HTTP GET
- b64: inline base64 decode (when no URL provided)
Annotation Mechanics
annotate.Annotate:
- Validates the source path contains /generated/
- Reads the image, computes dominant colors (top 3 by frequency) → averaged background
- Chooses contrasting text color (white/black) via lightness threshold
- Draws a separator line + wraps the caption text
- Writes a sibling file, swapping generated → annotated
Error Handling
- Network or decode failure returns error; upstream marks progress failure.
- Annotation failure aborts after raw image write (annotated artifact missing).
- Missing API key: annotated placeholder is empty file (still allows cache semantics).
Customization Points
- Change model: patch the selection logic near the modelName variable.
- Bypass annotation: replace the annotateFunc var in tests or fork the logic.
- Add a watermark: extend Annotate to composite additional graphics.
Next: Progress Tracking
Progress Tracking
Purpose
Real-time insight & metrics for the generation lifecycle with ETA estimates grounded in moving average phase durations.
Phases
Ordered list (progress.OrderedPhases):
setup → base_prompts → enhance_prompt → image_prep → image_wait → image_download → annotate → failed/completed
image_prep currently acts as a transitional placeholder (timing can be extended in future).
Data Structures
- ProgressManager: singleton keyed by series:address
- progressRun: internal mutable state
- ProgressReport: externally returned snapshot
- Exponential moving average per phase (alpha = 0.2)
Percent & ETA Calculation
- Sum average durations of non-terminal phases → total
- Accumulate averages of completed phases + capped elapsed of current phase → done
- percent = done/total * 100; ETA = (total - done)
Cache hits skip average updates and are immediately marked completed.
Archival & Metrics Persistence
Phase averages stored in <DataDir>/metrics/progress_phase_stats.json. Set TB_DALLE_ARCHIVE_RUNS=1 to serialize per-run snapshots under metrics/runs/.
Public Functions
- GetProgress(series, address): returns a snapshot (and prunes the run when completed)
- ActiveProgressReports(): returns all active snapshots
Failure Path
On error: current phase ends, run transitions to failed, summary log emitted, and (if archival enabled) snapshot saved.
Extending
Add a new phase by appending to OrderedPhases, initializing timing in StartRun, and inserting transitions in generation code. Consider metrics implications (initial averages start undefined until one run completes that phase).
Next: Text-to-Speech
Text-to-Speech
Overview
Optional conversion of the enhanced (or base) prompt into an mp3 using OpenAI tts-1 model.
Entry Points
- GenerateSpeech(series, address, lockTTL): ensures the mp3 exists (respects per-address lock)
- Speak(series, address): generate-if-missing, then returns the path
- ReadToMe(series, address): same semantics (ensure mp3, return path)
Conditions
- Skips silently if OPENAI_API_KEY is unset
- Voice defaults to alloy
- 1-minute context timeout; simple retry loop until success or timeout
Storage
<DataDir>/output/<series>/audio/<address>.mp3
Implementation Notes
- Minimal JSON body constructed manually (lighter than defining structs)
- Escapes quotes/newlines with marshalEscaped
- Retries on non-200, logging the attempt, status, and any error
- Uses 0600 file permissions
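A sketch of what a marshalEscaped-style helper might look like (escapeForJSON is a hypothetical stand-in; the library's escaping may cover more cases):

```go
package main

import (
	"fmt"
	"strings"
)

// escapeForJSON escapes the characters that would break a hand-built JSON
// string value: backslashes, quotes, and common control characters.
func escapeForJSON(s string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`, "\r", `\r`, "\t", `\t`)
	return r.Replace(s)
}

func main() {
	// Hand-built request body in the style described above.
	body := fmt.Sprintf(`{"model":"tts-1","voice":"alloy","input":"%s"}`,
		escapeForJSON("line one\nsay \"hi\""))
	fmt.Println(body)
}
```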
Customization
Wrap TextToSpeech to swap provider or implement local TTS; keep same output path for integration with existing tooling.
Next: Public API Reference
Public API Reference
Complete documentation of exported functions and types in the trueblocks-dalle package.
Primary Generation Functions
Image Generation
GenerateAnnotatedImage
func GenerateAnnotatedImage(series, address string, skipImage bool, lockTTL time.Duration) (string, error)
Generates a complete annotated image through the full pipeline. Returns the path to the annotated PNG file.
Parameters:
- series: Series name for filtering attributes and organizing output
- address: Seed string (typically an Ethereum address) for deterministic generation
- skipImage: If true, skips image generation (useful for prompt-only operations)
- lockTTL: Maximum time to hold the generation lock (prevents concurrent runs)
Returns: Path to annotated image file
GenerateAnnotatedImageWithBaseURL
func GenerateAnnotatedImageWithBaseURL(series, address string, skipImage bool, lockTTL time.Duration, baseURL string) (string, error)
Same as GenerateAnnotatedImage but allows overriding the OpenAI API base URL.
Additional Parameter:
baseURL: Custom OpenAI API endpoint (empty string uses default)
Speech Generation
GenerateSpeech
func GenerateSpeech(series, address string, lockTTL time.Duration) (string, error)
Generates text-to-speech audio for the enhanced prompt. Returns path to MP3 file.
Parameters:
- series: Series name for file organization
- address: Seed string (creates a DalleDress to get prompt text)
- lockTTL: Lock timeout duration (0 uses the default of 2 minutes)
Returns: Path to generated MP3 file (empty string if no API key)
Example:
audioPath, err := dalle.GenerateSpeech("demo", "0x1234...", 5*time.Minute)
if err != nil {
log.Fatal(err)
}
if audioPath != "" {
fmt.Printf("Audio saved to: %s", audioPath)
// Output: Audio saved to: output/demo/audio/0x1234....mp3
}
Speak
func Speak(series, address string) (string, error)
Convenience function that generates speech if not already present, then returns the path.
Example:
audioPath, err := dalle.Speak("demo", "0x1234...")
if err != nil {
log.Fatal(err)
}
// Uses default lockTTL, generates if missing
ReadToMe
func ReadToMe(series, address string) (string, error)
Alias for Speak with semantic naming. Same functionality as Speak.
Example:
audioPath, err := dalle.ReadToMe("demo", "0x1234...")
// Identical to dalle.Speak("demo", "0x1234...")
TextToSpeech
func TextToSpeech(text string, voice string, series string, address string) (string, error)
Low-level text-to-speech function for custom text.
Parameters:
- text: Text content to convert to speech
- voice: OpenAI voice name ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
- series: Series for output organization
- address: Address for file naming
Example:
audioPath, err := dalle.TextToSpeech("Hello, this is a test message", "alloy", "demo", "test")
if err != nil {
log.Fatal(err)
}
// Creates: output/demo/audio/test.mp3
Available Voices:
- alloy: Neutral, balanced tone
- echo: Clear, crisp delivery
- fable: Warm, expressive reading
- onyx: Deep, rich voice
- nova: Bright, energetic tone
- shimmer: Soft, gentle delivery
Context Management
Context Access
NewContext
func NewContext() *Context
Creates a new Context with initialized templates, databases, and cache. Loads the "empty" series by default.
func (ctx *Context) MakeDalleDress(address string) (*model.DalleDress, error)
Builds or retrieves a DalleDress from cache for the given address.
func (ctx *Context) GetPrompt(address string) string
func (ctx *Context) GetEnhanced(address string) string
Retrieve base or enhanced prompt text for an address. Returns error message as string if address lookup fails.
func (ctx *Context) GenerateImage(address string) (string, error)
func (ctx *Context) GenerateImageWithBaseURL(address, baseURL string) (string, error)
Generate image for an address (requires existing DalleDress in cache). Returns path to generated image.
func (ctx *Context) GenerateEnhanced(address string) (string, error)
Generates a literarily-enhanced prompt for the given address using OpenAI Chat API.
func (ctx *Context) Save(address string) bool
Generates and saves prompt data for the given address. Returns true on success.
func (ctx *Context) ReloadDatabases(filter string) error
Reload attribute databases with series-specific filters.
Context Manager
ConfigureManager
func ConfigureManager(opts ManagerOptions)
Configure context cache behavior.
type ManagerOptions struct {
MaxContexts int // Maximum cached contexts (default: 20)
ContextTTL time.Duration // Context expiration time (default: 30 minutes)
}
ContextCount
func ContextCount() int
Returns the number of currently cached contexts.
IsValidSeries
func IsValidSeries(series string, list []string) bool
Determines whether a requested series is valid given an optional list. If list is empty, returns true for any series.
Parameters:
- series: Series name to validate
- list: Optional list of valid series names
Returns: True if series is valid, false otherwise
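The semantics can be restated as a standalone sketch (isValidSeries here is a re-implementation for illustration, not the exported function itself):

```go
package main

import "fmt"

// isValidSeries: an empty list accepts any series name; otherwise the
// name must appear in the list.
func isValidSeries(series string, list []string) bool {
	if len(list) == 0 {
		return true
	}
	for _, s := range list {
		if s == series {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isValidSeries("demo", nil), isValidSeries("demo", []string{"test"}))
}
```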
Series Management
ListSeries
func ListSeries() []string
Returns list of all available series names.
Example:
series := dalle.ListSeries()
fmt.Printf("Available series: %v", series)
// Output: Available series: [demo test custom]
Series CRUD Operations
func LoadSeriesModels(dir string) ([]Series, error)
func LoadActiveSeriesModels(dir string) ([]Series, error)
func LoadDeletedSeriesModels(dir string) ([]Series, error)
Load series configurations from directory.
func DeleteSeries(dir, suffix string) error
func UndeleteSeries(dir, suffix string) error
func RemoveSeries(dir, suffix string) error
Manage series lifecycle (mark deleted, restore, or permanently remove).
Progress Tracking
GetProgress
func GetProgress(series, addr string) *ProgressReport
Get current progress snapshot for a generation run (nil if not active).
ActiveProgressReports
func ActiveProgressReports() []*ProgressReport
Get all currently active progress reports. Returns snapshots for all non-completed runs.
Progress Testing Helpers
func ForceMetricsSave()
func ResetMetricsForTest()
Testing utilities for forcing metrics persistence and clearing metrics state.
Utility Functions
Clean
func Clean(series, address string)
Remove all generated artifacts for a specific series/address combination.
Example:
// Clean up all files for a specific address
dalle.Clean("demo", "0x1234567890abcdef1234567890abcdef12345678")
// This removes:
// - output/demo/annotated/0x1234...png
// - output/demo/generated/0x1234...png
// - output/demo/selector/0x1234...json
// - output/demo/audio/0x1234...mp3
// - All prompt text files (data, title, terse, etc.)
Test Helpers
func ResetContextManagerForTest()
Reset context manager state (testing only).
Core Data Types
DalleDress
type DalleDress struct {
Original string `json:"original"`
OriginalName string `json:"originalName"`
FileName string `json:"fileName"`
FileSize int64 `json:"fileSize"`
ModifiedAt int64 `json:"modifiedAt"`
Seed string `json:"seed"`
Prompt string `json:"prompt"`
DataPrompt string `json:"dataPrompt"`
TitlePrompt string `json:"titlePrompt"`
TersePrompt string `json:"tersePrompt"`
EnhancedPrompt string `json:"enhancedPrompt"`
Attribs []prompt.Attribute `json:"attributes"`
AttribMap map[string]prompt.Attribute `json:"-"`
SeedChunks []string `json:"seedChunks"`
SelectedTokens []string `json:"selectedTokens"`
SelectedRecords []string `json:"selectedRecords"`
ImageURL string `json:"imageUrl"`
GeneratedPath string `json:"generatedPath"`
AnnotatedPath string `json:"annotatedPath"`
DownloadMode string `json:"downloadMode"`
IPFSHash string `json:"ipfsHash"`
CacheHit bool `json:"cacheHit"`
Completed bool `json:"completed"`
Series string `json:"series"`
}
Series
type Series struct {
Last int `json:"last,omitempty"`
Suffix string `json:"suffix"`
Purpose string `json:"purpose,omitempty"`
Deleted bool `json:"deleted,omitempty"`
Adverbs []string `json:"adverbs"`
Adjectives []string `json:"adjectives"`
Nouns []string `json:"nouns"`
Emotions []string `json:"emotions"`
Occupations []string `json:"occupations"`
Actions []string `json:"actions"`
Artstyles []string `json:"artstyles"`
Litstyles []string `json:"litstyles"`
Colors []string `json:"colors"`
Viewpoints []string `json:"viewpoints"`
Gazes []string `json:"gazes"`
Backstyles []string `json:"backstyles"`
Compositions []string `json:"compositions"`
ModifiedAt string `json:"modifiedAt,omitempty"`
}
Attribute
type Attribute struct {
Database string `json:"database"`
Name string `json:"name"`
Bytes string `json:"bytes"`
Number int `json:"number"`
Factor float64 `json:"factor"`
Selector string `json:"selector"`
Value string `json:"value"`
}
ProgressReport
type ProgressReport struct {
Series string `json:"series"`
Address string `json:"address"`
Current Phase `json:"currentPhase"`
StartedNs int64 `json:"startedNs"`
Percent float64 `json:"percent"`
ETASeconds float64 `json:"etaSeconds"`
Done bool `json:"done"`
Error string `json:"error"`
CacheHit bool `json:"cacheHit"`
Phases []*PhaseTiming `json:"phases"`
DalleDress *model.DalleDress `json:"dalleDress"`
PhaseAverages map[Phase]time.Duration `json:"phaseAverages"`
}
Phases
type Phase string
const (
PhaseSetup Phase = "setup"
PhaseBasePrompts Phase = "base_prompts"
PhaseEnhance Phase = "enhance_prompt"
PhaseImagePrep Phase = "image_prep"
PhaseImageWait Phase = "image_wait"
PhaseImageDownload Phase = "image_download"
PhaseAnnotate Phase = "annotate"
PhaseFailed Phase = "failed"
PhaseCompleted Phase = "completed"
)
Environment Variables
| Variable | Description | Default |
|---|---|---|
OPENAI_API_KEY | OpenAI API key for image generation, enhancement, and TTS | Required |
TB_DALLE_DATA_DIR | Custom data directory path | Platform default |
TB_DALLE_NO_ENHANCE | Set to "1" to disable prompt enhancement | Enhancement enabled |
TB_DALLE_ARCHIVE_RUNS | Set to "1" to save progress snapshots to JSON files | Disabled |
TB_CMD_LINE | Set to "true" to auto-open images on macOS | Disabled |
Examples:
export OPENAI_API_KEY="sk-..."
export TB_DALLE_DATA_DIR="/custom/dalle/data"
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_ARCHIVE_RUNS=1
Error Types
Primary Error Types
- prompt.OpenAIAPIError – structured error from OpenAI API calls
  - Fields: Message (string), StatusCode (int), RequestID (string), Code (string)
  - Method: IsRetryable() bool – determines whether the error should be retried
Common Error Patterns
import "github.com/TrueBlocks/trueblocks-dalle/v6/pkg/prompt"
// Check for OpenAI API errors
if apiErr, ok := err.(*prompt.OpenAIAPIError); ok {
switch apiErr.StatusCode {
case 401:
// Invalid API key
case 429:
// Rate limited, retry with backoff
case 500, 502, 503:
// Server errors, safe to retry
}
}
// Missing API key
if strings.Contains(err.Error(), "API key") {
// Handle missing OPENAI_API_KEY
}
// Invalid inputs
if strings.Contains(err.Error(), "address required") {
// Handle empty address
}
if strings.Contains(err.Error(), "series not found") {
// Handle invalid series name
}
Error Types & Troubleshooting
prompt.OpenAIAPIError – structured error carrying Message, StatusCode, and Code fields.
Common Issues & Solutions
"address required" Error
Cause: Empty or nil address parameter passed to generation functions.
Solution: Ensure address string is not empty before calling GenerateAnnotatedImage or related functions.
Silent Generation Failures
Cause: Missing OPENAI_API_KEY environment variable.
Solution: Set the environment variable: export OPENAI_API_KEY="sk-..."
Note: Functions return empty paths or skip silently when no API key is present.
"seed length is less than 66" Error
Cause: Address string too short for seed generation.
Solution: Ensure address is a valid Ethereum address (42 characters with 0x prefix) or longer seed string.
Image Generation Timeouts
Cause: OpenAI API delays or network issues.
Solution: Increase lockTTL parameter in generation functions or check network connectivity.
Context Cache Issues
Cause: Memory pressure or too many cached contexts.
Solution: Use ConfigureManager to adjust MaxContexts and ContextTTL settings.
Progress Reports Return Nil
Cause: Generation has not started, or the completed/failed run was pruned after its first report.
Solution: Check return value and handle nil case. Use ActiveProgressReports() for ongoing monitoring.
Patterns
Typical flow:
path, err := dalle.GenerateAnnotatedImage(series, address, false, 0)
if err != nil { /* handle */ }
if audio, _ := dalle.GenerateSpeech(series, address, 0); audio != "" { /* use mp3 */ }
Next: Advanced Usage & Extensibility
Advanced Usage & Extensibility
Custom Image Backend
Replace OpenAI generation by wrapping GenerateAnnotatedImage:
- Call GenerateAnnotatedImage(series, address, true, 0) (skipImage=true) to produce prompts without network access.
- Read the enhanced or base prompt from disk.
- Use the external backend → write a PNG to generated/<filename>.png.
- Call annotate.Annotate to create the annotated version.
Offline Mode
Without OPENAI_API_KEY you still obtain deterministic prompt artifacts; image and enhancement phases become no-ops producing placeholders.
Adding An Attribute
- Add a database file; extend prompt.DatabaseNames / attributeNames.
- Add an accessor on DalleDress.
- Update templates to reference the new method.
- Regenerate documentation.
Concurrency Patterns
Generate multiple distinct (series,address) pairs concurrently; per-pair lock prevents duplication. For global throttling, introduce an external worker pool.
Progress Integration
Poll GetProgress asynchronously. Each ProgressReport returns a pointer to live DalleDress (treat read-only).
Metrics & Observability
Phase averages are basic; integrate with OpenTelemetry by wrapping generation calls and annotating spans with phase transitions.
Error Injection Testing
- Invalid API key → expect OpenAIAPIError.
- Simulate a network timeout with a firewall/latency tool; observe the image.post.timeout log.
Determinism Considerations
Disable enhancement (TB_DALLE_NO_ENHANCE=1) for reproducible regression tests.
Swapping Templates at Runtime
Use DalleDress.FromTemplate(customTemplateString) to experiment. Persist results to a custom directory to avoid interfering with canonical artifacts.
Security Notes
- Path cleaning already enforced; retain checks if adding new file outputs.
- Keep API key only in environment; avoid embedding in logs (current debug curl masks only by relying on environment—sanitize if extending).
Next: Testing & Contributing
Testing & Contributing
This chapter covers the testing architecture, contribution guidelines, and development workflows for the trueblocks-dalle library.
Test Architecture
The library includes comprehensive test coverage across multiple layers:
Unit Tests
Core Components:
- Context creation and management (context_test.go)
- Series operations and filtering (series_test.go)
- DalleDress creation and templating (pkg/model/dalledress_test.go)
- Progress tracking and metrics (pkg/progress/progress_test.go)
- Storage operations (pkg/storage/*_test.go)
- Image annotation (pkg/annotate/annotate_test.go)
Key Test Areas:
- Attribute derivation determinism
- Template execution correctness
- Series filtering logic
- Progress phase transitions
- Cache management and validation
- Error handling and recovery
Integration Tests
API Integration:
- Text-to-speech functionality (text2speech_test.go)
- Image generation pipeline (requires API key)
- End-to-end generation workflow
Storage Integration:
- File system operations
- Directory structure creation
- Artifact persistence and retrieval
Running Tests
Basic Test Execution
# Run all tests
go test ./...
# Run with verbose output
go test -v ./...
# Run specific package tests
go test ./pkg/progress
go test ./pkg/model
Test Categories
Offline Tests (no API required):
# Ensure no API key is set for deterministic results
unset OPENAI_API_KEY
go test ./pkg/... -short
Integration Tests (API required):
export OPENAI_API_KEY="sk-..."
go test ./... -run Integration
Benchmarks:
go test -bench=. ./pkg/storage
go test -bench=BenchmarkAttribute ./pkg/prompt
Test Configuration
Environment Variables:
# Disable enhancement for deterministic tests
export TB_DALLE_NO_ENHANCE=1
# Use test data directory
export TB_DALLE_DATA_DIR="/tmp/dalle-test"
# Enable debug logging for tests
export TB_DALLE_LOG_LEVEL=debug
Test Utilities and Helpers
Context Management
// Reset context cache between tests
func TestExample(t *testing.T) {
defer dalle.ResetContextManagerForTest()
// Test logic here
}
Progress Testing
// Reset progress metrics for clean test state
func TestProgressFlow(t *testing.T) {
defer progress.ResetMetricsForTest()
// Progress testing logic
}
Mock Infrastructure
Time Mocking (progress tests):
type mockClock struct {
current time.Time
}
func (m *mockClock) Now() time.Time {
return m.current
}
func (m *mockClock) Advance(d time.Duration) {
m.current = m.current.Add(d)
}
Test Data Generation:
func generateTestSeries(t *testing.T, suffix string) Series {
return Series{
Suffix: suffix,
Purpose: "test series",
Adjectives: []string{"test", "mock", "example"},
Nouns: []string{"warrior", "scholar"},
// ... additional test attributes
}
}
Testing Best Practices
Deterministic Testing
Seed-Based Tests:
func TestAttributeSelection(t *testing.T) {
tests := []struct {
seed string
expected map[string]string
}{
{
seed: "0x1234567890abcdef",
expected: map[string]string{
"adjective": "expected_adjective",
"noun": "expected_noun",
},
},
}
for _, tt := range tests {
// Test deterministic attribute selection
}
}
Template Testing:
func TestPromptGeneration(t *testing.T) {
dd := &model.DalleDress{
// Initialize with known attributes
}
result, err := dd.ExecuteTemplate(template, filter)
assert.NoError(t, err)
assert.Contains(t, result, "expected content")
}
Error Handling Tests
func TestErrorScenarios(t *testing.T) {
tests := []struct {
name string
setupFunc func()
expectedError string
}{
{
name: "missing API key",
setupFunc: func() { os.Unsetenv("OPENAI_API_KEY") },
expectedError: "API key required",
},
{
name: "invalid series",
setupFunc: func() { /* setup invalid series */ },
expectedError: "series not found",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tt.setupFunc()
_, err := dalle.GenerateAnnotatedImage("test", "0x123", false, 0)
assert.Error(t, err)
assert.Contains(t, err.Error(), tt.expectedError)
})
}
}
Performance Testing
func BenchmarkDatabaseLookup(b *testing.B) {
cm := storage.GetCacheManager()
cm.LoadOrBuild()
b.ResetTimer()
for i := 0; i < b.N; i++ {
db := cm.GetDatabase("adjectives")
_ = db.Records[i%len(db.Records)]
}
}
Contributing Guidelines
Development Workflow
- Issue Creation: Open an issue describing the change and rationale
- Fork & Branch: Create a feature branch (feat/<topic>) or bugfix branch (fix/<topic>)
- Implementation: Write code with comprehensive tests
- Testing: Ensure all tests pass locally
- Documentation: Update relevant documentation
- Pull Request: Submit PR with clear description and code references
Branch Naming
feat/new-attribute-database # New feature
fix/progress-tracking-bug # Bug fix
docs/api-reference-update # Documentation
refactor/storage-optimization # Code improvement
Commit Guidelines
Format: type(scope): description
feat(prompt): add support for custom templates
fix(storage): handle permission errors gracefully
docs(api): update function signatures in reference
test(progress): add comprehensive phase testing
Code Quality Standards
Formatting:
go fmt ./...
go vet ./...
golint ./...
Testing Requirements:
- All new code must include tests
- Maintain or improve coverage percentage
- Include both positive and negative test cases
- Test error conditions and edge cases
Documentation:
- Update book chapters for user-facing changes
- Add inline documentation for exported functions
- Include code examples for new features
Code Style Guidelines
General Principles:
- Prefer early returns over deep nesting
- Keep exported API surface minimal
- Use structured logging with key/value pairs
- Follow Go idioms and conventions
Error Handling:
// Good: Specific error types
if err != nil {
return fmt.Errorf("failed to load series %s: %w", series, err)
}
// Good: Early return
if condition {
return result, nil
}
// Continue with main logic
Logging:
// Good: Structured logging
logger.Info("generation.start", "series", series, "address", address)
// Good: Error context
logger.Error("database.load.failed", "database", dbName, "error", err)
Adding New Features
New Attribute Database
- Add CSV file to pkg/storage/databases/
- Update database extraction in cache management
- Add attribute method to DalleDress
- Update templates to use the new attribute
- Add series filtering support
- Write comprehensive tests
- Update documentation
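Step 3 (an attribute method on DalleDress) typically follows a simple accessor pattern. The sketch below is illustrative only — the real DalleDress fields differ, and the Attribs map and "gemstones" database name are invented for this example:

```go
package main

import "fmt"

// DalleDress here is a stand-in for the real struct; each attribute
// method exposes one named slot so templates can call {{.Gemstones}}.
type DalleDress struct {
	Attribs map[string]string
}

// Gemstones returns the derived attribute, or "" when the database
// contributed nothing (templates tolerate empty attributes).
func (d *DalleDress) Gemstones() string {
	if v, ok := d.Attribs["gemstones"]; ok {
		return v
	}
	return ""
}

func main() {
	d := &DalleDress{Attribs: map[string]string{"gemstones": "opal"}}
	fmt.Println(d.Gemstones())
}
```

Keeping one method per attribute keeps the template surface explicit and greppable, at the cost of a little boilerplate per new database.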
New Template Type
- Define template string in pkg/prompt/prompt.go
- Compile template in context initialization
- Add execution method to DalleDress
- Add output directory handling
- Update artifact pipeline
- Test template rendering
New Progress Phase
- Add phase constant to pkg/progress/progress.go
- Update OrderedPhases slice
- Add transition points in generation pipeline
- Update progress calculations
- Test phase ordering and timing
Performance Considerations
Optimization Guidelines
Context Caching:
- Avoid forcing context reloads in hot paths
- Use appropriate TTL settings for your use case
- Monitor context cache hit rates
Database Operations:
- Binary cache provides 50x speedup over CSV parsing
- Validate cache integrity on startup
- Rebuild cache automatically on corruption
Image Processing:
- Batch operations when possible
- Use appropriate image sizes for use case
- Consider caching annotated images
Memory Management:
- Context cache uses LRU eviction
- DalleDress cache has per-context limits
- Monitor memory usage in long-running applications
Benchmarking
# Database operations
go test -bench=BenchmarkDatabase ./pkg/storage
# Template execution
go test -bench=BenchmarkTemplate ./pkg/model
# Progress tracking
go test -bench=BenchmarkProgress ./pkg/progress
Debugging and Troubleshooting
Debug Configuration
# Enable debug logging
export TB_DALLE_LOG_LEVEL=debug
# Disable caching for testing
export TB_DALLE_NO_CACHE=1
# Use test mode (mocks API calls)
export TB_DALLE_TEST_MODE=1
Common Issues
Context Loading Failures:
- Check series file permissions
- Verify JSON format validity
- Ensure data directory accessibility
Cache Problems:
- Clear cache directory: rm -rf $TB_DALLE_DATA_DIR/cache
- Check disk space and permissions
- Verify embedded database integrity
API Integration Issues:
- Validate API key format and permissions
- Check network connectivity
- Monitor rate limits and quotas
This comprehensive testing and contribution framework ensures the library maintains high quality while remaining accessible to new contributors.
FAQ
A living collection of concrete, code-grounded answers. If something here contradicts code, the code wins and this page should be fixed.
Does generation work without an OpenAI API key?
No. Core image generation, and the optional enhancement and text‑to‑speech stages, all rely on OpenAI endpoints. If OPENAI_API_KEY is unset the pipeline short‑circuits early with an error. You can still exercise the deterministic local pieces (series filtering, attribute selection, template expansion) in tests that stub the network layer, but no image or audio artifacts will be produced.
What does “deterministic” actually guarantee?
Given the same (seed, series database ordering, enhancement disabled/enabled, template set, orientation flag) you will get identical prompts and therefore (modulo OpenAI’s model randomness) identical requests. Image pixels are not guaranteed because the upstream model is non‑deterministic. Everything up to the HTTP request body is deterministic. Enhancement injects model creativity, so disable it (TB_DALLE_NO_ENHANCE=1) for stricter reproducibility.
How are attributes chosen from the seed?
pkg/prompt/attribute.go slices the seed (converted to string, hashed, or both depending on call) across ordered slices of candidate values. Some categories (e.g. colors, art styles) may intentionally duplicate entries to weight their selection frequency. The order of the underlying database lists (loaded from storage/databases) is therefore part of the deterministic surface.
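The mechanics can be illustrated with a minimal, self-contained sketch. The real slicing logic in pkg/prompt/attribute.go differs in detail; the function name and the exact windowing here are illustrative assumptions:

```go
package main

import (
	"fmt"
	"strconv"
)

// pickAttribute maps a 6-hex-char window of the seed onto an index
// into an ordered candidate list. Same seed + same list order =>
// same selection, which is why database ordering is part of the
// deterministic surface (and why duplicated entries weight selection).
func pickAttribute(seed string, window int, candidates []string) string {
	start := window * 6
	if start+6 > len(seed) || len(candidates) == 0 {
		return ""
	}
	n, err := strconv.ParseUint(seed[start:start+6], 16, 64)
	if err != nil {
		return ""
	}
	return candidates[n%uint64(len(candidates))]
}

func main() {
	seed := "f503017d7baf7fbc0fff7492b751025c6a78179b"
	adjectives := []string{"lustrous", "somber", "radiant", "weathered"}
	fmt.Println(pickAttribute(seed, 0, adjectives)) // window 0: seed chars 0..5
	fmt.Println(pickAttribute(seed, 1, adjectives)) // window 1: seed chars 6..11
}
```

Because the index is a pure function of the seed bytes and the list length, editing, reordering, or duplicating database rows changes which attribute a given seed selects.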
My attribute changes are ignored after editing a database file—why?
The context keeps an in‑memory copy until you call ReloadDatabases() (directly or via a new context creation). If you modify on disk JSON/CSV (depending on implementation) you must reload or create a fresh context (TTL eviction in the manager can do this automatically if you wait past expiry).
When are contexts evicted?
manager.go maintains an LRU with a time‑based staleness check. Access (read or generation) refreshes the entry. Once capacity is exceeded or TTL exceeded the least recently used entries are dropped. Any subsequent request causes reconstruction from disk.
Why can prompt enhancement be slow?
Enhancement is a separate chat completion call that: (1) builds a system + user message set, (2) waits on OpenAI latency, (3) returns a refined string. This adds an entire round trip. For batch runs disable with TB_DALLE_NO_ENHANCE=1 to save time and remove external variability.
Do I lose anything by disabling enhancement?
Only the model‑augmented rewrite layer. Base prompt quality still leverages structured templates and seed‑derived attributes. Disable it for: tests, reproducible benchmarks, or rate limit conservation.
Is orientation always respected?
Orientation is a hint. The code applies simple logic (e.g., is a wider-than-tall image desired?) and chooses a matching size or aspect_ratio parameter (depending on the OpenAI API version). OpenAI may still perform internal cropping; final pixels can differ subtly.
Where are intermediate prompts stored?
Under the resolved data directory (see pkg/storage/datadir.go) in subfolders: prompt/ (base), enhanced/ (post‑enhancement), terse/ (shortened titles), title/ (final titles), plus generated/ (raw images) and annotated/ (banner overlaid). The manager orchestrates creation; files are named with the request GUID.
How do I clean up disk usage?
If TB_DALLE_ARCHIVE_RUNS is unset, old run directories may accumulate. A periodic external cleanup (e.g. cron) deleting oldest GUID folders beyond retention is safe—artifacts are immutable after creation. Just avoid deleting partial runs that are still in progress (look for absence of a completed progress marker file).
What if enhancement fails but image generation would succeed?
Failure in enhancement logs an error and (depending on implementation details) either falls back to the base prompt or aborts. Current code prefers fail‑fast so you notice silent prompt mutation issues—consult EnhancePrompt call sites. You may wrap it to downgrade to a warning if desired.
How are progress percentages computed?
pkg/progress/progress.go tracks durations per phase using an exponential moving average (EMA). Each newly observed duration updates the phase's EMA, and the sum of all phase EMAs forms the denominator. Completed phases contribute their full EMA to the numerator; the active phase contributes its elapsed time. The resulting ratio is a coarse percentage that self-calibrates over multiple runs.
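A stripped-down sketch of this bookkeeping follows. The real fields and constants in pkg/progress/progress.go differ; the smoothing factor and function names here are assumptions for illustration:

```go
package main

import "fmt"

const alpha = 0.3 // smoothing factor (illustrative; the library's constant may differ)

// updateEMA folds a newly observed phase duration (in seconds) into
// the running exponential moving average.
func updateEMA(ema, observed float64) float64 {
	if ema == 0 {
		return observed // first observation seeds the average
	}
	return alpha*observed + (1-alpha)*ema
}

// percent estimates completion: completed phases contribute their full
// EMA; the active phase contributes its elapsed time, capped at its
// EMA so the estimate never exceeds 100%.
func percent(completedEMAs []float64, activeEMA, activeElapsed float64) float64 {
	total, done := activeEMA, 0.0
	for _, e := range completedEMAs {
		total += e
		done += e
	}
	if activeElapsed > activeEMA {
		activeElapsed = activeEMA
	}
	if total == 0 {
		return 0
	}
	return 100 * (done + activeElapsed) / total
}

func main() {
	ema := updateEMA(0, 4.0)  // first run observes 4s
	ema = updateEMA(ema, 6.0) // second run: 0.3*6 + 0.7*4 = 4.6
	fmt.Printf("EMA: %.1fs\n", ema)
	fmt.Printf("%.0f%%\n", percent([]float64{2.0, 3.0}, ema, 2.3))
}
```

Note how the clamp explains the "jump backwards" effect described in the next answer: a slow active phase saturates at its EMA, then the denominator grows when the EMA updates at phase end.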
Why does my percent jump backwards occasionally?
If a phase historically was short (small EMA) but a current run is slower, the instantaneous estimate can overshoot then normalize when the phase ends and the EMA updates. User‑facing UIs should treat percentage as approximate and can smooth with an additional client‑side moving average.
Can I add a new output artifact type?
Yes. Pick a directory name, produce the file (e.g. sketch/ for line art derivation), and add it to any archival or listing logic referencing known subfolders. Avoid breaking existing readers by leaving current names intact.
How do I ensure two parallel runs do not collide in filenames?
Each run uses a GUID namespace. Collisions would require a UUIDv4 duplication (practically impossible). Within a run deterministic names are fine because there is only one writer thread.
What concurrency guarantees does the manager provide?
Single flight per logical request (GUID or composite key) guarded by a lock map. Attribute selection and prompt construction occur inside the critical section; disk writes are sequential per run. Cross‑run concurrency is allowed as long as resources (API quota, disk IO) suffice.
How do I simulate OpenAI for tests without hitting the network?
Abstract the HTTP client (interfaces or small indirection). Provide a fake that returns canned JSON resembling DalleResponse1. Tests in this repository already stub some paths; extend them by injecting your fake into context or image generation code.
What if an image download partially fails (one of several variants)?
The current code treats image fetch as critical; a single failure can mark the run failed. To implement partial resilience, change the loop to skip failed variants, record an error in progress metadata, and continue annotation for successful images.
How do I change banner styling in annotations?
Modify pkg/annotate/annotate.go (e.g. font face, padding). Colors come from analyzing pixel luminance to choose contrasting text color. You can also precompute a palette and bypass analysis for speed.
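The contrast decision reduces to a standard relative-luminance computation. A minimal sketch — the real code in pkg/annotate/annotate.go analyzes actual pixels and may use different coefficients or thresholds:

```go
package main

import "fmt"

// textColorFor picks black or white text for a given average background
// color, using the ITU-R BT.709 luma coefficients.
func textColorFor(r, g, b uint8) string {
	luma := 0.2126*float64(r) + 0.7152*float64(g) + 0.0722*float64(b)
	if luma > 128 { // bright background: dark text reads better
		return "black"
	}
	return "white"
}

func main() {
	fmt.Println(textColorFor(250, 240, 230)) // near-white banner
	fmt.Println(textColorFor(20, 30, 40))    // near-black banner
}
```

Precomputing one average per banner region (rather than per pixel) is what makes the palette-precomputation shortcut mentioned above viable.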
Why is there no local model fallback?
Scope control: The package focuses on orchestration patterns, not reproduction of diffusion models. Adding a local backend would require pluggable interface abstraction (which you can add—see Advanced chapter). The code stays lean by delegating generative complexity.
Can I stream progress events instead of polling files?
Yes. Add a channel broadcast or WebSocket layer in progress when UpdatePhase is called. The current system writes snapshots to disk; streaming is an additive feature.
What licensing constraints exist on generated artifacts?
This code’s LICENSE governs the orchestration logic. Image/audio outputs follow OpenAI’s usage policies, which may evolve. Always review upstream terms; this repository does not override them.
How do I report or inspect OpenAI errors?
pkg/prompt/openai_api_error.go defines OpenAIAPIError. When the API returns structured error JSON, it populates this type. Log it or surface to clients verbatim to aid debugging (strip sensitive request IDs if needed).
Why might enhancement produce an empty string?
Strong safety filters or a mis-structured prompt can cause the model to return minimal content. Guard by validating length and falling back to the base prompt if below a threshold.
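A guard like the one suggested above could look like this; the function name and threshold are illustrative, not existing library API:

```go
package main

import (
	"fmt"
	"strings"
)

// chooseEnhanced falls back to the base prompt when the model's rewrite
// is empty or suspiciously short (a symptom of safety-filter truncation).
func chooseEnhanced(base, enhanced string, minLen int) string {
	if len(strings.TrimSpace(enhanced)) < minLen {
		return base
	}
	return enhanced
}

func main() {
	base := "A weathered bronze automaton in a rain-soaked plaza."
	fmt.Println(chooseEnhanced(base, "   ", 20)) // falls back to base
	fmt.Println(chooseEnhanced(base, "A luminous clockwork figure strides through neon rain.", 20))
}
```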
What is the fastest path to “just get an annotated image” in code?
Call GenerateAnnotatedImage(ctx, seed, seriesFilter, orientation) on a manager instance (after creating or reusing a context) and then read the annotated/ image file named with the returned GUID.
Next: References → 14-references.md
Environment Variables & Configuration
The trueblocks-dalle library supports various configuration options through environment variables, allowing customization of behavior without code changes.
Core Configuration
Required Variables
OPENAI_API_KEY
Required for image generation, prompt enhancement, and text-to-speech
export OPENAI_API_KEY="sk-proj-..."
The OpenAI API key is used for:
- Image Generation: DALL·E 3 API calls
- Prompt Enhancement: GPT-4 chat completions (optional)
- Text-to-Speech: TTS-1 model audio generation (optional)
Behavior when missing:
- Image generation fails with an error
- Prompt enhancement is silently skipped
- Text-to-speech returns empty string (no error)
Data Directory
TB_DALLE_DATA_DIR
Optional: Custom data directory location
export TB_DALLE_DATA_DIR="/custom/path/to/dalle-data"
Default locations:
- macOS: ~/Library/Application Support/TrueBlocks
- Linux: ~/.local/share/TrueBlocks
- Windows: %APPDATA%/TrueBlocks
The data directory contains all generated artifacts, caches, and series configurations.
Generation Control
Prompt Enhancement
TB_DALLE_NO_ENHANCE
Optional: Disable OpenAI prompt enhancement
export TB_DALLE_NO_ENHANCE=1
When set to any non-empty value:
- Skips OpenAI Chat API calls for prompt enhancement
- Uses only the base structured prompt
- Ensures completely deterministic generation
- Reduces API costs and latency
Use cases:
- Testing and development
- Reproducible builds
- Rate limiting concerns
- Cost optimization
Image Parameters
TB_DALLE_ORIENTATION
Optional: Force specific image orientation
export TB_DALLE_ORIENTATION="portrait" # or "landscape" or "square"
Default behavior: Auto-detection based on prompt content
- Long prompts → landscape (1792×1024)
- Medium prompts → square (1024×1024)
- Short prompts → portrait (1024×1792)
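The mapping from an orientation hint to a DALL·E 3 size string can be sketched as follows (the prompt-length auto-detection above is the library's heuristic; this helper only covers the explicit-hint case):

```go
package main

import "fmt"

// sizeFor maps an orientation hint to a DALL·E 3 size string,
// defaulting to square for anything unrecognized.
func sizeFor(orientation string) string {
	switch orientation {
	case "portrait":
		return "1024x1792"
	case "landscape":
		return "1792x1024"
	default:
		return "1024x1024"
	}
}

func main() {
	fmt.Println(sizeFor("portrait"))
	fmt.Println(sizeFor("unknown"))
}
```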
TB_DALLE_SIZE
Optional: Override image size
export TB_DALLE_SIZE="1024x1024"
Supported DALL·E 3 sizes:
- 1024x1024 (square)
- 1024x1792 (portrait)
- 1792x1024 (landscape)
Image Quality
TB_DALLE_QUALITY
Optional: Set image quality level
export TB_DALLE_QUALITY="hd" # or "standard"
- standard: Faster generation, lower cost
- hd: Higher quality, more detail, higher cost
Default: standard
Debugging and Development
Logging Control
TB_DALLE_LOG_LEVEL
Optional: Control logging verbosity
export TB_DALLE_LOG_LEVEL="debug" # or "info", "warn", "error"
Default: info
TB_DALLE_LOG_FORMAT
Optional: Log output format
export TB_DALLE_LOG_FORMAT="json" # or "text"
Default: text
Cache Control
TB_DALLE_NO_CACHE
Optional: Disable database caching
export TB_DALLE_NO_CACHE=1
Forces rebuilding of database caches on every run. Useful for:
- Development testing
- Cache corruption troubleshooting
- Ensuring fresh data
TB_DALLE_CACHE_TTL
Optional: Context cache TTL override
export TB_DALLE_CACHE_TTL="1h" # or "30m", "2h30m", etc.
Default: 30 minutes
Controls how long series contexts remain cached in memory.
API Endpoint Configuration
OpenAI Base URL
OPENAI_BASE_URL
Optional: Custom OpenAI endpoint
export OPENAI_BASE_URL="https://api.openai.com/v1" # default
export OPENAI_BASE_URL="https://custom-proxy.com/v1" # custom
Useful for:
- Corporate proxies
- API gateways
- Rate limiting proxies
- Testing with mock servers
Request Timeouts
TB_DALLE_TIMEOUT
Optional: API request timeout
export TB_DALLE_TIMEOUT="300s" # 5 minutes
Default: Varies by operation
- Image generation: 5 minutes
- Prompt enhancement: 30 seconds
- Text-to-speech: 1 minute
Text-to-Speech Configuration
Voice Selection
TB_DALLE_TTS_VOICE
Optional: Default TTS voice
export TB_DALLE_TTS_VOICE="alloy" # default
Available voices: alloy, echo, fable, onyx, nova, shimmer
TB_DALLE_TTS_MODEL
Optional: TTS model selection
export TB_DALLE_TTS_MODEL="tts-1" # default, faster
export TB_DALLE_TTS_MODEL="tts-1-hd" # higher quality
Progress and Metrics
Progress Archival
TB_DALLE_ARCHIVE_PROGRESS
Optional: Enable progress run archival
export TB_DALLE_ARCHIVE_PROGRESS=1
When enabled:
- Saves detailed timing data for completed runs
- Builds historical performance metrics
- Enables trend analysis
- Increases disk usage
TB_DALLE_METRICS_RETENTION
Optional: Metrics retention period
export TB_DALLE_METRICS_RETENTION="30d" # or "7d", "90d", etc.
Default: 7 days
Security Configuration
Path Validation
TB_DALLE_STRICT_PATHS
Optional: Enable strict path validation
export TB_DALLE_STRICT_PATHS=1
Enables additional security checks on file paths to prevent directory traversal attacks.
API Key Rotation
TB_DALLE_KEY_ROTATION
Optional: Enable API key rotation
export TB_DALLE_KEY_ROTATION=1
export OPENAI_API_KEY_BACKUP="sk-backup-key..."
Automatically falls back to backup key if primary key fails.
Development Configuration
Test Mode
TB_DALLE_TEST_MODE
Development: Enable test mode
export TB_DALLE_TEST_MODE=1
When enabled:
- Uses mock responses instead of real API calls
- Faster execution for testing
- Deterministic outputs
- No API costs
TB_DALLE_MOCK_DELAY
Development: Simulate API latency
export TB_DALLE_MOCK_DELAY="2s"
Adds artificial delay to mock responses for testing timeout handling.
Configuration Validation
Runtime Checks
The library validates configuration at startup:
func validateConfig() error {
	// Check required variables
	if os.Getenv("OPENAI_API_KEY") == "" {
		return errors.New("OPENAI_API_KEY required")
	}
	// Validate enum values (slices.Contains requires Go 1.21+)
	if orientation := os.Getenv("TB_DALLE_ORIENTATION"); orientation != "" {
		validOrientations := []string{"portrait", "landscape", "square"}
		if !slices.Contains(validOrientations, orientation) {
			return fmt.Errorf("invalid orientation: %s", orientation)
		}
	}
	// Parse duration values
	if ttl := os.Getenv("TB_DALLE_CACHE_TTL"); ttl != "" {
		if _, err := time.ParseDuration(ttl); err != nil {
			return fmt.Errorf("invalid cache TTL: %s", ttl)
		}
	}
	return nil
}
Configuration Examples
Production Setup
# Required
export OPENAI_API_KEY="sk-proj-production-key..."
# Performance optimization
export TB_DALLE_DATA_DIR="/fast-ssd/dalle-data"
export TB_DALLE_CACHE_TTL="2h"
export TB_DALLE_QUALITY="standard"
# Monitoring
export TB_DALLE_LOG_LEVEL="info"
export TB_DALLE_LOG_FORMAT="json"
export TB_DALLE_ARCHIVE_PROGRESS=1
Development Setup
# Required
export OPENAI_API_KEY="sk-proj-development-key..."
# Fast iteration
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_NO_CACHE=1
export TB_DALLE_LOG_LEVEL="debug"
# Testing
export TB_DALLE_TEST_MODE=1
export TB_DALLE_MOCK_DELAY="100ms"
Cost-Optimized Setup
# Required
export OPENAI_API_KEY="sk-proj-budget-key..."
# Minimize API costs
export TB_DALLE_NO_ENHANCE=1
export TB_DALLE_QUALITY="standard"
export TB_DALLE_SIZE="1024x1024"
export TB_DALLE_TTS_MODEL="tts-1"
Configuration Precedence
Configuration is applied in this order (highest to lowest precedence):
- Environment variables (highest)
- Code defaults (lowest)
This allows environment-specific overrides while maintaining sensible defaults.
Changelog / Design Notes
A narrative of why the system looks the way it does. Organized by themes instead of strict time order (the Git history is the authoritative chronological record).
Core Goals
- Deterministic scaffold around inherently stochastic model calls.
- Small, inspectable codebase (favor clarity over abstraction).
- File‑system artifact transparency (every stage leaves a tangible trace).
- Graceful degradation (disable enhancement, still useful; skip TTS, still complete).
Guiding Principles
- Code First Documentation – Book regenerated from code truths (no speculative claims).
- Single Responsibility – Each package holds a narrow concern (prompting, annotation, progress, etc.).
- Observable by Default – Prompts, titles, annotated outputs, progress snapshots all persisted.
- Conservative Concurrency – Simplicity beats micro‑optimizing parallel fetches at this scale.
- Extensibility via Composition – Add new artifact stages by composing functions, not subclassing.
Key Decisions & Trade‑Offs
Deterministic Attribute Selection
Pros: Reproducible prompt scaffold; enables cache hits and diffable changes. Cons: Less surprise/novelty without enhancement; requires curated attribute pools. Alternative rejected: Pure random selection every run (harder to test and reason about regressions).
Optional Enhancement Layer
Rationale: Keep base system self‑contained while still offering higher quality prompts when desired. Failure mode: Enhancement adds latency and an external dependency surface; disabled in CI / tests.
File System as Persistence & Log
Pros: Zero extra service dependencies; easy manual inspection; supports archival & reproducibility. Cons: No query semantics; potential disk churn. Mitigation: run pruning or archiving env var. Alternative: Database or object store abstraction (deferred until scale justifies).
EMA for Progress Estimation
Chosen over naive fixed weights: adapts to evolving performance characteristics (network latency, model changes). Trade‑off: Early runs yield noisy estimates until EMA stabilizes.
Single Image Pipeline (Sequential Variant Handling)
Instead of parallel HTTP requests: simpler error handling and deterministic progress order. Parallelization could be added later behind a flag if throughput becomes critical.
Annotation Banner Styling
Dynamic contrast analysis for legibility vs. fixed palette. Chosen to accommodate diverse image backgrounds without manual curation.
Minimal Interface Surface
Public API intentionally thin (manager functions + a few constructors). Internal refactors less likely to trigger downstream breakage.
Evolution Highlights
- Initial scaffold: context + prompt templates + image request.
- Added deterministic attribute engine (seed slicing) improving reproducibility.
- Introduced progress phase tracking; later augmented with EMA statistics.
- Added annotation stage for immediate visual branding / metadata embedding.
- Integrated optional prompt enhancement (OpenAI chat) with opt‑out flag.
- Added text‑to‑speech for richer multimodal artifact sets.
- Documentation rewrite to align with real behavior (this book).
Potential Future Enhancements
| Area | Idea | Rationale |
|---|---|---|
| Backend abstraction | Interface for image provider | Swap OpenAI with local or alternative APIs. |
| Streaming progress | WebSocket or SSE emitter | Real‑time UI updates without polling. |
| Partial resilience | Skip failed image variants | Improve success rate under transient network errors. |
| Caching layer | Hash → image reuse | Avoid regeneration for identical final prompts. |
| Local validation | Lint prompt templates | Detect missing template fields early. |
| Security hardening | Strict network client wrapper | Uniform retry / backoff / logging policy. |
| Benchmark suite | Performance regression tests | Track latency and resource trends. |
| Metrics export | Prometheus counters | Operational observability for long‑running service. |
Anti‑Goals (for now)
- Full local model replication (scope creep; focus stays orchestration layer).
- Complex plugin system (YAGNI; composition suffices).
- Database dependence (file artifacts adequate until scale pressure).
Risks & Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Upstream API contract changes | Breaks generation | Version pin + response struct adaptation tests. |
| Disk growth | Exhaust storage | Scheduled pruning / archival compression. |
| Attribute list drift | Unintended prompt shifts | Version control + diff review on list edits. |
| Enhancement latency spikes | Slower UX | Optional disable flag + timeout wrapping. |
| Progress misestimation | Poor UX feedback | EMA smoothing + UI disclaimers. |
Testing Philosophy
- Deterministic layers (attribute mapping, template expansion) thoroughly unit tested.
- Network edges minimized & stubbed in tests; external calls out of critical logic for testability.
- Visual artifact tests kept lightweight (e.g., hash banner size or presence rather than pixel perfection).
Refactoring Guidelines
- Preserve public function signatures unless strong justification.
- Introduce new feature flags / env vars for opt‑in behavior changes.
- Update documentation (this book) in the same commit as semantic changes.
- Maintain deterministic surfaces—if you introduce randomness, gate it behind an explicit parameter.
Changelog (High‑Level Summary)
(This section intentionally summarizes; consult Git log for exact commits.)
- v0.1: Base context, prompt templates, image generation.
- v0.2: Attribute seed mapping + series filtering.
- v0.3: Progress tracking + EMA.
- v0.4: Annotation banner & color analysis.
- v0.5: Prompt enhancement opt‑in.
- v0.6: Text‑to‑speech integration.
- v0.7: Comprehensive documentation rewrite (current state).
Glossary
- Dress / DalleDress – Structured concept derived from seed & attributes feeding templates.
- Enhancement – Model‑driven rewrite of base prompt.
- EMA – Exponential Moving Average used for phase duration estimation.
- Run GUID – Unique identifier for one full generation pipeline execution.
How to Contribute Design Changes
Open an issue outlining: Problem → Proposed Change → Alternatives → Impact on determinism → Migration considerations. Link directly to affected code lines (permalinks). Update this file if rationale extends beyond a short commit message.
Next: (End) – You have reached the final chapter.