The Core Problem

In naive LLM simulations, entities magically know things they shouldn’t:

Temporal Anachronism

Character references future event that hasn’t happened yet

Information Telepathy

Character knows private conversation they didn’t witness

Source Amnesia

Character states fact with no traceable origin

Omniscient Entities

All characters share narrator’s knowledge
The fundamental insight: Entities shouldn’t magically know things. Every piece of knowledge should have a traceable origin—who learned what, from whom, when, with what confidence.

M3: Exposure Event Tracking

What is an Exposure Event?

An exposure event is a logged record of knowledge acquisition:
ExposureEvent:
    entity_id: str                # Who learned
    event_type: EventType         # How they learned
    information: str              # What they learned
    source: Optional[str]         # From whom/what
    timestamp: datetime           # When
    confidence: float             # How certain (0.0-1.0)
    timepoint_id: str             # Where in causal chain
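The schema above can be sketched as a Python dataclass (the dataclass itself is illustrative; the shipped schema may differ in details):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ExposureEvent:
    entity_id: str             # Who learned
    event_type: str            # How they learned ("experienced", "told", ...)
    information: str           # What they learned
    source: Optional[str]      # From whom/what ("self" for originators)
    timestamp: datetime        # When
    confidence: float          # How certain (0.0-1.0)
    timepoint_id: str          # Where in the causal chain

# Madison originating the Virginia Plan, as in the example below
event = ExposureEvent(
    entity_id="james_madison",
    event_type="experienced",
    information="Virginia Plan constitutional framework",
    source="self",
    timestamp=datetime(1787, 5, 14),
    confidence=1.0,
    timepoint_id="virginia_plan_drafting",
)
```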

Event Types

experienced: Entity directly observed an event.
Example: "Madison witnessed Washington's speech at the convention"
Confidence: High (0.9-1.0)

told: Entity was informed by another entity; confidence reflects trust in the source.
Example: "Madison told Washington about the Virginia Plan"
Confidence: Variable (e.g. 0.85 for a trusted colleague)

The Validation Constraint

Iron Law:

entity.knowledge_state ⊆ {e.information for e in entity.exposure_events where e.timestamp ≤ query_timestamp}

An entity cannot know something without a recorded exposure event explaining how they learned it.
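The constraint is a plain set comparison. A minimal sketch, with exposure events represented as (information, timestamp) pairs for brevity:

```python
from datetime import datetime

def knowledge_is_grounded(knowledge_state, exposure_events, query_timestamp):
    """True iff every known item has an exposure event at or before query time."""
    accessible = {
        info for info, ts in exposure_events
        if ts <= query_timestamp
    }
    return set(knowledge_state) <= accessible

events = [("Virginia Plan constitutional framework", datetime(1787, 5, 25))]

# Exposure precedes the query: the knowledge is grounded
print(knowledge_is_grounded(
    ["Virginia Plan constitutional framework"], events, datetime(1787, 5, 29)))  # True

# Query precedes the exposure: temporal anachronism
print(knowledge_is_grounded(
    ["Virginia Plan constitutional framework"], events, datetime(1787, 5, 20)))  # False
```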

Example: Constitutional Convention

Scenario Timeline

1. May 14, 1787: Madison Creates Virginia Plan

ExposureEvent(
    entity_id="james_madison",
    event_type="experienced",
    information="Virginia Plan constitutional framework",
    source="self",  # Madison is the creator
    timestamp="1787-05-14",
    confidence=1.0,
    timepoint_id="virginia_plan_drafting"
)
2. May 25: Madison Shares with Washington

# Madison's telling
ExposureEvent(
    entity_id="george_washington",
    event_type="told",
    information="Virginia Plan constitutional framework",
    source="james_madison",
    timestamp="1787-05-25",
    confidence=0.85,  # Trust in Madison
    timepoint_id="washington_madison_meeting"
)
3. May 29: Washington References Plan ✅

# VALID: Washington has exposure from Step 2
dialog_turn = DialogTurn(
    speaker="george_washington",
    content="As Madison's proposal outlines, we need strong federal powers",
    knowledge_references=["Virginia Plan constitutional framework"],
    timestamp="1787-05-29"
)

# Validation passes:
knowledge_accessible = check_exposure_events(
    entity="george_washington",
    knowledge="Virginia Plan constitutional framework",
    query_time="1787-05-29"
)
# → Returns exposure from May 25
4. May 29: Jefferson References Plan ❌

# INVALID: Jefferson not present, no exposure
dialog_turn = DialogTurn(
    speaker="thomas_jefferson",
    content="Madison's Virginia Plan is too centralist",
    knowledge_references=["Virginia Plan constitutional framework"],
    timestamp="1787-05-29"
)

# Validation FAILS:
knowledge_accessible = check_exposure_events(
    entity="thomas_jefferson",
    knowledge="Virginia Plan constitutional framework",
    query_time="1787-05-29"
)
# → No exposure events found
# → ValidationError: Temporal anachronism detected
Jefferson was in Paris as ambassador during the convention. He cannot know about internal deliberations.

Causal Audit Trail

Exposure Events Form a DAG

[Figure: exposure event directed acyclic graph]
Nodes are information items; edges are causal relationships (who learned from whom).

Walking the Graph

def trace_knowledge_origin(entity_id: str, knowledge: str, store: GraphStore):
    """Walk the exposure graph backward to find the ultimate source."""
    events = store.get_exposure_events(entity_id, information=knowledge)

    if not events:
        return None  # No provenance!

    # Walk backward through sources
    path = []
    current_entity = entity_id

    while events:  # stop if an upstream entity has no recorded exposure
        event = events[0]  # Most recent event for this entity
        path.append({
            "entity": current_entity,
            "source": event.source,
            "type": event.event_type,
            "confidence": event.confidence,
            "timestamp": event.timestamp
        })

        if event.source == "self" or event.source is None:
            break  # Reached origin

        current_entity = event.source
        events = store.get_exposure_events(current_entity, information=knowledge)

    return path
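The same backward walk, reduced to a self-contained sketch: a plain dict stands in for the store, mapping each entity to its (source, event_type) record for the Virginia Plan example.

```python
# Hypothetical two-hop provenance chain: Madison originated the plan,
# Washington learned it from Madison.
chain = {
    "george_washington": ("james_madison", "told"),
    "james_madison": ("self", "experienced"),
}

def trace(entity, chain):
    """Walk backward through sources until reaching a 'self' origin."""
    path = []
    while entity in chain:
        source, event_type = chain[entity]
        path.append((entity, source, event_type))
        if source == "self":
            break  # Reached origin
        entity = source
    return path

print(trace("george_washington", chain))
# [('george_washington', 'james_madison', 'told'),
#  ('james_madison', 'self', 'experienced')]
```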

Counterfactual Reasoning

Exposure graphs enable “what if” queries:
# Remove the exposure event
store.delete_exposure_event(
    entity_id="george_washington",
    information="Virginia Plan constitutional framework",
    timestamp="1787-05-25"
)

# Re-run simulation from May 26 forward
branch = create_counterfactual_branch(
    parent_timeline=baseline,
    intervention_point="may_25_meeting",
    intervention=Intervention(
        type="knowledge_removal",
        target="george_washington",
        parameters={"knowledge": "Virginia Plan constitutional framework"}
    )
)

# Washington's May 29 dialog now CANNOT reference Virginia Plan
# System generates alternative dialog without that knowledge
This enables causal impact analysis: How much did this specific knowledge transfer matter?

M19: Knowledge Extraction Agent

The Problem with Naive Extraction

Early approaches used capitalization heuristics:
# BROKEN: Naive extraction
def extract_knowledge_references(content: str) -> List[str]:
    words = content.split()
    knowledge_items = []
    for word in words:
        clean = word.strip('.,!?;:"\'-()[]{}')  
        if clean and len(clean) > 3 and clean[0].isupper():
            knowledge_items.append(clean.lower())
    return list(set(knowledge_items))

# Result from dialog:
# ["we'll", "thanks", "what", "michael", "i've"]  # GARBAGE
This catches sentence-initial words, contractions, common words, names without context—all useless.

The M19 Solution: LLM-Based Extraction

An LLM agent receives:
  1. Dialog turns to analyze
  2. Causal graph context (existing knowledge)
  3. Entity metadata (who’s speaking, who’s listening)
It returns structured KnowledgeItem objects:
KnowledgeItem:
    content: str                # Complete semantic unit
    speaker: str                # Entity who communicated
    listeners: List[str]        # Entities who received it
    category: str               # fact, decision, opinion, plan, revelation, question, agreement
    confidence: float           # 0.0-1.0, extraction confidence
    context: Optional[str]      # Why this matters
    causal_relevance: float     # 0.0-1.0, importance for causal chain

What Gets Extracted

  • Facts: “The meeting is scheduled for 3pm Tuesday”
  • Decisions: “The board approved the $2M budget increase”
  • Revelations: “Sarah revealed the prototype failed last week”
  • Plans: “We’ll launch the product in Q3 2025”
  • Agreements: “Everyone agreed to postpone until we have more data”

Knowledge Categories

| Category | Description | Example | Causal Relevance |
|----------|-------------|---------|------------------|
| fact | Verifiable information | "The competitor filed patent #8,123,456" | High (0.8-1.0) |
| decision | Communicated choice | "We decided to pivot to B2B" | Very High (0.9-1.0) |
| opinion | Subjective view | "I think the design needs work" | Medium (0.4-0.6) |
| plan | Intended future action | "We'll hire 3 engineers in Q2" | High (0.7-0.9) |
| revelation | New info changing understanding | "The acquisition talks fell through" | Very High (0.9-1.0) |
| question | Query revealing information | "Did you know about the layoffs?" | Low-Medium (0.3-0.5) |
| agreement | Consensus reached | "We all agree on the pricing strategy" | High (0.7-0.9) |
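The ranges above can be carried in code as a simple lookup table. The midpoint default below is an illustrative choice, not part of the documented schema:

```python
# Causal-relevance ranges per category, as listed in the table
CATEGORY_RELEVANCE = {
    "fact": (0.8, 1.0),
    "decision": (0.9, 1.0),
    "opinion": (0.4, 0.6),
    "plan": (0.7, 0.9),
    "revelation": (0.9, 1.0),
    "question": (0.3, 0.5),
    "agreement": (0.7, 0.9),
}

def default_relevance(category: str) -> float:
    """Midpoint of the documented range (an illustrative default)."""
    lo, hi = CATEGORY_RELEVANCE[category]
    return (lo + hi) / 2
```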

RAG-Aware Prompting

The extraction agent receives causal context from existing exposure events:
def build_causal_context(entities, store):
    """Build context from existing knowledge for extraction agent."""
    context = []
    for entity in entities:
        # Get recent exposure events
        exposures = store.get_exposure_events(entity.entity_id, limit=10)
        
        # Include static knowledge
        static = entity.entity_metadata.get("knowledge_state", [])
        
        context.append({
            "entity": entity.entity_id,
            "known_facts": [e.information for e in exposures],
            "static_knowledge": static
        })
    
    return context
This enables the agent to:
  1. Avoid redundant extraction: Don’t store facts already in system
  2. Recognize novel information: New facts worth storing
  3. Understand relationships: How new knowledge connects to existing
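Point 1 can be approximated with a set-difference pass over the causal context. This is an exact-match sketch; the actual agent performs this comparison semantically via the LLM:

```python
def filter_novel_items(extracted_items, causal_context):
    """Drop extracted items any entity already knows (exact-match sketch)."""
    known = set()
    for entry in causal_context:
        known.update(entry["known_facts"])
        known.update(entry["static_knowledge"])
    return [item for item in extracted_items if item not in known]

# Hypothetical context in the shape build_causal_context produces
context = [
    {"entity": "james_madison",
     "known_facts": ["Virginia Plan constitutional framework"],
     "static_knowledge": ["delegate to the convention"]},
]
extracted = [
    "Virginia Plan constitutional framework",   # already known: dropped
    "Washington favors strong federal powers",  # novel: kept
]
print(filter_novel_items(extracted, context))
# ['Washington favors strong federal powers']
```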

M4: Constraint Enforcement

Five Conservation Laws

Timepoint Pro enforces consistency using conservation-law metaphors:
Law 1 (Information Conservation): Knowledge state cannot exceed exposure history
def validate_information(entity, context):
    knowledge = set(entity.knowledge_state)
    exposure = set(e.information for e in context["exposure_history"])
    violations = knowledge - exposure
    return ValidationResult(
        valid=len(violations) == 0,
        violations=list(violations)
    )
Analogy: Information is conserved like energy—can’t create it from nothing
Law 2 (Energy Budget): Entities have bounded cognitive/physical energy per timepoint
def validate_energy(entity, actions):
    total_cost = sum(action.energy_cost for action in actions)
    available = entity.cognitive_tensor.energy_budget
    deficit = total_cost - available

    return ValidationResult(
        valid=deficit <= 0,
        violations=[f"Energy deficit: {deficit:.1f}"] if deficit > 0 else []
    )
Analogy: Can’t spend more energy than you have
Law 3 (Behavioral Inertia): Personality traits persist; sudden changes require justification
import numpy as np

def validate_behavior(entity, new_behavior, timespan):
    old_traits = entity.behavior_vector
    new_traits = new_behavior.behavior_vector

    delta = np.linalg.norm(new_traits - old_traits)
    max_change = 0.1 * timespan.days  # 10% drift per day max

    return ValidationResult(
        valid=delta <= max_change,
        violations=[f"Behavior shift too rapid: {delta:.2f} > {max_change:.2f}"]
                   if delta > max_change else []
    )
Analogy: Momentum—entities have inertia, can’t change direction instantly
Law 4 (Biological Constraints): Physical limitations constrain behavior
def validate_biological(entity, action):
    violations = []
    
    if action.requires_mobility and entity.physical_tensor.mobility < 0.3:
        violations.append("Action requires mobility entity lacks")
    
    if action.location_required and entity.physical_tensor.location != action.location:
        violations.append("Entity not at required location")
    
    return ValidationResult(
        valid=len(violations) == 0,
        violations=violations
    )
Analogy: Physical constraints are hard limits
Law 5 (Network Flow): Information propagates along relationship edges
import networkx as nx

def validate_network_flow(knowledge_item, source, target, graph):
    # shortest_path raises NetworkXNoPath when no route exists;
    # it never returns an empty path
    try:
        path = nx.shortest_path(graph, source, target)
    except nx.NetworkXNoPath:
        return ValidationResult(
            valid=False,
            violations=[f"No information path from {source} to {target}"]
        )

    # Check trust levels along path
    min_trust = min(graph[u][v]["trust_level"] for u, v in zip(path[:-1], path[1:]))

    return ValidationResult(
        valid=min_trust > 0.3,  # Threshold for information flow
        violations=[] if min_trust > 0.3
                   else [f"Trust too low along path: {min_trust:.2f}"]
    )
Analogy: Information flows like water through pipes (relationship network)

Castaway Colony Example

# Check all constraints
validate_information(sharma, context)
# ✅ Sharma has exposure: "power coupling location" from Day 3 debris survey

validate_energy(sharma, [repair_action])
# ✅ repair_action costs 40 energy, Sharma has 65 available

validate_biological(sharma, repair_action)
# ✅ Sharma's mobility is 0.8 (healthy), location matches debris field

# Action proceeds
Important: Specific numerical values (O₂ rates, radiation levels, etc.) in simulation output are LLM-generated narrative, not computed by the engine. The engine enforces structural constraints (information conservation, energy budgets, behavioral inertia), not physics calculations.

PORTAL Mode: Causal Time Filtering

The Challenge

In PORTAL mode (backward reasoning), characters exist at multiple timepoints but with different causal positions. A character in 2028 cannot know about events from 2030 in their past.

Knowledge Stripping

def filter_knowledge_by_causal_time(entity, timepoint, store):
    """Remove knowledge from causally inaccessible timepoints."""
    # Walk causal_parent chain to build ancestor set
    ancestors = set()
    current = timepoint
    
    while current:
        ancestors.add(current.timepoint_id)
        parent_id = current.causal_parent
        current = store.get_timepoint(parent_id) if parent_id else None
    
    # Filter exposure events to only ancestors
    accessible_events = [
        e for e in store.get_exposure_events(entity.entity_id)
        if e.timepoint_id in ancestors
    ]
    
    return accessible_events
Scenario: Presidential campaign portal, endpoint 2040, character at the 2028 step.

Full Knowledge Graph:
  • “Campaign strategy meeting July 2027” ✅ Accessible
  • “Primary victory March 2028” ✅ Accessible
  • “Running mate selection June 2029” ❌ Not yet happened
  • “General election debate Oct 2029” ❌ Not yet happened
  • “Inauguration January 2040” ❌ Endpoint, not accessible
Filtered Knowledge (what character actually knows in 2028):
  • Campaign strategy
  • Primary victory
Dialog Generation: Uses only the filtered knowledge; the character cannot reference or anticipate events from 2029 onward.
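The ancestor walk inside filter_knowledge_by_causal_time can be demonstrated with plain dicts standing in for the store (the timepoint IDs below are made up for the sketch):

```python
# causal_parent chain: each timepoint maps to its parent (None at the root)
timepoints = {
    "tp_2027": None,
    "tp_2028": "tp_2027",
    "tp_2029": "tp_2028",
}

def ancestor_ids(timepoint_id, timepoints):
    """Walk the causal_parent chain back to the root, collecting IDs."""
    ancestors = set()
    current = timepoint_id
    while current is not None:
        ancestors.add(current)
        current = timepoints[current]
    return ancestors

events = [
    ("Campaign strategy meeting July 2027", "tp_2027"),
    ("Primary victory March 2028", "tp_2028"),
    ("Running mate selection June 2029", "tp_2029"),
]

# A character positioned at the 2028 step sees only 2027-2028 events
accessible = [info for info, tp in events
              if tp in ancestor_ids("tp_2028", timepoints)]
print(accessible)
# ['Campaign strategy meeting July 2027', 'Primary victory March 2028']
```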

Integration with Dialog Synthesis (M11)

Knowledge extraction happens automatically during dialog:
# In synthesize_dialog():

# 1. Generate dialog (M11)
dialog_data = llm.generate_dialog(prompt, max_tokens=2000)

# 2. Extract knowledge using M19 agent
extraction_result = extract_knowledge_from_dialog(
    dialog_turns=dialog_data.turns,
    entities=entities,
    timepoint=timepoint,
    llm=llm,
    store=store
)

# 3. Create exposure events (M19→M3)
exposure_events = create_exposure_events_from_knowledge(
    extraction_result=extraction_result,
    timepoint=timepoint,
    store=store
)

# 4. Validate all new knowledge references
for turn in dialog_data.turns:
    for knowledge_ref in turn.knowledge_references:
        validate_information(turn.speaker, {"knowledge": knowledge_ref})
        # Raises ValidationError if no exposure event

Preventing Anachronisms: The Complete Pipeline

1. Dialog Generation with Context

M11 generates dialog using entity’s filtered knowledge state (only causally accessible items)
2. Knowledge Extraction

M19 extracts semantic knowledge items from dialog turns
3. Exposure Event Creation

M3 creates exposure events for all listeners:
  • event_type="told"
  • source=speaker
  • confidence based on speaker’s credibility
4. Constraint Validation

M4 validates all knowledge references:
  • Information conservation
  • Network flow
  • Temporal ordering (PORTAL mode)
5. Causal Graph Update

Exposure events added to DAG, enabling future tracing
[Figure: knowledge provenance pipeline]

API Examples

from generation import GraphStore

store = GraphStore("simulation.db")

# Get all exposure events for entity
events = store.get_exposure_events("george_washington")

for event in events:
    print(f"{event.timestamp}: {event.event_type}")
    print(f"  Learned: {event.information}")
    print(f"  From: {event.source}")
    print(f"  Confidence: {event.confidence:.2f}")

Next Steps

  • Temporal Modes: How PORTAL mode uses causal filtering
  • Fidelity Management: Resolution levels and TTM tensors
  • All 19 Mechanisms: Complete technical architecture