A Context-Aware Interface for Iterative Image Generation and Editing
Apr 17, 2026

Abstract
Current AI image generation systems rely on prompt-based interaction for both creation and refinement. While effective for initial generation, this approach introduces friction during iteration, requiring users to repeatedly restate intentions in text. This work presents a prototype interface in which the system analyzes generated images and offers context-aware editing suggestions. Beyond improving efficiency, the approach is framed as a step toward augmenting human creativity, where the system participates in the iterative process by introducing relevant possibilities based on its understanding of the current state.
1. Introduction
The primary limitation of current generative systems is no longer the quality of their outputs, but their limited ability to support thought.
Most image generation tools operate through prompting. Users describe an idea, receive an output, and refine it by rewriting the description. This creates a workflow where iteration depends on repeated articulation.
However, creative work does not primarily operate through repeated description. It develops through interaction—seeing something, reacting to it, and adjusting it.
This paper explores an alternative interaction model that supports this process more directly.
2. Human Interaction as a Reference Model
Human collaboration is not built on a single mode of communication. It relies on a combination of:
- Language for expressing intent
- Pointing and selection for directing attention
- Shared artifacts for maintaining context over time
These elements allow ideas to evolve incrementally. A concept is introduced, observed, modified, and expanded through interaction with others and with the artifact itself.
Importantly, productivity and creativity tend to increase when multiple perspectives or possibilities are present. A collaborator does not simply execute instructions; they introduce alternatives, suggest directions, and expand the space of consideration.
This dynamic is largely absent in current AI image generation workflows.
3. Problem
Prompt-based systems reduce interaction to a single channel: language.
The typical workflow requires:
- Writing a prompt
- Evaluating the result
- Rewriting the prompt
This creates two key limitations:
- Loss of continuity: each refinement is treated as a new request
- Isolation of ideas: the system does not introduce alternatives unless explicitly prompted
As a result, the user carries the full burden of iteration. The system generates, but does not participate in the process of refinement.
4. Approach
This work introduces a context-aware interaction layer designed to address this limitation.
After generating an image, the system analyzes its output and identifies key elements within the scene. Based on this interpretation, it presents a set of relevant editing options.
These options are not generic controls, but are derived from what the system perceives in the image. For example:
- Identified subjects and their attributes
- Detected objects and accessories
- Lighting conditions and environment
- Visual style characteristics
The system then surfaces possible adjustments related to these elements.
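The mapping from perceived elements to candidate edits could take many forms. As one minimal sketch (the data model, category names, and edit templates below are illustrative assumptions, not the prototype's actual implementation), each detected element carries a category and a label, and a lookup table turns those into concrete suggestion strings:

```python
from dataclasses import dataclass

@dataclass
class SceneElement:
    category: str  # e.g. "subject", "object", "lighting", "style"
    label: str     # e.g. "person", "golden hour", "watercolor"

# Hypothetical mapping from element categories to candidate edits.
EDIT_TEMPLATES = {
    "subject": ["change pose of {label}", "adjust expression of {label}"],
    "object": ["remove {label}", "recolor {label}"],
    "lighting": ["soften {label}", "shift {label} to dusk"],
    "style": ["strengthen {label}", "blend {label} with photorealism"],
}

def suggest_edits(elements):
    """Derive context-aware editing options from perceived scene elements."""
    suggestions = []
    for el in elements:
        for template in EDIT_TEMPLATES.get(el.category, []):
            suggestions.append(template.format(label=el.label))
    return suggestions
```

In a real system the elements would come from a vision model analyzing the generated image, and the suggestion space would be richer than a static table; the sketch only illustrates that suggestions are derived from the image's content rather than from generic controls.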
5. Interaction Model
The interaction shifts from instruction to exchange.
Instead of requiring the user to specify every change, the system introduces possible directions for refinement. The user responds by selecting or adjusting these suggestions.
This creates a loop:
1. The system generates an output
2. The system interprets the output
3. The system proposes possible modifications
4. The user reacts and refines
Iteration becomes a continuous process grounded in the current state of the image.
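The loop above can be sketched as a small driver function. All four components here (generate, interpret, propose, choose) are placeholders standing in for the real model and interface pieces; the function names and the `base` keyword are assumptions made for illustration only:

```python
def iterate(generate, interpret, propose, choose, prompt, steps=3):
    """Run the generate-interpret-propose-refine loop.

    generate(prompt, base=None) -> image: produces or refines an image
    interpret(image) -> elements: the system interprets its output
    propose(elements) -> options: the system proposes modifications
    choose(options) -> option | None: the user reacts (None ends the loop)
    """
    image = generate(prompt)
    for _ in range(steps):
        elements = interpret(image)   # step 2: interpret the output
        options = propose(elements)   # step 3: propose modifications
        edit = choose(options)        # step 4: user reacts and refines
        if edit is None:
            break
        image = generate(edit, base=image)  # refine from the current state
    return image
```

The key property the sketch captures is that each refinement is grounded in the current image rather than restarting from a rewritten prompt.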
6. Augmenting Creative Thought
The significance of this approach extends beyond efficiency.
By introducing context-aware suggestions, the system acts as a secondary source of ideas. It does not replace user intent, but expands the set of available possibilities.
This mirrors a fundamental aspect of human collaboration, where creativity emerges not only from internal reasoning, but from exposure to alternative directions.
The system, in this case, provides:
- Immediate feedback based on the current artifact
- Alternative pathways that may not have been initially considered
- A shared context, where both user and system operate on the same visual state
Rather than functioning as a tool that executes commands, the system begins to operate as a participant in the creative process.
7. Discussion
7.1 Interaction as Augmentation
The goal is not to automate creativity, but to support it. By reducing the effort required to explore variations, the system allows users to focus on decision-making rather than articulation.
7.2 Multiple Streams of Possibility
Creativity often benefits from parallel ideas. Context-aware suggestions introduce additional “streams” of potential changes, similar to how collaborators contribute alternatives in a shared environment.
7.3 Reducing Dependence on Language
While prompting remains useful for initial generation, it is less effective for iterative refinement. Direct interaction with the artifact provides a more natural mode of control.
8. Limitations
The approach depends on accurate interpretation of generated images. If the system fails to identify relevant elements, the usefulness of its suggestions is reduced.
Additionally, overly prescriptive suggestions may limit exploration if not designed carefully.
9. Conclusion
Prompt-based systems are effective for initiating image generation but insufficient for supporting iterative creative workflows.
By introducing a context-aware interaction layer, this work demonstrates how AI systems can better align with how ideas develop—through observation, adjustment, and exposure to alternatives.
The broader implication is that the role of AI in creative tools may shift from generating outputs to augmenting the process of thinking itself.