Image Generation and Prompt Engineering

In this notebook, we generate images from text prompts using a diffusion model and explore how the words we choose, including style modifiers, change what the model creates.

import openai
import os
import base64
from PIL import Image
from io import BytesIO
from IPython.display import display

client = openai.OpenAI(
  base_url='https://openrouter.ai/api/v1',
  api_key=os.environ["OPENROUTER_API_KEY"]
)

MODEL = "black-forest-labs/flux.2-klein-4b"

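def display_image(image_url):
  """Decode a base64-encoded image (data URL or bare base64 string) and show it inline."""
  encoded = image_url
  # Data URLs look like "data:image/png;base64,<payload>"; keep only the payload.
  if encoded.startswith("data:"):
    encoded = encoded.split(",", 1)[1]
  image_data = base64.b64decode(encoded)
  # Wrap the raw bytes in a file-like object so PIL can open them.
  image = Image.open(BytesIO(image_data))
  display(image)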
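
To make the data URL handling in the helper concrete, here is a minimal, standard-library-only sketch of the same split-and-decode step. The function name `split_data_url` and the sample URL are hypothetical and used only for illustration; the sketch assumes the payload is base64-encoded, and it uses the 8-byte PNG signature as a stand-in for real image data.

```python
import base64

# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def split_data_url(data_url):
    """Split a base64 data URL into (media_type, raw_bytes). Hypothetical helper."""
    # Everything before the first comma is the header, e.g. "data:image/png;base64".
    header, payload = data_url.split(",", 1)
    media_type = header[len("data:"):].split(";", 1)[0]
    return media_type, base64.b64decode(payload)

# Build a stand-in data URL from the PNG signature alone (not a full image).
sample_url = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE).decode("ascii")

media_type, raw = split_data_url(sample_url)
print(media_type, raw == PNG_SIGNATURE)
```

The decoded bytes are what `display_image` hands to PIL; checking the leading bytes like this can be a quick way to confirm that a response really contains image data.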

Part 1: Your First Generated Image

Edit the prompt below and run the cell to generate an image. Try to be descriptive — the more detail you give the model, the more control you have over the result.

PROMPT = "A Mongolian Bankhar dog running across the steppe at sunset" #@param

response = client.chat.completions.create(
  model=MODEL,
  messages=[{"role": "user", "content": PROMPT}],
  extra_body={"modalities": ["image"]}
)

image_url = response.choices[0].message.images[0]["image_url"]["url"]
display_image(image_url)
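
If the model returns no image (for example, because of a refused prompt or a model that does not support image output), the indexing above would fail with a hard-to-read error. The sketch below shows one possible way to guard that lookup. `extract_image_url` is a hypothetical helper, not part of any library, and the stand-in objects only assume the same response shape used in the cell above; no network call is made.

```python
from types import SimpleNamespace

def extract_image_url(response):
    """Return the first image URL in a response, or raise a readable error."""
    message = response.choices[0].message
    # The images field may be missing or empty when no image was generated.
    images = getattr(message, "images", None)
    if not images:
        raise ValueError(
            "The response contained no images; check the model and the modalities setting."
        )
    return images[0]["image_url"]["url"]

# Stand-in objects shaped like the response used above (assumed shape).
fake_ok = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(
    images=[{"image_url": {"url": "data:image/png;base64,AAAA"}}]))])
fake_empty = SimpleNamespace(choices=[SimpleNamespace(
    message=SimpleNamespace(content="no image"))])

print(extract_image_url(fake_ok))
try:
    extract_image_url(fake_empty)
except ValueError as err:
    print("Error:", err)
```

With a helper like this, the last two lines of the cell could become `display_image(extract_image_url(response))`.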

Take a moment to reflect on the image you generated.

How closely did the generated image match what you imagined? What would you change about the prompt to get a different or better result?

Part 2: Style Modifiers

One of the most powerful prompt engineering techniques is adding style modifiers — words that tell the model how to render the scene, not just what to show.

Compare these two prompts:

- A cat sitting on a chair
- A cat sitting on a chair, oil painting style

The subject is identical, but the visual result is completely different. Try it below — use the same prompt from Part 1 and experiment with different styles.

PROMPT = "A Mongolian Bankhar dog running across the steppe at sunset" #@param
STYLE = "oil painting" #@param ["oil painting", "watercolor", "anime", "photorealistic", "pixel art"]
LIGHTING = "studio lighting" #@param ["studio lighting", "dramatic shadows", "soft diffused"]
COMPOSITION = "close up" #@param ["close up", "wide-angle", "rule of thirds"]

styled_prompt = f"{PROMPT}, {STYLE} style, {LIGHTING}, {COMPOSITION}"
print(f"Sending prompt: {styled_prompt}")

response = client.chat.completions.create(
  model=MODEL,
  messages=[{"role": "user", "content": styled_prompt}],
  extra_body={"modalities": ["image"]}
)

image_url = response.choices[0].message.images[0]["image_url"]["url"]
display_image(image_url)

Compare the styled image with the one from Part 1.

How did adding a style modifier change the result? Which style did you find most interesting, and why? What other style words might you want to try?
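
One way to explore other style words is to build several prompt variants at once and send each one in turn. The sketch below only assembles the prompt strings, following the same "subject, style, lighting, composition" pattern as `styled_prompt`; the helper name `build_styled_prompts` is hypothetical, and no images are generated here.

```python
def build_styled_prompts(prompt, styles, lighting=None, composition=None):
    """Return a {style: full prompt} mapping, one entry per style word."""
    prompts = {}
    for style in styles:
        parts = [prompt, f"{style} style"]
        # Lighting and composition are optional modifiers.
        if lighting:
            parts.append(lighting)
        if composition:
            parts.append(composition)
        prompts[style] = ", ".join(parts)
    return prompts

variants = build_styled_prompts(
    "A cat sitting on a chair",
    ["oil painting", "watercolor", "pixel art"],
    lighting="soft diffused",
)
for style, text in variants.items():
    print(f"{style}: {text}")
```

Each value in `variants` could then be passed as the message content in the same request used above, which keeps the subject fixed while only the style changes.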

{
  "question_type": "true_false",
  "question": "Text-to-image models use the diffusion process to generate images from a text prompt.",
  "answer": "True",
  "submitted_answer": ""
}

{
  "question_type": "multiple_choice",
  "question": "What is the effect of adding a style modifier like 'watercolor' to an image generation prompt?",
  "options": [
    { "key": "a", "text": "It changes the subject of the image" },
    { "key": "b", "text": "It controls the size of the generated image" },
    { "key": "c", "text": "It changes how the image looks while keeping the same subject" },
    { "key": "d", "text": "It makes the model run faster" }
  ],
  "answer": "c",
  "submitted_answer": ""
}

{
  "question_type": "freeform",
  "question": "What term describes words added to a prompt, like 'oil painting' or 'anime', that control the visual style of the generated image?",
  "answer": "style modifiers",
  "submitted_answer": ""
}