Gemini Image Gen Operator
Overview
The Gemini Image Gen LOP generates images with Google’s Gemini models, specifically the experimental gemini-2.0-flash-exp-image-generation model accessed through LiteLLM. It takes a text prompt (and optionally an input image or context from a Context Grabber), generates an image, saves it to a specified directory, and logs the process in a history table.
Requirements
- Python Packages:
  - litellm: For interacting with the Gemini API.
  - Pillow (PIL): For image processing.
  - opencv-python (cv2): For image conversion.
  - numpy: Required by OpenCV.
  - Install these via the ChatTD Python Manager.
- ChatTD Operator: Required for dependency management (package installation) and asynchronous task execution. Ensure the ChatTD Operator parameter on the ‘About’ page points to your configured ChatTD instance.
- Gemini API Key: A valid API key from Google AI Studio is required. Obtain one and enter it into the Gemini API Key parameter.
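Before generating, it can help to confirm that the packages listed above are importable from the Python environment TouchDesigner uses. The following is a minimal sketch using only the standard library. The helper name missing_packages is hypothetical and not part of the operator, and the import names assume the usual mapping (Pillow imports as PIL, opencv-python imports as cv2).

```python
import importlib.util

# Import names for the required distributions
# (Pillow imports as PIL, opencv-python imports as cv2).
REQUIRED_MODULES = ["litellm", "PIL", "cv2", "numpy"]


def missing_packages(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported missing, install it through the ChatTD Python Manager rather than a system-wide pip, so the package lands in the environment ChatTD manages.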
Input/Output
Inputs
- Input DAT (input_prompt, optional): If Prompt Source is set to input_dat, this table is used to construct the prompt. It should contain rows with role (‘user’ or ‘assistant’) and message columns.
- Input Image TOP (optional): Connected via the Input Image (Optional) parameter. Used for image-to-image tasks (if supported by the model/prompting).
- Context Grabber COMP (optional): Connected via the Context Grabber (Optional) parameter. Allows adding context (text and images) from another operator to the prompt.
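To make the expected table layout concrete, here is a rough sketch of how rows with role and message columns could be turned into a chat-style message list. This is an illustration only: the function rows_to_messages is hypothetical, and the operator's actual prompt construction is not documented here. The role/content dictionary shape is the OpenAI-style message format that LiteLLM accepts.

```python
def rows_to_messages(rows):
    """Convert table rows (header row first) into chat-style message dicts."""
    header = [cell.strip().lower() for cell in rows[0]]
    role_col = header.index("role")
    message_col = header.index("message")

    messages = []
    for row in rows[1:]:
        role = row[role_col].strip().lower()
        text = row[message_col].strip()
        if role not in ("user", "assistant"):
            raise ValueError(f"Unsupported role: {role!r}")
        if text:  # skip rows with an empty message
            messages.append({"role": role, "content": text})
    return messages


example_rows = [
    ["role", "message"],
    ["user", "A watercolor painting of a lighthouse at dusk"],
    ["assistant", "Here is a first version."],
    ["user", "Make the sky more dramatic"],
]
example_messages = rows_to_messages(example_rows)
```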
Outputs
- Image Files: Generated images are saved as PNG files in the directory specified by Output Directory (or a default location within the ChatTD environment).
- Metadata Files: JSON files containing details about each generation job (prompt, timestamp, model, paths) are saved alongside the images.
- History DAT (history_dat): An internal table logging each generation attempt, including job ID, prompt, timestamp, status, model used, and paths to the generated image and metadata files.
- Viewer TOP (image_viewer): Displays the image selected by the Display Image Index parameter.
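Because the metadata is plain JSON stored next to the images, it can be inspected outside the operator. The sketch below loads every JSON file in an output directory. It assumes the metadata files use a .json extension, and it deliberately makes no assumption about the key names inside each file, since the exact schema is not specified above. The function load_metadata is a hypothetical helper, not part of the operator.

```python
import json
from pathlib import Path


def load_metadata(output_dir):
    """Load every *.json file in output_dir, keyed by file name."""
    records = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open("r", encoding="utf-8") as handle:
            records[path.name] = json.load(handle)
    return records
```

For example, load_metadata("gemini_images_test") would return one dictionary per job found in the default output directory.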
Parameters
Page: Gemini
Gemini API Key (Apikey)
op('gemini_image_gen').par.Apikey
Str - Default: API KEY LOADED
Get API Key (Getapikey)
op('gemini_image_gen').par.Getapikey
Pulse - Default: None
Input Image (Optional) (Inputimage)
op('gemini_image_gen').par.Inputimage
TOP - Default: None
Context Grabber (Optional) (Contextgrabber)
op('gemini_image_gen').par.Contextgrabber
COMP - Default: None
Prompt (Prompt)
op('gemini_image_gen').par.Prompt
Str - Default: None
Generate Image (Generate)
op('gemini_image_gen').par.Generate
Pulse - Default: None
Output Directory (Outputdir)
op('gemini_image_gen').par.Outputdir
Folder - Default: gemini_images_test
Status (Status)
op('gemini_image_gen').par.Status
Str - Default: GeminiImageGen
Active (Active)
op('gemini_image_gen').par.Active
Toggle - Default: 0
- Options: off, on
Display Image Index (Displayimage)
op('gemini_image_gen').par.Displayimage
Int - Default: 1
- Range: 1 to N/A
- Slider Range: 1 to N/A
Auto-select Last Image (Setdisplay)
op('gemini_image_gen').par.Setdisplay
Toggle - Default: 1
- Options: off, on
Generate on Input Change (Onin1)
op('gemini_image_gen').par.Onin1
Toggle - Default: 1
- Options: off, on
Page: About
Bypass (Bypass)
op('gemini_image_gen').par.Bypass
Toggle - Default: 0
- Options: off, on
Show Built-in Parameters (Showbuiltin)
op('gemini_image_gen').par.Showbuiltin
Toggle - Default: 0
- Options: off, on
Version (Version)
op('gemini_image_gen').par.Version
Str - Default: 1.0.0
Last Updated (Lastupdated)
op('gemini_image_gen').par.Lastupdated
Str - Default: 2025-05-02
Creator (Creator)
op('gemini_image_gen').par.Creator
Str - Default: dotsimulate
Website (Website)
op('gemini_image_gen').par.Website
Str - Default: https://dotsimulate.com
ChatTD Operator (Chattd)
op('gemini_image_gen').par.Chattd
OP - Default: /dot_lops/ChatTD
Usage Examples
Basic Image Generation
1. Enter your Gemini API Key in the 'Gemini API Key' parameter.
2. Ensure 'Prompt Source' is set to 'parameter'.
3. Enter your desired prompt in the 'Prompt' parameter (e.g., "A photorealistic cat wearing sunglasses riding a skateboard").
4. Pulse the 'Generate Image' button.
5. Monitor the 'Status' parameter.
6. View the generated image in the operator viewer or the specified 'Output Directory'.
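The same steps can be driven from a script. The sketch below uses the Prompt and Generate parameter names from the Parameters section and TouchDesigner's standard parameter assignment and pulse() call. The helper generate_image is a hypothetical wrapper, and the operator is passed in as an argument so the function does not depend on a particular network path.

```python
def generate_image(gen_op, prompt):
    """Set the prompt on the operator and pulse its Generate parameter."""
    gen_op.par.Prompt = prompt    # 'Prompt' parameter (Str)
    gen_op.par.Generate.pulse()   # 'Generate Image' parameter (Pulse)
```

Inside TouchDesigner this could be called as generate_image(op('gemini_image_gen'), "A photorealistic cat wearing sunglasses riding a skateboard"), where op is TouchDesigner's built-in operator lookup. Generation is asynchronous, so the 'Status' parameter will not show the final result immediately after the call returns.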
Generating from Input DAT
1. Set 'Prompt Source' to 'input_dat'.
2. Create a Table DAT with columns 'role' and 'message'.
3. Add rows with roles 'user' or 'assistant' and your prompt message(s).
4. Connect this DAT to the first input of the GeminiImageGen operator.
5. Ensure 'Generate on Input Change' is enabled if you want automatic generation; otherwise, pulse 'Generate Image'.
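Steps 2 and 3 can also be scripted. The following sketch fills a Table DAT with the expected header and rows using the Table DAT methods clear() and appendRow(). The helper name fill_prompt_table and the operator name in the usage note are hypothetical.

```python
def fill_prompt_table(table_dat, messages):
    """Rewrite table_dat with a role/message header followed by one row per message.

    messages is an iterable of (role, message) pairs, where role is
    'user' or 'assistant'.
    """
    table_dat.clear()
    table_dat.appendRow(["role", "message"])
    for role, message in messages:
        table_dat.appendRow([role, message])
```

In TouchDesigner this might be called as fill_prompt_table(op('prompt_table'), [("user", "A neon city skyline in the rain")]), assuming a Table DAT named prompt_table is wired to the operator's first input. With 'Generate on Input Change' enabled, rewriting the table may trigger a new generation.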
Using an Input Image
1. Connect a TOP containing your input image to the 'Input Image (Optional)' parameter.
2. Craft your 'Prompt' to instruct the model on how to use the input image (e.g., "Edit this image to make the sky purple", "Describe this image in detail"). Specific prompt techniques depend on the model's capabilities.
3. Pulse 'Generate Image'.
Using a Context Grabber
1. Connect a configured Context Grabber operator to the 'Context Grabber (Optional)' parameter.
2. The text and images collected by the Context Grabber will be automatically included in the prompt sent to Gemini.
3. Enter a main instruction in the 'Prompt' parameter if needed.
4. Pulse 'Generate Image'.
Technical Notes
- API Key: Your Gemini API key is stored securely in a configuration file within your ChatTD environment or retrieved from the ChatTD Key Manager.
- Dependencies: Requires litellm, Pillow, opencv-python, and numpy. Use ChatTD’s Python Manager to install these.
- File Saving: Images are saved as PNG files. Metadata is saved as JSON.
- Asynchronous Operation: Image generation happens asynchronously via ChatTD’s TDAsyncIO, preventing TouchDesigner from freezing.
- Response Handling: The operator extracts the base64 image data from the API response. If the response contains text alongside the image, this text is stored in the response_text column of the history_dat table.
- Input Image Encoding: Input TOPs are converted to base64-encoded JPEG data URIs before being sent to the API.
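The two base64 steps mentioned above (encoding an input image as a JPEG data URI, and decoding image data from a response) can be illustrated with the standard library alone. This is a sketch of the general technique, not the operator's internal code. Both function names are hypothetical, and the input is assumed to be bytes that are already JPEG-encoded.

```python
import base64


def to_jpeg_data_uri(jpeg_bytes):
    """Wrap already-encoded JPEG bytes in a base64 data URI."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"


def decode_image_payload(payload):
    """Decode base64 image data given as a bare string or as a data URI."""
    if payload.startswith("data:"):
        # Drop the 'data:<mime>;base64,' prefix and keep only the payload.
        payload = payload.split(",", 1)[1]
    return base64.b64decode(payload)
```

The decoded bytes can then be written to disk as-is, for example with open(path, "wb"), provided the payload is already in the desired image format.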
Related Operators
- ChatTD: Provides core services like dependency management, API key management, and asynchronous task execution required by this operator.
- Context Grabber: Can be used to provide additional text and image context to the generation prompt.