Breathing new life into archive images with Gemini

Before we dive in:
- I am an engineer at Google Cloud. The views and opinions expressed here are my own.
- The full source code for this article, including future updates, is available under the Apache License 2.0.
- All new images in this article were generated with Gemini (Nano Banana) using the proof-of-concept generation pipeline presented here.
- You can try Gemini for free in Google AI Studio. Please note that programmatic API access to Nano Banana is a pay-as-you-go service.
Challenge 🔥
We all have archive images that would be worth reusing in different contexts. That usually means editing them, a complicated (if not impossible) task requiring specific skills and tools. It also means our archives are full of forgotten or unused assets. State-of-the-art image models have now appeared, letting us revisit the problem.
So, can we breathe new life into our visual archives?
Let's try to tackle this challenge with the following steps:
- 1️⃣ Start from an archive image we wish to reuse
- 2️⃣ Extract the character to create a new reference image
- 3️⃣ Generate a series of images illustrating the character's journey, using only prompts and our new assets
For this, we will test the capabilities of “Gemini 2.5 Flash Image”, also known as “Nano Banana” 🍌.
🏁 Setup
🐍 Python Packages
We will use the following packages:
- google-genai: the Google Gen AI Python SDK, which lets us call Gemini with a few lines of code
- networkx: for graph management
We will also use the following dependencies:
- pillow: for image management
- matplotlib: for data visualization
- tenacity: for request retry management
%pip install --quiet "google-genai>=1.38.0" "networkx[default]"
🤖 Gen AI SDK
Create a google.genai client:
from google import genai
check_environment()
client = genai.Client()
Check your configuration:
check_configuration(client)
Using the Vertex AI API with project "…" in location "global"
🧠 Gemini model
In this challenge, we will use the latest Gemini 2.5 Flash Image model (currently in preview):
GEMINI_2_5_FLASH_IMAGE = "gemini-2.5-flash-image-preview"
💡 “Gemini 2.5 Flash Image” is also known as “Nano Banana” 🍌
🛠️ Helpers
Let's define dedicated helper functions to generate and display images: 🔽
import IPython.display
import tenacity
from google.genai.errors import ClientError
from google.genai.types import GenerateContentConfig, PIL_Image
GEMINI_2_5_FLASH_IMAGE = "gemini-2.5-flash-image-preview"
GENERATION_CONFIG = GenerateContentConfig(response_modalities=["TEXT", "IMAGE"])
def generate_content(sources: list[PIL_Image], prompt: str) -> PIL_Image | None:
prompt = prompt.strip()
contents = [*sources, prompt] if sources else prompt
response = None
for attempt in get_retrier():
with attempt:
response = client.models.generate_content(
model=GEMINI_2_5_FLASH_IMAGE,
contents=contents,
config=GENERATION_CONFIG,
)
if not response or not response.candidates:
return None
if not (content := response.candidates[0].content):
return None
if not (parts := content.parts):
return None
image: PIL_Image | None = None
for part in parts:
if part.text:
display_markdown(part.text)
continue
assert (sdk_image := part.as_image())
assert (image := sdk_image._pil_image)
display_image(image)
return image
def get_retrier() -> tenacity.Retrying:
return tenacity.Retrying(
stop=tenacity.stop_after_attempt(7),
wait=tenacity.wait_incrementing(start=10, increment=1),
retry=should_retry_request,
reraise=True,
)
def should_retry_request(retry_state: tenacity.RetryCallState) -> bool:
if not retry_state.outcome:
return False
err = retry_state.outcome.exception()
if not isinstance(err, ClientError):
return False
print(f"❌ ClientError {err.code}: {err.message}")
retry = False
match err.code:
case 400 if err.message is not None and " try again " in err.message:
# Workshop: Cloud Storage accessed for the first time (service agent provisioning)
retry = True
case 429:
# Workshop: temporary project with 1 QPM quota
retry = True
print(f"🔄 Retry: {retry}")
return retry
def display_markdown(markdown: str) -> None:
IPython.display.display(IPython.display.Markdown(markdown))
def display_image(image: PIL_Image) -> None:
IPython.display.display(image)
🖼️ Assets
Let's define classes to manage our character and scene assets:
import enum
from collections.abc import Sequence
from dataclasses import dataclass
class AssetId(enum.StrEnum):
ARCHIVE = "0_archive"
ROBOT = "1_robot"
MOUNTAINS = "2_mountains"
VALLEY = "3_valley"
FOREST = "4_forest"
CLEARING = "5_clearing"
ASCENSION = "6_ascension"
SUMMIT = "7_summit"
BRIDGE = "8_bridge"
HAMMOCK = "9_hammock"
@dataclass
class Asset:
id: str
source_ids: Sequence[str]
prompt: str
pil_image: PIL_Image
class Assets(dict[str, Asset]):
def set_asset(self, asset: Asset) -> None:
# Note: This replaces any existing asset (if needed, add guardrails to auto-save|keep all versions)
self[asset.id] = asset
def generate_image(source_ids: Sequence[str], prompt: str, new_id: str = "") -> None:
sources = [assets[source_id].pil_image for source_id in source_ids]
prompt = prompt.strip()
image = generate_content(sources, prompt)
if image and new_id:
assets.set_asset(Asset(new_id, source_ids, prompt, image))
assets = Assets()
📦 Reference archive
Now we can download our reference archive image and make it our first asset: 🔽
import urllib.request
import PIL.Image
import PIL.ImageOps
ARCHIVE_URL = "
def load_archive() -> None:
image = get_image_from_url(ARCHIVE_URL)
# Keep original details in 16:9 landscape aspect ratio (arbitrary)
image = crop_expand_if_needed(image, 1344, 768)
assets.set_asset(Asset(AssetId.ARCHIVE, [], "", image))
display_image(image)
def get_image_from_url(image_url: str) -> PIL_Image:
with urllib.request.urlopen(image_url) as response:
return PIL.Image.open(response)
def crop_expand_if_needed(image: PIL_Image, dst_w: int, dst_h: int) -> PIL_Image:
src_w, src_h = image.size
if dst_w < src_w or dst_h < src_h:
crop_l, crop_t = (src_w - dst_w) // 2, (src_h - dst_h) // 2
image = image.crop((crop_l, crop_t, crop_l + dst_w, crop_t + dst_h))
src_w, src_h = image.size
if src_w < dst_w or src_h < dst_h:
off_l, off_t = (dst_w - src_w) // 2, (dst_h - src_h) // 2
borders = (off_l, off_t, dst_w - src_w - off_l, dst_h - src_h - off_t)
image = PIL.ImageOps.expand(image, borders, fill="white")
assert image.size == (dst_w, dst_h)
return image
load_archive()
💡 Gemini tends to generate images matching the aspect ratio of the last input image. Consequently, we crop/expand the archive image to 1344 × 768 pixels (close to 16:9) to preserve the original details (no downscaling) and to keep the same resolution in all our upcoming scenes. Gemini natively generates 1024 × 1024 images (1:1) but also supports 16:9, 9:16, 4:3, and 3:4 equivalents (for the same number of tokens).
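As a small sketch (not used in the rest of this article), the same helper could prepare a 4:3 variant of the archive, which would steer subsequent generations toward that aspect ratio:
# Assumption: load_archive() has already been run, so the archive asset exists
archive_4_3 = crop_expand_if_needed(assets[AssetId.ARCHIVE].pil_image, 1024, 768)
display_image(archive_4_3)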
This archive image was generated in July 2024 with the beta version of Imagen 3, from a prompt describing a cute little felt toy robot in blue tones on a white background, with soft, handcrafted textures… The result looked really good but, at the time, there was no editing feature and no character consistency. So this was a nice one-shot generation of a cute little robot that unfortunately seemed unusable…
Let's try to extract our little robot:
source_ids = [AssetId.ARCHIVE]
prompt = "Extract the robot as is, without its shadow, replacing everything with a solid white fill."
generate_image(source_ids, prompt)

⚠️ The robot is properly extracted, but this is essentially a good background removal, something most models can do. This prompt uses wording from graphics software, whereas we can now think in terms of image generation. It is also not ideal to work with the equivalent of a traditional binary mask, as the object's edges and shadows carry important information about shape, texture, position, and lighting.
Let's go back to our archive image, craft a more advanced prompt instead, and generate a character sheet…
🪄 Character sheet
Gemini has spatial understanding, so it can render different views while preserving visual features. Let's generate a front/back character sheet, as our little robot is about to go on a journey, and add a backpack at the same time:
source_ids = [AssetId.ARCHIVE]
prompt = """
- Scene: Robot character sheet.
- Left: Front view of the extracted robot.
- Right: Back view of the extracted robot (seamless back).
- The robot wears the same small, brown-felt backpack, with a tiny polished-brass buckle and simple straps in both views. The backpack straps are visible in both views.
- Background: Pure white.
- Text: On the top, caption the image "ROBOT CHARACTER SHEET" and, on the bottom, caption the views "FRONT VIEW" and "BACK VIEW".
"""
new_id = AssetId.ROBOT
generate_image(source_ids, prompt, new_id)

💡 A few remarks:
- The prompt describes the scene in terms of composition, as is often done in media studios.
- Consecutive generations will fluctuate, while all preserving the robot's features.
- Our prompt specifies a few backpack features, but we will get different backpacks for everything left unspecified.
- For simplicity, we added the backpack directly to the character sheet but, in an actual production pipeline, we might make it a separate, optional asset.
- To further control how the backpack looks, we could also use a reference image and ask to transform the backpack into the reference version (see the sketch after this list).
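Here is a hypothetical sketch of such a backpack restyle; the "backpack_ref" asset id and this prompt are illustrative and were not generated for this article:
# Assumption: a backpack design reference was loaded beforehand as "backpack_ref"
# (e.g. with get_image_from_url() + assets.set_asset())
source_ids = [AssetId.ROBOT, "backpack_ref"]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Backpack design reference.
- Transform the robot's backpack into the backpack from image 2, in both views.
- Keep everything else unchanged.
"""
generate_image(source_ids, prompt, new_id="1_robot_v2")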
This new asset can now serve as a design reference in our future generations.
✨ Initial scene
Let's start with a mountain scene:
source_ids = [AssetId.ROBOT]
prompt = """
- Image 1: Robot character sheet.
- Scene: Macro photography of a beautifully crafted miniature diorama.
- Background: Soft-focus of a panoramic range of interspersed, dome-like felt mountains, in various shades of medium blue/green, with curvy white snowcaps, extending over the entire horizon.
- Foreground: In the bottom-left, the robot stands on the edge of a medium-gray felt cliff, viewed from a 3/4 back angle, looking out over a sea of clouds (made of white cotton).
- Lighting: Studio, clean and soft.
"""
new_id = AssetId.MOUNTAINS
generate_image(source_ids, prompt, new_id)

The mountain shape is described as “dome-like”, so the robot will be able to stand on a summit in one of the upcoming scenes.
It is important to spend some time on this first scene as, through a cascading effect, it will define the look of our whole story. Take the time to analyze it, or iterate a few times to get the best variation.
From now on, the generation inputs will be the character sheet and a reference scene…
✨ Subsequent scenes
Let's have the robot reach a valley:
source_ids = [AssetId.ROBOT, AssetId.MOUNTAINS]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot has descended from the cliff to a gray felt valley. It stands in the center, seen directly from the back. It is holding/reading a felt map with outstretched arms.
- Large smooth, round, felt rocks in various beige/gray shades are visible on the sides.
- Background: The distant mountain range. A thin layer of clouds obscures its base and the end of the valley.
- Lighting: Golden hour light, soft and diffused.
"""
new_id = AssetId.VALLEY
generate_image(source_ids, prompt, new_id)

💡 A few notes:
- The labels provided for our input images ("Image 1:…", "Image 2:…") are important. Without them, "the robot" could refer to any of the 3 robots present in the input images (2 on the character sheet, 1 in the previous scene). With them, we indicate that it is the same robot. In case of ambiguity, we can be even more specific with "the [entity] from image [number]" (a minimal example follows this list).
- On the other hand, since we did not give a precise description of the valley, consecutive requests will yield different, sometimes exciting, creative results (we can pick our favorite or make the prompt more specific to be more deterministic).
- Here, we also specified a new lighting (golden hour instead of studio), which transforms the whole scene.
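For illustration, here is a hypothetical disambiguating prompt using that pattern (not one of the scenes generated in this article):
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot from image 1 stands next to the largest rock from image 2.
"""
generate_image([AssetId.ROBOT, AssetId.VALLEY], prompt)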
Then, we can move on to a forest:
source_ids = [AssetId.ROBOT, AssetId.VALLEY]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot goes on and faces a dense, infinite forest of simple, giant, thin trees, that fills the entire background.
- The trees are made from various shades of light/medium/dark green felt.
- The robot is on the right, viewed from a 3/4 rear angle, no longer holding the map, with both hands clasped to its ears in despair.
- On the left & right bottom sides, rocks (similar to image 2) are partially visible.
"""
new_id = AssetId.FOREST
generate_image(source_ids, prompt, new_id)

💡 Of interest:
- We can move the character, change its viewpoint, and even pose its arms to convey emotions.
- The "no longer holding the map" instruction prevents the model from trying to preserve the map from the previous scene (e.g., lying at the robot's feet).
- We did not give any lighting information: the lighting, quality, and color grading are preserved from the previous scene.
Let's get past the forest:
source_ids = [AssetId.ROBOT, AssetId.FOREST]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot goes through the dense forest and emerges into a clearing, pushing aside two tree trunks.
- The robot is in the center, now seen from the front view.
- The ground is made of green felt, with flat patches of white felt snow. Rocks are no longer visible.
"""
new_id = AssetId.CLEARING
generate_image(source_ids, prompt, new_id)

We changed the ground but did not give any other indication for the background: the model generally preserves the look of the trees.
Now that this detour is over, let's head to the mountains, reusing the initial scene as our scene reference:
source_ids = [AssetId.ROBOT, AssetId.MOUNTAINS]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- Close-up of the robot now climbing the peak of a medium-green mountain and reaching its summit.
- The mountain is right in the center, with the robot on its left slope, viewed from a 3/4 rear angle.
- The robot has both feet on the mountain and is using two felt ice axes (brown handles, gray heads), reaching the snowcap.
- Horizon: The distant mountain range.
"""
new_id = AssetId.ASCENSION
generate_image(source_ids, prompt, new_id)

The mountain is now seen up close, extrapolated from what was only a soft-focus background.
Let's climb up to the summit:
source_ids = [AssetId.ROBOT, AssetId.ASCENSION]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot reaches the top and stands on the summit, seen in the front view, in close-up.
- It is no longer holding the ice axes, which are planted upright in the snow on each side.
- It has both arms raised in sign of victory.
"""
new_id = AssetId.SUMMIT
generate_image(source_ids, prompt, new_id)

💡 This is a consistent follow-up to the previous scene, with a nice, different viewpoint.
Now, let's try something different and recompose the scene:
source_ids = [AssetId.ROBOT, AssetId.SUMMIT]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- Remove the ice axes.
- Move the center mountain to the left edge of the image and add a slightly taller medium-blue mountain to the right edge.
- Suspend a stylized felt bridge between the two mountains: Its deck is made of thick felt planks in various wood shades.
- Place the robot on the center of the bridge with one arm pointing toward the blue mountain.
- View: Close-up.
"""
new_id = AssetId.BRIDGE
generate_image(source_ids, prompt, new_id)

💡 Of interest:
- This prompt recomposes the scene through a series of actions (imperative instructions). This can sometimes be easier than describing everything.
- The new mountain is added as instructed, and the two mountains are distinct yet consistent.
- The bridge is anchored to the slopes and seems to obey the laws of physics.
- The "Remove the ice axes" instruction is there for a reason: without it, they would likely be preserved from the previous scene.
- It is also possible to get the robot to walk across the bridge, seen from the side (a view that has never been generated so far), but it is then hard to make it consistently travel from left to right. Adding left and right views to the character sheet should fix this (a possible prompt is sketched after this list).
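Here is a possible (untested) character sheet prompt with the additional side views; the asset id is illustrative:
source_ids = [AssetId.ARCHIVE]
prompt = """
- Scene: Robot character sheet, 2x2 grid on a pure white background.
- Top-left: Front view of the extracted robot. Top-right: Back view.
- Bottom-left: Left-side view. Bottom-right: Right-side view.
- The robot wears the same small, brown-felt backpack in all views.
- Text: Caption the views "FRONT VIEW", "BACK VIEW", "LEFT VIEW", and "RIGHT VIEW".
"""
generate_image(source_ids, prompt, new_id="1_robot_4_views")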
Let's generate the last scene and let the robot enjoy a well-deserved rest:
source_ids = [AssetId.ROBOT, AssetId.BRIDGE]
prompt = """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- The robot is sleeping peacefully (both eyes changed into a "closed" state), in a comfortable brown-and-tan tartan hammock that has replaced the bridge.
"""
new_id = AssetId.HAMMOCK
generate_image(source_ids, prompt, new_id)

💡 Of interest:
- This prompt is again imperative, and this time the changes also apply to the character itself.
- The bridge-to-hammock transformation is quite fun and keeps the attachment points on the mountains.
- The robot's transformation is impressive, as it had never been seen in this position before.
- The closed eyes are the hardest instruction to get applied consistently (it may take a few attempts), perhaps because we requested many different changes at once (and diluted the model's attention). For full control and more deterministic results, we can spread the important changes over iterative steps, or create additional character sheets beforehand (see the two-step sketch after this list).
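For instance, here is an untested sketch that splits the transformation into two smaller generations; the intermediate asset id is illustrative:
# Step 1: recompose the scene only (bridge replaced by a hammock)
generate_image(
    [AssetId.ROBOT, AssetId.BRIDGE],
    """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- Replace the bridge with a comfortable brown-and-tan tartan hammock, with the robot lying in it.
""",
    new_id="9a_hammock_draft",
)
# Step 2: change the character only (eyes closed)
generate_image(
    [AssetId.ROBOT, "9a_hammock_draft"],
    """
- Image 1: Robot character sheet.
- Image 2: Previous scene.
- Same scene, with both of the robot's eyes now changed into a "closed" state.
""",
    new_id=AssetId.HAMMOCK,
)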
We illustrated our story with nine new, consistent images! Let's take a step back to visualize what we have done…
🗺️ Visualization
We now have a collection of images, from the archive to the newly generated assets.
Let's add a dedicated data visualization to get a better idea of the completed steps…
🔗 Directed graph
Our new assets are all related, connected by one or more "generated from" links. From a data perspective, this is a directed graph.
We can build the corresponding directed graph using the networkx library:
import networkx as nx
def build_graph(assets: Assets) -> nx.DiGraph:
graph = nx.DiGraph(assets=assets)
# Nodes
for asset in assets.values():
graph.add_node(asset.id, asset=asset)
# Edges
for asset in assets.values():
for source_id in asset.source_ids:
graph.add_edge(source_id, asset.id)
return graph
asset_graph = build_graph(assets)
print(asset_graph)
DiGraph with 10 nodes and 16 edges
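As a quick illustration of what the directed graph gives us, we can trace the provenance of any asset, that is, all the images it was directly or indirectly generated from:
print(sorted(nx.ancestors(asset_graph, AssetId.HAMMOCK)))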
Let's place the most used asset in the center and display the other assets around it: 🔽
import matplotlib.pyplot as plt
def display_basic_graph(graph: nx.Graph) -> None:
pos = compute_node_positions(graph)
color = "#4285F4"
options = dict(
node_color=color,
edge_color=color,
arrowstyle="wedge",
with_labels=True,
font_size="small",
bbox=dict(ec="black", fc="white", alpha=0.7),
)
nx.draw(graph, pos, **options)
plt.show()
def compute_node_positions(graph: nx.Graph) -> dict[str, tuple[float, float]]:
# Put the most connected node in the center
center_node = most_connected_node(graph)
edge_nodes = set(graph) - {center_node}
pos = nx.circular_layout(graph.subgraph(edge_nodes))
pos[center_node] = (0.0, 0.0)
return pos
def most_connected_node(graph: nx.Graph) -> str:
if not graph.nodes():
return ""
centrality_by_id = nx.degree_centrality(graph)
return max(centrality_by_id, key=lambda s: centrality_by_id.get(s, 0.0))
display_basic_graph(asset_graph)

That's a nice summary of our different steps. It would be even better if we could see our actual assets…
🌟 Asset graph
Let's write a custom matplotlib function that renders each graph node with its corresponding asset: 🔽
import typing
from collections.abc import Iterator
from io import BytesIO
from pathlib import Path
import PIL.Image
import PIL.ImageDraw
from google.genai.types import PIL_Image
from matplotlib.axes import Axes
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure
from matplotlib.image import AxesImage
from matplotlib.patches import Patch
from matplotlib.text import Annotation
from matplotlib.transforms import Bbox, TransformedBbox
@enum.unique
class ImageFormat(enum.StrEnum):
# Matches PIL.Image.Image.format
WEBP = enum.auto()
PNG = enum.auto()
GIF = enum.auto()
def yield_generation_graph_frames(
graph: nx.DiGraph,
animated: bool,
) -> Iterator[PIL_Image]:
def get_fig_ax() -> tuple[Figure, Axes]:
factor = 1.0
figsize = (16 * factor, 9 * factor)
fig, ax = plt.subplots(figsize=figsize)
fig.tight_layout(pad=3)
handles = [
Patch(color=COL_OLD, label="Archive"),
Patch(color=COL_NEW, label="Generated"),
]
ax.legend(handles=handles, loc="lower right")
ax.set_axis_off()
return fig, ax
def prepare_graph() -> None:
arrows = nx.draw_networkx_edges(graph, pos, ax=ax)
for arrow in arrows:
arrow.set_visible(False)
def get_box_size() -> tuple[float, float]:
xlim_l, xlim_r = ax.get_xlim()
ylim_t, ylim_b = ax.get_ylim()
factor = 0.08
box_w = (xlim_r - xlim_l) * factor
box_h = (ylim_b - ylim_t) * factor
return box_w, box_h
def add_axes() -> Axes:
xf, yf = tr_figure(pos[node])
xa, ya = tr_axes([xf, yf])
x_y_w_h = (xa - box_w / 2.0, ya - box_h / 2.0, box_w, box_h)
a = plt.axes(x_y_w_h)
a.set_title(
asset.id,
loc="center",
backgroundcolor="#FFF8",
fontfamily="monospace",
fontsize="small",
)
a.set_axis_off()
return a
def draw_box(color: str, image: bool) -> AxesImage:
if image:
result = pil_image.copy()
else:
result = PIL.Image.new("RGB", image_size, color="white")
xy = ((0, 0), image_size)
# Draw box outline
draw = PIL.ImageDraw.Draw(result)
draw.rounded_rectangle(xy, box_r, outline=color, width=outline_w)
# Make everything outside the box outline transparent
mask = PIL.Image.new("L", image_size, 0)
draw = PIL.ImageDraw.Draw(mask)
draw.rounded_rectangle(xy, box_r, fill=0xFF)
result.putalpha(mask)
return a.imshow(result)
def draw_prompt() -> Annotation:
text = f"Prompt:\n{asset.prompt}"
margin = 2 * outline_w
image_w, image_h = image_size
bbox = Bbox([[0, margin], [image_w - margin, image_h - margin]])
clip_box = TransformedBbox(bbox, a.transData)
return a.annotate(
text,
xy=(0, 0),
xytext=(0.06, 0.5),
xycoords="axes fraction",
textcoords="axes fraction",
verticalalignment="center",
fontfamily="monospace",
fontsize="small",
linespacing=1.3,
annotation_clip=True,
clip_box=clip_box,
)
def draw_edges() -> None:
STYLE_STRAIGHT = "arc3"
STYLE_CURVED = "arc3,rad=0.15"
for parent in graph.predecessors(node):
edge = (parent, node)
color = COL_NEW if assets[parent].prompt else COL_OLD
style = STYLE_STRAIGHT if center_node in edge else STYLE_CURVED
nx.draw_networkx_edges(
graph,
pos,
[edge],
width=2,
edge_color=color,
style="dotted",
ax=ax,
connectionstyle=style,
)
def get_frame() -> PIL_Image:
canvas = typing.cast(FigureCanvasAgg, fig.canvas)
canvas.draw()
image_size = canvas.get_width_height()
image_bytes = canvas.buffer_rgba()
return PIL.Image.frombytes("RGBA", image_size, image_bytes).convert("RGB")
COL_OLD = "#34A853"
COL_NEW = "#4285F4"
assets = graph.graph["assets"]
center_node = most_connected_node(graph)
pos = compute_node_positions(graph)
fig, ax = get_fig_ax()
prepare_graph()
box_w, box_h = get_box_size()
tr_figure = ax.transData.transform # Data → display coords
tr_axes = fig.transFigure.inverted().transform # Display → figure coords
for node, data in graph.nodes(data=True):
if animated:
yield get_frame()
# Edges and sub-plot
asset = data["asset"]
pil_image = asset.pil_image
image_size = pil_image.size
box_r = min(image_size) * 25 / 100 # Radius for rounded rect
outline_w = min(image_size) * 5 // 100
draw_edges()
a = add_axes() # a is used in sub-functions
# Prompt
if animated and asset.prompt:
box = draw_box(COL_NEW, image=False)
prompt = draw_prompt()
yield get_frame()
box.set_visible(False)
prompt.set_visible(False)
# Generated image
color = COL_NEW if asset.prompt else COL_OLD
draw_box(color, image=True)
plt.close()
yield get_frame()
def draw_generation_graph(
graph: nx.DiGraph,
format: ImageFormat,
) -> BytesIO:
frames = list(yield_generation_graph_frames(graph, animated=False))
assert len(frames) == 1
frame = frames[0]
params: dict[str, typing.Any] = dict()
match format:
case ImageFormat.WEBP:
params.update(lossless=True)
image_io = BytesIO()
frame.save(image_io, format, **params)
return image_io
def draw_generation_graph_animation(
graph: nx.DiGraph,
format: ImageFormat,
) -> BytesIO:
frames = list(yield_generation_graph_frames(graph, animated=True))
assert 1 <= len(frames)
if format == ImageFormat.GIF:
# Dither all frames with the same palette to optimize the animation
# The animation is cumulative, so most colors are in the last frame
method = PIL.Image.Quantize.MEDIANCUT
palettized = frames[-1].quantize(method=method)
frames = [frame.quantize(method=method, palette=palettized) for frame in frames]
# The animation will be played in a loop: start cycling with the most complete frame
first_frame = frames[-1]
next_frames = frames[:-1]
INTRO_DURATION = 3000
FRAME_DURATION = 1000
durations = [INTRO_DURATION] + [FRAME_DURATION] * len(next_frames)
params: dict[str, typing.Any] = dict(
save_all=True,
append_images=next_frames,
duration=durations,
loop=0,
)
match format:
case ImageFormat.GIF:
params.update(optimize=False)
case ImageFormat.WEBP:
params.update(lossless=True)
image_io = BytesIO()
first_frame.save(image_io, format, **params)
return image_io
def display_generation_graph(
graph: nx.DiGraph,
format: ImageFormat | None = None,
animated: bool = False,
save_image: bool = False,
) -> None:
if format is None:
format = ImageFormat.WEBP if running_in_colab_env else ImageFormat.PNG
if animated:
image_io = draw_generation_graph_animation(graph, format)
else:
image_io = draw_generation_graph(graph, format)
image_bytes = image_io.getvalue()
IPython.display.display(IPython.display.Image(image_bytes))
if save_image:
stem = "graph_animated" if animated else "graph"
Path(f"./{stem}.{format.value}").write_bytes(image_bytes)
We can now display our generation graph:
display_generation_graph(asset_graph)

Challenge completed
We were able to generate a whole series of new, consistent images with Nano Banana, and we learned a few things along the way:
- Images prove once more that they are worth a thousand words: it is now very easy to generate new images from existing ones and simple prompts.
- We can create or edit images in terms of composition (which makes us all art directors).
- We can use descriptive or imperative instructions.
- The model's spatial understanding enables 3D-consistent changes of viewpoint.
- We can generate text in our outputs (the character sheet captions) and the model takes into account the text in our inputs ("FRONT VIEW" / "BACK VIEW").
- Consistency can be preserved at many different levels: character, scene, texture, lighting, camera angle/type…
- The generation process is still iterative but feels 10x-100x faster for reaching the hoped-for results.
- It is now possible to breathe new life into our archives!
Possible follow-up steps:
- The process we followed is actually a generation pipeline. It could be enhanced with automation (e.g., to replace a node with a variant of interest) or split into separate stages (e.g., …).
- For simplicity and experimentation, the prompts were kept simple. In a production environment, they could follow a structured template with a formal set of parameters.
- We described our scenes as if shot in a photo studio. Almost any other art style should be possible (photorealistic, abstract, 2D…).
- Our assets could be made self-sufficient by embedding the generation prompts in the image metadata (a minimal sketch follows this list). For details, see the "Asset metadata" section in the repository (link below).
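As a minimal sketch of that last idea (the repository's actual "Asset metadata" implementation may differ), the prompt and source ids could be embedded as PNG text metadata:
import PIL.PngImagePlugin

def save_asset_with_metadata(asset: Asset, path: str) -> None:
    # Embed the generation prompt and source ids so the image file is self-describing
    info = PIL.PngImagePlugin.PngInfo()
    info.add_text("prompt", asset.prompt)
    info.add_text("source_ids", ",".join(asset.source_ids))
    asset.pil_image.save(path, format="PNG", pnginfo=info)

save_asset_with_metadata(assets[AssetId.SUMMIT], "7_summit.png")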
As a bonus, let's end with an animated recap of our journey, with the graph building up and showing a glimpse of our prompts:
display_generation_graph(asset_graph, animated=True)

More!
Want to go deeper?
Thanks for reading. Looking forward to the next one!