InvokeAI - Canvas Fundamentals

24 Sept 2023 · 38:06

TLDR: The video is an in-depth tutorial on using InvokeAI's Unified Canvas for a seamless end-to-end creative workflow. It emphasizes the bounding box as the key to precise image generation, explains how it operates, and demonstrates techniques such as adjusting denoising strength, mask adjustments, and infill methods. The creator also explores advanced features like Scale Before Processing and the coherence pass for higher-quality results. A live demonstration refines an image over multiple iterations, highlighting the flexibility and control the canvas tools offer. The video concludes by encouraging viewers to experiment with the canvas tools so they can communicate their creative intent effectively.

Takeaways


  • 🎨 The unified canvas is a feature that enables end-to-end creative workflows for content generation.
  • πŸ–ΌοΈ The bounding box is a crucial tool in the canvas that controls the area for generating new imagery and content.
  • πŸ”„ Resizing and moving the bounding box allows for better compositions and focused detailing work.
  • πŸ’‘ The AI model's understanding is influenced by the context provided in the initial image, which helps it match the prompt to the generated content.
  • πŸ“Έ Using a mask and high denoising strength can produce detailed results, but it requires adjusting the prompt to fit the focused area.
  • πŸ”„ The canvas offers tools like control net and import image from canvas for easier detail regeneration.
  • 🌐 The system can generate new images or composite new information into selected areas using the bounding box data and mask.
  • πŸ” The 'scale before processing' feature improves the quality of regeneration by focusing on small details at a higher resolution.
  • 🎭 Mask adjustments and coherence pass are essential for seamless compositing of regenerated content back into the original image.
  • πŸ–ŒοΈ Various infill techniques like tile, patch, match, llama, and CV2 are available for extending or changing the composition of an image.
  • πŸš€ Experimentation with the canvas tools is recommended for achieving the best results and understanding the capabilities of the system.

Q & A

  • What is the primary function of the bounding box in the canvas tool?

    -The bounding box in the canvas tool is used to control where and how new imagery and content are generated. It can be moved and resized to create better compositions and focus on specific areas of the image.

  • How does the bounding box limit the AI model's view when working with an image?

    -When the bounding box is resized and focused on a specific area of the image, it limits the AI model's view to that area, causing it to generate content based only on the context provided within the bounding box.

  • What is the purpose of the denoising process in the canvas tool?

    -Denoising is how the AI model turns the initial image into the final result. The initial image supplies context and structural hints, and the denoising strength controls how far the model may depart from it, so the right hints help the model work with the prompt and generate content that matches the user's vision.

  • How can the mask tool be used in conjunction with the bounding box?

    -The mask tool can be used to select a specific area within the bounding box, allowing the system to create a new image using the bounding box data and composite the new information into the selected area, based on the mask settings.

  • What are some of the infill techniques available in the canvas tool?

    -The canvas tool offers several infill techniques, including tile, PatchMatch, LaMa, and CV2. These options let users extend an image or change the aspect ratio or composition of the result.

  • What is the scale before processing feature and how does it improve the quality of regeneration?

    -The scale before processing feature allows users to focus on a small area of details and improve the quality of the regeneration by performing it at a higher resolution. This technique is useful for getting extra details and high-quality background elements.

  • How does the compositing process work with mask adjustments and coherence pass in the canvas tool?

    -The compositing process involves regenerating the area inside the bounding box and pasting it back into the original image based on the mask settings. Mask adjustments allow users to blur the mask when merging new data into the original image, while the coherence pass is a second image-to-image process that clears up any rough edges or seams introduced in the infill or regeneration process.

  • What are some tips for using the canvas tool effectively?

    -Effective use of the canvas tool involves experimenting with different infill techniques, adjusting mask settings, and using the scale before processing feature for detailed areas. It's also important to adjust the prompt to match the focus area when working with smaller generations and to play around with compositing options to achieve the desired result.

  • How can the canvas tool be used for storyboarding or concept art creation?

    -The canvas tool can be used for storyboarding or concept art by drawing on the canvas to guide structure and color, using ControlNet to refine structure and details, and regenerating content to achieve the desired composition and aspect ratio. It allows for iterative refinement and addition of details to create a final image that matches the creator's vision.

  • What is the role of the unified canvas in supporting an end-to-end workflow for creative vision?

    -The unified canvas plays a crucial role in supporting an end-to-end workflow for creative vision by providing a set of tools that allow users to generate, edit, and refine imagery and content. It enables users to control the generation process, adjust compositions, and add details to create high-quality, detailed outputs that align with their creative vision.

Outlines



🎨 Introduction to the Unified Canvas

The video begins with an introduction to the unified canvas, a feature of Invoke AI that supports an end-to-end workflow for realizing creative visions. The canvas is highlighted for its ability to control the generation of new imagery and content through the bounding box. The video creator uses a space fighter image to demonstrate how the bounding box can be manipulated to edit different parts of the image. The importance of the bounding box in controlling the AI's view and understanding of the image is discussed, emphasizing the need to adjust the bounding box for detailed work and better compositions.


🛠️ Utilizing the Bounding Box and Denoising Process

This paragraph delves deeper into the functionality of the bounding box and the denoising process. The creator explains how resizing the bounding box limits what the AI model sees, which affects the quality of the generated imagery. The concept of providing the right context and structural hints through the initial image is introduced to help the AI understand the prompt. The video also explores the impact of high denoising strengths on the generated content and the importance of adjusting the prompt for smaller generations. Additionally, the paragraph discusses the use of ControlNet and the benefits of low denoising strengths for maintaining structure.


🖌️ Canvas Generation Methods and Infill Techniques

The paragraph covers the various generation methods available with the bounding box, such as generating new images or using existing pixel data. It explains how passing in a mask with the brush tool affects the generation process. The video also introduces different infill techniques (tile, PatchMatch, LaMa, and CV2) and how they can be used to achieve specific results. The creator thanks an Invoke AI community member for creating a graph that visualizes these techniques. The paragraph concludes with an explanation of the 'scale before processing' feature, which enhances the quality of detailed areas in the image.


🎭 Advanced Techniques for High-Quality Results

In this paragraph, the focus is on advanced features like mask adjustments and coherence pass, which are crucial for high-quality results. The creator discusses how the compositing process works when regenerating an area or expanding an image within the canvas. The options for the coherence pass, including unmasked, masked, and mask edge, are explained. The paragraph emphasizes the importance of experimenting with these options to understand their impact on the final image. The creator shares their personal preference for the unmasked coherence pass and encourages viewers to explore these features.


🚀 Practical Demonstration of Canvas Features

The creator jumps into a practical demonstration of using the canvas to generate new content. They start by generating an image using text-to-image and then refining it using the canvas. The process of brushing in a general idea, adjusting the bounding box, and using different denoising strengths is shown. The video highlights the iterative process of refining the image, including removing unwanted elements and adding details. The creator also discusses the importance of focusing on specific areas for regeneration and adjusting the prompt accordingly.


🌟 Final Touches and Composition Adjustments

The paragraph focuses on the final stages of the image creation process. The creator discusses the need to adjust the composition and aspect ratio of the image, using the bounding box and delete functions to refine the final look. They explain how to use the patch match infill technique to extend the image and maintain a high denoising strength for new details. The video also touches on the importance of fixing any inconsistencies introduced during the generation process. The creator shares their satisfaction with the final image, which meets their initial vision for a romantic space operatic drama.


🎨 Fun Experimentation and Exploration of Canvas Tools

The video concludes with the creator having fun and generating content using the canvas tools. They share a rough sketch of a mage character and use a ControlNet to refine the structure and details. The creator discusses the impact of prompts on the generation process and demonstrates how to use the eraser tool to infill areas with existing colors from the composition. They explore different infill settings like tile, LaMa, and CV2 to achieve the desired result. The video ends with the creator encouraging viewers to experiment with the canvas tools and share their creations.

Keywords



💡Unified Canvas

The Unified Canvas is a feature that enables an end-to-end workflow for content creation, particularly for generating imagery and visual content. It is a central tool in InvokeAI that allows users to manipulate and refine their creative projects with precision and control. In the video, the canvas is used to demonstrate how to generate and edit images, focusing on operations like bounding box selection, content regeneration, and aspect ratio adjustments.

💡Bounding Box

The Bounding Box is a critical feature within the canvas that allows users to select and manipulate specific areas of an image. It acts as a movable and resizable frame that users can adjust to focus on particular elements within the image. This tool is essential for controlling where new content is generated and how existing content is edited.
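The "limited view" idea can be pictured as a plain crop: only the pixels inside the box are handed to the model. The sketch below is an illustration under simple assumptions (a grayscale image stored as a list of rows, and a hypothetical helper name `crop_to_bounding_box`); it is not how InvokeAI itself is implemented.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (illustrative only)


def crop_to_bounding_box(image: Image, x: int, y: int, width: int, height: int) -> Image:
    """Return only the pixels inside the box; the rest is invisible to the model."""
    img_h = len(image)
    img_w = len(image[0]) if image else 0
    # Clamp the box to the image. This is a simplification: on a real canvas,
    # a box that hangs off the image leaves empty space that must be infilled.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + width), min(img_h, y + height)
    return [row[x0:x1] for row in image[y0:y1]]


# A 4x4 test image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
detail = crop_to_bounding_box(image, x=1, y=1, width=2, height=2)
print(detail)  # [[5, 6], [9, 10]]
```

Because the model receives only `detail`, a prompt that describes the whole scene no longer matches what it sees, which is why the video recommends adjusting the prompt when working on a small region.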

💡Denoising Process

The Denoising Process refers to the AI's ability to interpret and refine the initial image based on the context and structural hints provided. It involves the AI 'squinting' at the image and trying to understand how to align the generated content with the user's prompt. The process is influenced by the quality of the initial image and the context given to the AI, which helps it generate content that matches the user's creative vision.
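One common way to think about denoising strength in image-to-image pipelines is as the fraction of the sampling schedule that gets re-run on the initial image. The sketch below assumes that convention; the function name `steps_to_run` is hypothetical, and the video does not confirm that InvokeAI computes it this way.

```python
def steps_to_run(total_steps: int, denoising_strength: float) -> int:
    """Number of denoising steps applied to the initial image (assumed convention)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0.0 and 1.0")
    # Low strength: few steps, so the initial structure survives.
    # High strength: most steps, so the model is free to invent new detail.
    return min(int(total_steps * denoising_strength), total_steps)


print(steps_to_run(40, 0.25))  # 10
print(steps_to_run(30, 1.0))   # 30
```

Under this reading, a low strength keeps the brushed-in structure largely intact, while a high strength treats the initial image as little more than a loose hint.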

💡ControlNet

A ControlNet is a tool within the canvas that helps maintain the structure and details of an image during the regeneration process. It is used to ensure that the AI model adheres to the initial sketch or outline provided by the user, allowing for more precise control over the final output. The ControlNet can be adjusted to influence the degree of control it exerts over the generated content.


💡Inpainting

Inpainting is a technique used within the canvas to fill in or regenerate missing or selected parts of an image based on the surrounding content and context. This process uses the AI model to generate new content that matches the style, color, and structure of the existing image, effectively 'painting' in the missing areas to create a seamless and coherent final image.


💡Mask

A Mask in the context of the canvas is a tool that allows users to select specific areas of an image for manipulation. It can be used to isolate parts of the image for regeneration, ensuring that only the desired areas are affected. Masks can be adjusted for various purposes, such as changing the compositing process or focusing on detailed work.


💡Compositing

Compositing is the process of combining multiple images or parts of images to create a single, cohesive image. In the canvas, this involves blending the regenerated content with the existing image, taking into account the mask settings and other factors to ensure a seamless integration. The canvas offers several compositing options, such as mask adjustments and coherence pass, to fine-tune the final output.
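The effect of blurring the mask before merging can be sketched as a per-pixel weighted blend: where the mask is 1 the regenerated pixel wins, where it is 0 the original stays, and the blurred edge mixes the two. This is a minimal illustration with a simple box blur and hypothetical helper names (`box_blur`, `composite`); the actual blur and blend used by the canvas may differ.

```python
from typing import List

Grid = List[List[float]]


def box_blur(mask: Grid, radius: int) -> Grid:
    """Soften a 0/1 mask by averaging each cell with neighbours within `radius`."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += mask[ny][nx]
                    count += 1
            out[y][x] = total / count
    return out


def composite(original: Grid, regenerated: Grid, mask: Grid, blur_radius: int = 1) -> Grid:
    """Blend regenerated pixels into the original, weighted by the (blurred) mask."""
    soft = box_blur(mask, blur_radius) if blur_radius > 0 else mask
    return [
        [m * new + (1.0 - m) * old for old, new, m in zip(o_row, n_row, m_row)]
        for o_row, n_row, m_row in zip(original, regenerated, soft)
    ]


original = [[0.0, 0.0, 0.0, 0.0]]
regenerated = [[1.0, 1.0, 1.0, 1.0]]
mask = [[0.0, 0.0, 1.0, 1.0]]  # regenerate only the right half
hard = composite(original, regenerated, mask, blur_radius=0)
soft = composite(original, regenerated, mask, blur_radius=1)
```

With `blur_radius=0` the seam between old and new pixels is a hard step; with a blur the values ramp across the boundary. A coherence pass would then run a second image-to-image generation over the result to clean up whatever seam remains.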

💡Aspect Ratio

Aspect Ratio refers to the proportional relationship between the width and height of an image. In the context of the canvas, adjusting the aspect ratio allows users to change the composition and layout of the image to better fit their creative vision. This can involve cropping, resizing, or repositioning the image to achieve a desired look and balance.
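Extending an image to a new aspect ratio comes down to a little arithmetic: work out the width the target ratio needs at the current height, then decide how much empty space to add on each side for infilling. The sketch below is illustrative only; the helper name `extend_to_aspect` and the rounding to a multiple of 8 are assumptions, not settings taken from the video.

```python
from typing import Tuple


def extend_to_aspect(width: int, height: int, ratio_w: int, ratio_h: int,
                     multiple: int = 8) -> Tuple[int, int, int]:
    """Return (new_width, pad_left, pad_right) for widening to ratio_w:ratio_h."""
    # Smallest integer width that reaches the target ratio at this height.
    needed = -(-height * ratio_w // ratio_h)  # ceiling division
    # Round up to the grid size (assumed here to be a multiple of 8),
    # and never go below the current width.
    new_width = max(width, -(-needed // multiple) * multiple)
    extra = new_width - width
    pad_left = extra // 2
    return new_width, pad_left, extra - pad_left


print(extend_to_aspect(512, 512, 16, 9))  # (912, 200, 200)
```

The padded columns are the empty area that an infill method fills before the region is regenerated.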

💡Infilling Techniques

Infilling Techniques are methods used to fill in empty or selected areas of an image with new content or color data. The canvas offers various infill options, such as tile, PatchMatch, LaMa, and CV2, each designed to achieve different visual effects when generating content. These techniques help users extend or alter their images in a way that complements the existing content.
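As a rough picture of what an infill step does, the sketch below fills every empty pixel by repeating a small tile taken from existing content. It is a deliberately simplified stand-in for a tile-style infill, with a hypothetical function name `tile_infill` and `None` marking empty pixels; the real tile, PatchMatch, LaMa, and CV2 methods are more sophisticated.

```python
from typing import List, Optional


def tile_infill(image: List[List[Optional[int]]], tile: List[List[int]]) -> List[List[int]]:
    """Fill every empty (None) pixel by repeating `tile` across the canvas."""
    th, tw = len(tile), len(tile[0])
    return [
        [px if px is not None else tile[y % th][x % tw] for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]


# Left half holds existing pixels, right half is empty canvas.
image = [
    [1, 2, None, None],
    [3, 4, None, None],
]
tile = [[1, 2], [3, 4]]  # sampled from the existing content
print(tile_infill(image, tile))  # [[1, 2, 1, 2], [3, 4, 3, 4]]
```

The filled pixels are only a starting point: the generation that follows replaces them, and the choice of infill mainly shapes the colors and textures the model starts from.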

💡Scale Before Processing

Scale Before Processing is a feature that allows users to zoom in on a specific area of an image and generate content at a higher resolution before fitting it back into the original image. This technique enhances the quality of the regeneration by providing more detail and clarity for smaller or more intricate parts of the image.
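The idea can be sketched as a size calculation: a small bounding box is enlarged to roughly the pixel area the model works best at, generated at that size, then shrunk back and composited. In the sketch below, the target area of 512×512, the rounding to a multiple of 8, and the name `scaled_size` are all assumptions made for illustration.

```python
import math
from typing import Tuple


def scaled_size(box_w: int, box_h: int, target_area: int = 512 * 512,
                multiple: int = 8) -> Tuple[int, int]:
    """Working resolution for a bounding box, keeping its aspect ratio."""
    # Scale factor that brings the box area up to the target area.
    scale = math.sqrt(target_area / (box_w * box_h))
    scale = max(scale, 1.0)  # only ever enlarge; large boxes stay as they are

    def snap(value: float) -> int:
        # Round to the nearest multiple of `multiple`, never below one unit.
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(box_w), snap(box_h)


print(scaled_size(128, 128))    # (512, 512)
print(scaled_size(1024, 1024))  # (1024, 1024)
```

A 128×128 box around a face would therefore be generated with sixteen times as many pixels before being fitted back into place, which is where the extra detail comes from.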


💡Prompt

A Prompt is a set of instructions or descriptions provided to the AI model to guide the generation of content. In the context of the canvas, prompts play a crucial role in communicating the user's creative intent and influencing the output of the AI. They can be adjusted and refined to better match the desired outcome.

Highlights


The introduction of the bounding box feature in the canvas, which is crucial for controlling the generation of new imagery and content within the tool.

The demonstration of how resizing and moving the bounding box can lead to better compositions and more focused detailing work.

The importance of understanding the AI model's limited view when focusing on specific areas of an image, which affects the quality of generation.

The concept of the denoising process and its role in providing the right context and structural hints for the AI model to work effectively with the initial image.

The use of masks and a high denoising strength to achieve close-to-desired results in smaller, focused areas of the canvas.

The introduction of ControlNet on the canvas, which simplifies the process of regenerating smaller details with more control.

Explanation of the different generation methods available with the bounding box, including new image generation, using existing pixel data, and infilling empty spaces.

The scale before processing feature that enhances the quality of regeneration by focusing on small details at a higher resolution.

The compositing process and mask adjustments, which allow for seamless integration of regenerated content back into the original image.

The coherence pass, a secondary generation process that smooths out rough edges and seams for a more polished final image.

The practical application of the canvas tools demonstrated through a step-by-step creation of a character and background, showcasing the versatility of the canvas.

The recommendation to experiment with different infill techniques such as tile, PatchMatch, LaMa, and CV2 to achieve desired results.

The creative process of refining an image through multiple iterations, adjusting prompts, and using various canvas tools to achieve a high-quality, unique output.

The discussion on the impact of specific words in prompts, such as 'wizard', on the AI model's output and how to adjust prompts for better results.

The encouragement for users to explore and experiment with the entire set of canvas tools to find the best ways to communicate their intent to the AI model.