The Best New Way to Create Consistent Characters In Stable Diffusion

12 Jan 2024 · 03:12

TLDR: The video outlines a step-by-step guide to creating consistent character images using ControlNet and Face ID IP adapters. It instructs viewers to update their extensions, download specific models, and use the Realistic Vision checkpoint with a simple prompt to generate an image of a girl in a yellow shirt. The tutorial then explains how to change the character's clothing and surroundings, such as armor in front of a castle or a blue dress in a forest, while keeping the face consistent across scenes. The video closes by inviting viewers to like and subscribe.


  • 🎨 Preparing for character creation involves updating the ControlNet extension to the latest version.
  • 🔗 Downloading the Face ID IP adapters and placing them in the ControlNet models folder is essential for character consistency.
  • 📂 The accompanying Face ID LoRA files go in the web UI's Lora models folder.
  • 🔄 Restarting Stable Diffusion is necessary after setting up the extensions.
  • 🖌️ A simple prompt like 'a girl, yellow shirt, smiling, masterpiece, best quality' is enough to produce a realistic image.
  • 🌟 The 'Realistic Vision' checkpoint is the one used for creating the character.
  • 🔧 ComfyUI allows customization of the character's features that is not available in Automatic 1111.
  • 🖼️ The first ControlNet unit takes the character's face picture, with the control type set to Face ID Plus.
  • 🔄 Adjusting the control weight helps achieve the desired strength of the facial features.
  • 👗 Changing the character's clothing and background, such as to armor in front of a castle, preserves consistency.
  • 💃 Gesture can be controlled through a second ControlNet unit with a DW OpenPose pre-processor and different pictures for various poses.
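The download-and-place steps in the list above can be sketched as a short script. This is a minimal sketch, not code from the video: the web UI root, the download folder, and the exact filenames are assumptions for a typical Automatic1111 install with the sd-webui-controlnet extension, so adjust them to match your setup.

```python
# Sketch: move downloaded IP-Adapter Face ID files into the folders a
# typical Automatic1111 install expects. All paths and filenames below
# are assumptions -- adjust them to your own install and downloads.
from pathlib import Path
import shutil

WEBUI = Path("stable-diffusion-webui")   # assumed A1111 root
DOWNLOADS = Path("downloads")            # assumed download location

# Adapter weights go to the ControlNet extension's models folder;
# the companion Face ID LoRA goes to the regular Lora folder.
targets = {
    "ip-adapter-faceid-plus_sd15.bin":
        WEBUI / "extensions" / "sd-webui-controlnet" / "models",
    "ip-adapter-faceid-plus_sd15_lora.safetensors":
        WEBUI / "models" / "Lora",
}

for name, dest in targets.items():
    src = DOWNLOADS / name
    if src.exists():
        dest.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest / name)
        print(f"placed {name} -> {dest}")
    else:
        print(f"not found, download first: {name}")
```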

Q & A

  • What is the main topic of the video?

    -The main topic of the video is creating consistent characters in Stable Diffusion using ControlNet.

  • What is the first step the video suggests to prepare for character creation?

    -The first step is to update the ControlNet extension to the latest version and download the specific IP adapters called Face ID.

  • Where should the downloaded Face ID adapters be placed?

    -The Face ID adapters should be placed in the ControlNet models folder under the web UI's extensions directory.

  • What is the name of the checkpoint used in the video for realistic images?

    -The checkpoint used in the video is called 'Realistic Vision'.

  • How does the video creator describe the initial prompt used for generating the character's face?

    -The initial prompt is described as very simple: 'a girl, yellow shirt, smiling, masterpiece, best quality'.

  • What happens if the pre-processor and the model do not match in the control net?

    -If the pre-processor and the model do not match, ControlNet will not work properly and the desired output will not be achieved.
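One quick way to avoid that failure mode is to check the pairing before generating. The sketch below is an illustration under assumptions: the preprocessor and model names are examples, not an exhaustive list of what the ControlNet extension ships.

```python
# Sketch: guard against a preprocessor/model mismatch in a ControlNet
# unit. The name pairs below are illustrative assumptions only.
FACE_ID_PAIRS = {
    "ip-adapter_face_id": "ip-adapter-faceid_sd15",
    "ip-adapter_face_id_plus": "ip-adapter-faceid-plus_sd15",
}

def check_unit(preprocessor: str, model: str) -> bool:
    """Return True when the unit's preprocessor and model names match."""
    expected = FACE_ID_PAIRS.get(preprocessor)
    return expected is not None and model.startswith(expected)

print(check_unit("ip-adapter_face_id_plus",
                 "ip-adapter-faceid-plus_sd15 [abc123]"))  # True
print(check_unit("ip-adapter_face_id_plus",
                 "ip-adapter-faceid_sd15 [def456]"))       # False
```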

  • How does the video demonstrate changing the character's appearance?

    -The video demonstrates changing the character's appearance by adjusting the ControlNet settings, such as the control weight that sets the intensity of the facial features, and by changing the clothes and background.

  • What is the purpose of the second control net in the video?

    -The purpose of the second ControlNet unit is to control the character's gesture by supplying a pose picture and selecting an appropriate pre-processor such as DW OpenPose.
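Put together, the two-unit setup these answers describe can be sketched as a txt2img request body for the sd-webui-controlnet API (Automatic1111 started with --api). The field layout follows that extension's API; the model names and weights are assumptions to adjust for your install.

```python
# Sketch: txt2img payload with two ControlNet units -- unit 0 locks the
# face (Face ID Plus), unit 1 sets the pose (DW OpenPose). Model names
# and weights are assumptions; adjust to what your install reports.
import base64
import json

def b64_image(path: str) -> str:
    """Encode a reference image for the API's JSON payload."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a girl, yellow shirt, smiling, masterpiece, best quality",
    "steps": 25,
    "alwayson_scripts": {
        "controlnet": {
            "args": [
                {   # unit 0: keep the character's face consistent
                    "enabled": True,
                    "module": "ip-adapter_face_id_plus",
                    "model": "ip-adapter-faceid-plus_sd15",
                    "weight": 0.8,
                    # "image": b64_image("face.png"),
                },
                {   # unit 1: control the gesture from a pose picture
                    "enabled": True,
                    "module": "dw_openpose_full",
                    "model": "control_v11p_sd15_openpose",
                    "weight": 1.0,
                    # "image": b64_image("pose.png"),
                },
            ]
        }
    },
}

# POST this to http://127.0.0.1:7860/sdapi/v1/txt2img on a running
# Automatic1111 instance launched with the --api flag.
print(json.dumps(payload, indent=2)[:200])
```

Raising unit 0's weight strengthens the facial resemblance; swapping the pose image in unit 1 changes the gesture while the face stays the same.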

  • What are the key elements of the video that contribute to a consistent character design?

    -The key elements include using the latest version of the ControlNet extension, selecting the right IP adapters, matching the pre-processor with the model, and adjusting settings to achieve the desired character appearance and gesture.

  • How can viewers engage with the video creator's content?

    -Viewers can engage by liking the video and subscribing to the channel for more content.



🎨 Character Consistency in Art with Automatic 1111

The paragraph introduces a method for creating consistent character designs in Automatic 1111. It starts with updating the ControlNet extension and downloading specific IP adapters known as Face ID. The process involves placing these adapters in the web UI's ControlNet models folder and restarting Stable Diffusion. The user is guided to select a specific checkpoint, Realistic Vision, and use a simple prompt to generate a character image. The tutorial then explains how to change the character's clothing and background, such as armor in front of a castle, while maintaining consistency. It also covers controlling the character's gesture through a second ControlNet unit with a DW OpenPose pre-processor. The paragraph concludes with a call to like and subscribe for more content.



💡ControlNet

ControlNet refers to a system used in the video for managing and directing the generation of images with specific characteristics. It is a key component in creating consistent character images, as it allows the user to input a reference image and guide the generation to match certain features. In the context of the video, ControlNet is updated and used with IP adapters called Face ID to keep the character's face consistent across different scenes and outfits.

💡Face ID

Face ID is a term used in the video to describe a specific type of IP adapter that aids in recognizing and generating facial features consistently. It is an essential tool for the process of character creation and image generation, ensuring that the character's face remains the same across various images. The video instructs downloading Face ID adapters and using them within the Control Net to achieve the desired consistency in facial appearance.

💡Web UI Extensions

Web UI Extensions refer to the additional features or tools that are added to a web interface to enhance its functionality. In the context of the video, these extensions are used to manage and control the image generation process, particularly in relation to the character's facial features and overall appearance. The Web UI Extensions are where the downloaded Face ID adapters are pasted, and they play a crucial role in the customization and control of the generated images.

💡Restarting Stable Diffusion

Restarting Stable Diffusion refers to relaunching the web UI after making changes or installing updates so that they take effect. In the video, this step is required after setting up the ControlNet extension and the Face ID adapters: restarting ensures the new extensions are loaded and working correctly before generating images.


💡Diffusion

Diffusion, in the context of the video, refers to a method or process used in image generation, particularly in creating realistic and detailed images from textual descriptions or prompts. It is a technique that allows for the generation of high-quality visual content based on specific inputs, such as the character's appearance and the desired scene. The video mentions using a diffusion checkpoint called 'Realistic Vision' for generating the character's image.

💡The Prompt

The Prompt is the textual description or input used to guide the image generation process. It contains specific details about the desired output, such as the character's appearance, clothing, and the setting of the image. In the video, the prompt is used to create a consistent character image with specific features, such as a girl in a yellow shirt smiling, and it is a crucial element in achieving the intended result.


💡Consistency

Consistency in the video refers to the ability to maintain uniform and predictable characteristics of the generated images, particularly the character's facial features and overall appearance. It is important for creating a cohesive and believable visual narrative, as it allows the viewer to recognize the same character across different scenes and outfits. The video demonstrates techniques and tools, such as ControlNet and Face ID, that are used to ensure consistency in image generation.

💡Gesture Control

Gesture Control refers to the manipulation or direction of a character's body posture, movements, or expressions in the generated images. In the video, it is achieved by using a second Control Net with a specific pre-processor and model, allowing the user to input an image and guide the character's pose and gesture accordingly. This feature enhances the versatility and realism of the generated content by adding dynamic elements to the character's portrayal.

💡ComfyUI

ComfyUI is a node-based user interface for Stable Diffusion. The video mentions it as offering customization of the character's features that is not available in Automatic 1111, the web UI used throughout this tutorial.

💡Image Generation

Image Generation is the process of creating visual content, such as images or pictures, using computational methods and algorithms. In the video, image generation is achieved through a combination of Control Net, Face ID, and diffusion techniques, which work together to produce realistic and detailed images based on the input prompts and settings. The process allows for the creation of consistent character images across various scenes and settings, as well as the control of gestures and poses.

💡Character Creation

Character Creation refers to the process of designing and developing characters for visual narratives, such as stories, games, or animations. In the video, character creation involves using various tools and techniques, like Control Net and Face ID, to generate and maintain consistent character images with specific features, expressions, and outfits. The goal is to create a believable and engaging character that can be used in different contexts while retaining its recognizable identity.


Creating consistent characters in Automatic 1111

Preparation involves updating the ControlNet extension to the latest version

Downloading the Face ID IP adapters and placing them in the ControlNet models folder

Restarting Stable Diffusion with the checkpoint 'Realistic Vision'

Using a simple prompt like 'a girl, yellow shirt, smiling, masterpiece, best quality' to generate the base image

Adjusting the control settings to achieve the desired character look

Selecting the appropriate pre-processor and model to match, such as 'Face ID Plus'

Generating a picture with the character's face using ControlNet

Changing the character's clothing to armor and setting the scene in front of a castle

Controlling the character's gesture with a second ControlNet unit

Choosing a pre-processor like 'DW OpenPose' for gesture control

Experimenting with various clothing options, such as a blue long dress in the forest

Altering the gesture by changing the input picture for the control net

The process allows for maintaining character consistency while changing outfits and poses

The tutorial provides a practical guide for users interested in character creation and customization

The video encourages viewers to like and subscribe for more content