AI Avatar Maker: Transform Your Images into Stunning AI Avatars
In today’s digital world, AI avatars are becoming increasingly popular, especially in media where news anchors and presenters use them to create a unique and engaging experience. But what’s the science behind these avatars, and how can you make one yourself?
Creating an AI avatar from multiple images and videos is a multi-step process. In this guide, I'll walk you through it end to end, from setting up your environment to animating your avatar with a voiceover.
The AI Avatar Creator app features a user-friendly interface that guides you through creating custom avatars. It starts with a straightforward drag-and-drop uploader: users can add multiple images in JPG or PNG format and videos in MP4 or AVI format, each with a 200MB file size limit. Once the media is uploaded, you can animate your avatar with a single click. The app also lets you add a personalized voiceover by entering the desired text, making it a seamless experience to bring your AI avatar to life. Below is a snapshot of the app's interface:
Table of Contents
- Project Structure
- Environment Setup
- Utility Modules
- Main Streamlit App
- Running the App
- Command to run the Streamlit app
- Conclusion
1. Project Structure
The project is organized into a directory with utility functions separated into individual Python files. This modular approach keeps the code clean and manageable.
ai_avatar_app/
│
├── main.py # Main Streamlit app file
├── utils/
│ ├── image_processing.py # Image processing functions
│ ├── video_processing.py # Video processing functions
│ ├── animation.py # Avatar animation functions
│ ├── tts.py # Text-to-speech functions
│ └── __init__.py # Init file for utils module
├── requirements.txt # Required Python libraries
└── README.md # Project documentation
- main.py: The core Streamlit application file.
- utils/: Directory containing utility functions split into different files.
- requirements.txt: A list of dependencies required to run the project.
2. Environment Setup
The requirements.txt file specifies all the libraries and dependencies the project requires, so anyone running it can install the correct packages with pip install -r requirements.txt.
requirements.txt
streamlit
opencv-python
pillow
numpy
dlib
moviepy
pyttsx3
3. Utility Modules
Each utility module handles a specific task in the avatar creation process.
utils/image_processing.py: Handling Image Processing Tasks
This module contains functions to process images, such as extracting faces and creating composite images from multiple faces.
import cv2
import dlib
import numpy as np
from PIL import Image

# dlib's HOG-based frontal face detector
detector = dlib.get_frontal_face_detector()

def extract_face(image_path):
    """Detect the first face in an image and return it as an RGB array."""
    image = cv2.imread(image_path)
    if image is None:
        return None
    # Detection is more reliable on grayscale input
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if len(faces) == 0:
        return None
    face = faces[0]
    x, y, w, h = face.left(), face.top(), face.width(), face.height()
    face_image = image[y:y+h, x:x+w]
    return cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)

def create_composite_image(image_paths):
    """Stack the detected faces side by side into one composite image."""
    faces = []
    for path in image_paths:
        face = extract_face(path)
        if face is not None:
            faces.append(face)
    if not faces:
        return None
    # np.hstack needs equal heights, so scale each crop to the smallest height
    height = min(f.shape[0] for f in faces)
    faces = [cv2.resize(f, (int(f.shape[1] * height / f.shape[0]), height)) for f in faces]
    return Image.fromarray(np.hstack(faces))
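A caveat worth calling out: np.hstack only succeeds when every face crop has the same height. The NumPy-only sketch below (hypothetical helper names, with a nearest-neighbour resize standing in for cv2.resize) shows how crops of different sizes can be normalised to a common height before stacking:

```python
import numpy as np

def resize_to_height(img: np.ndarray, height: int) -> np.ndarray:
    """Nearest-neighbour resize of an H x W x C image to a target height,
    preserving aspect ratio (illustrative stand-in for cv2.resize)."""
    h, w = img.shape[:2]
    new_w = max(1, round(w * height / h))
    rows = np.arange(height) * h // height
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def stackable(faces, height=128):
    """Resize all face crops to a common height so np.hstack succeeds."""
    return [resize_to_height(f, height) for f in faces]

# Two crops of different sizes stack cleanly once their heights match
faces = [np.zeros((100, 80, 3), np.uint8), np.zeros((150, 90, 3), np.uint8)]
composite = np.hstack(stackable(faces))
print(composite.shape)  # (128, 179, 3)
```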
utils/video_processing.py: Extracting Frames from Videos
This module handles video processing tasks like extracting frames from a video file.
import os
from moviepy.editor import VideoFileClip
from PIL import Image

def extract_frames(video_path, output_dir, num_frames=10):
    """Save num_frames evenly spaced frames from the video as JPEGs."""
    clip = VideoFileClip(video_path)
    interval = clip.duration / num_frames
    os.makedirs(output_dir, exist_ok=True)
    for i in range(num_frames):
        frame = clip.get_frame(i * interval)
        Image.fromarray(frame).save(f"{output_dir}/frame_{i}.jpg")
    clip.close()
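The sampling arithmetic in extract_frames is easy to sanity-check in isolation. The sketch below (with a hypothetical frame_times helper, not part of the module above) reproduces it: frame i is taken at i * (duration / num_frames), so samples are evenly spaced and the last one lands just before the end of the clip:

```python
def frame_times(duration: float, num_frames: int = 10) -> list:
    """Evenly spaced sample times, mirroring the loop in extract_frames."""
    interval = duration / num_frames
    return [i * interval for i in range(num_frames)]

print(frame_times(5.0, num_frames=5))  # [0.0, 1.0, 2.0, 3.0, 4.0]
```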
utils/animation.py: Creating Animations from Images
This module creates animations from the processed images.
import cv2
import numpy as np

def animate_avatar(images, loops=50):
    """Cycle through the face images in an OpenCV window; press 'q' to quit."""
    cv2.namedWindow('Avatar Animation', cv2.WINDOW_NORMAL)
    stopped = False
    for _ in range(loops):
        for image in images:
            cv2.imshow('Avatar Animation', image)
            # waitKey doubles as the frame delay (100 ms per frame)
            if cv2.waitKey(100) & 0xFF == ord('q'):
                stopped = True
                break
        if stopped:
            break
    cv2.destroyAllWindows()

def prepare_images_for_animation(face_images):
    """Convert PIL (RGB) images to the BGR arrays OpenCV expects."""
    return [cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR) for img in face_images]
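The RGB-to-BGR conversion in prepare_images_for_animation amounts to reversing the channel axis. A minimal NumPy-only check (no OpenCV required; the rgb_to_bgr name is hypothetical) makes the effect concrete:

```python
import numpy as np

def rgb_to_bgr(img: np.ndarray) -> np.ndarray:
    """Reverse the channel axis: equivalent to cv2.cvtColor(img, cv2.COLOR_RGB2BGR)."""
    return img[..., ::-1]

rgb = np.zeros((2, 2, 3), np.uint8)
rgb[..., 0] = 255  # pure red in RGB ordering
bgr = rgb_to_bgr(rgb)
print(bgr[0, 0].tolist())  # [0, 0, 255] -- red lands in the last (R) slot of BGR
```

Applying the conversion twice round-trips back to the original ordering, which is a handy property when debugging color channels.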
utils/tts.py: Generating Text-to-Speech Output
This module handles converting text to speech.
import pyttsx3

def text_to_speech(text, save_path=None):
    """Speak the text aloud, or write it to an audio file if save_path is given."""
    engine = pyttsx3.init()
    if save_path:
        engine.save_to_file(text, save_path)
    else:
        engine.say(text)
    engine.runAndWait()
4. Main Streamlit App
The main.py file is where the Streamlit app is developed. This file creates the user interface and ties together all the utility functions.
main.py: The Primary Interface for Users
import os
import streamlit as st
from PIL import Image
from utils.image_processing import create_composite_image
from utils.video_processing import extract_frames
from utils.animation import animate_avatar, prepare_images_for_animation
from utils.tts import text_to_speech

st.title("AI Avatar Creator")

# Temporary storage for uploaded files
os.makedirs("temp", exist_ok=True)

# Step 1: Image Upload
st.header("Step 1: Upload Images")
uploaded_images = st.file_uploader("Choose images", type=["jpg", "jpeg", "png"], accept_multiple_files=True)

if uploaded_images:
    image_paths = []
    for uploaded_image in uploaded_images:
        image_path = f"temp/{uploaded_image.name}"
        with open(image_path, "wb") as f:
            f.write(uploaded_image.getbuffer())
        image_paths.append(image_path)

    composite_image = create_composite_image(image_paths)
    if composite_image:
        st.image(composite_image, caption="Composite Avatar")
        composite_image.save('composite_avatar.jpg')

# Step 2: Video Upload
st.header("Step 2: Upload Video")
uploaded_video = st.file_uploader("Choose a video", type=["mp4", "avi"])

if uploaded_video:
    video_path = f"temp/{uploaded_video.name}"
    with open(video_path, "wb") as f:
        f.write(uploaded_video.getbuffer())
    frame_dir = "frames"
    extract_frames(video_path, frame_dir)
    st.success(f"Frames extracted to {frame_dir}")

# Step 3: Animate Avatar
st.header("Step 3: Animate Avatar")
if st.button("Animate"):
    if uploaded_images:
        face_images = [Image.open(f'temp/{img.name}') for img in uploaded_images]
        opencv_images = prepare_images_for_animation(face_images)
        animate_avatar(opencv_images)
    else:
        st.warning("Upload images in Step 1 first.")

# Step 4: Add Voiceover
st.header("Step 4: Add Voiceover")
text_input = st.text_area("Enter text for the avatar to speak", value="Hello! I am your AI avatar.")
if st.button("Generate Voiceover"):
    text_to_speech(text_input, 'avatar_voiceover.mp3')
    st.audio('avatar_voiceover.mp3')
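One pattern worth isolating: saving each upload with open(...) assumes the temp/ directory exists. The stdlib-only sketch below (the save_upload name is hypothetical) captures the pattern safely, creating the directory on first use:

```python
import os

def save_upload(name: str, data: bytes, upload_dir: str = "temp") -> str:
    """Persist an uploaded file's bytes to disk, creating the target
    directory on first use (mirrors the per-upload loop in main.py)."""
    os.makedirs(upload_dir, exist_ok=True)
    path = os.path.join(upload_dir, name)
    with open(path, "wb") as f:
        f.write(data)
    return path

path = save_upload("avatar.jpg", b"\xff\xd8\xff")  # fake JPEG header bytes
print(os.path.exists(path))  # True
```

In the Streamlit app, data would come from uploaded_image.getbuffer(); here plain bytes keep the example self-contained.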
5. Running the App
To run the Streamlit app, use the following command:
streamlit run main.py
Conclusion
With this step-by-step guide, you can create your own AI avatar from images and videos. From processing images to animating your avatar with a voiceover, you now have the tools and knowledge to bring your digital creations to life. Whether for fun or professional use, you can build a dynamic, interactive avatar that speaks and animates just like those you see in the media.