AI Avatar Maker: Transform Your Images into Stunning AI Avatars
In today’s digital world, AI avatars are becoming increasingly popular, especially in media where news anchors and presenters use them to create a unique and engaging experience. But what’s the science behind these avatars, and how can you make one yourself?
Creating an AI avatar from multiple images and videos is a multi-step process. In this guide, I'll walk you through it end to end, from setting up your environment to animating your avatar with a voiceover.
The AI Avatar Creator app features a user-friendly interface that guides you through creating custom avatars. It starts with a straightforward drag-and-drop uploader: users can add multiple images in JPG or PNG format and videos in MP4 or AVI format, each with a 200MB file size limit. Once the media is uploaded, you can animate your avatar with a single click. The app also lets you add a personalized voiceover by entering the desired text, making it a seamless experience to bring your AI avatar to life. Below is a snapshot of the app's interface:
Table of Contents
- Project Structure
- Environment Setup
- Utility Modules
- Main Streamlit App
- Running the App
- Command to run the Streamlit app
- Conclusion
1. Project Structure
The project is organized into a directory with utility functions separated into individual Python files. This modular approach keeps the code clean and manageable.
ai_avatar_app/
│
├── main.py # Main Streamlit app file
├── utils/
│ ├── image_processing.py # Image processing functions
│ ├── video_processing.py # Video processing functions
│ ├── animation.py # Avatar animation functions
│ ├── tts.py # Text-to-speech functions
│ └── __init__.py # Init file for utils module
├── requirements.txt # Required Python libraries
└── README.md # Project documentation
- main.py: The core Streamlit application file.
- utils/: Directory containing utility functions split into different files.
- requirements.txt: A list of dependencies required to run the project.
2. Environment Setup
The requirements.txt file specifies all the libraries and dependencies the project requires, so anyone running it can install the correct packages with pip install -r requirements.txt.
requirements.txt
streamlit
opencv-python
pillow
numpy
dlib
moviepy
pyttsx3
3. Utility Modules
Each utility module handles a specific task in the avatar creation process.
utils/image_processing.py: Handling Image Processing Tasks
This module contains functions to process images, such as extracting faces and creating composite images from multiple faces.
import cv2
import dlib
import numpy as np
from PIL import Image

# dlib's HOG-based frontal face detector
detector = dlib.get_frontal_face_detector()

def extract_face(image_path):
    """Detect the first face in an image and return it as an RGB array."""
    image = cv2.imread(image_path)
    if image is None:
        return None
    # Detection is more reliable on grayscale input
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if len(faces) == 0:
        return None
    face = faces[0]
    x, y, w, h = face.left(), face.top(), face.width(), face.height()
    face_image = image[y:y+h, x:x+w]
    return cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)

def create_composite_image(image_paths):
    """Stack the detected faces side by side into one composite image."""
    faces = []
    for path in image_paths:
        face = extract_face(path)
        if face is not None:
            faces.append(face)
    if not faces:
        return None
    # np.hstack needs equal heights, so scale each crop to the smallest height
    height = min(f.shape[0] for f in faces)
    faces = [cv2.resize(f, (int(f.shape[1] * height / f.shape[0]), height)) for f in faces]
    return Image.fromarray(np.hstack(faces))
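A caveat worth calling out: np.hstack only succeeds when every face crop has the same height. The NumPy-only sketch below (hypothetical helper names, with a nearest-neighbour resize standing in for cv2.resize) shows how crops of different sizes can be normalised to a common height before stacking:

```python
import numpy as np

def resize_to_height(img: np.ndarray, height: int) -> np.ndarray:
    """Nearest-neighbour resize of an H x W x C image to a target height,
    preserving aspect ratio (illustrative stand-in for cv2.resize)."""
    h, w = img.shape[:2]
    new_w = max(1, round(w * height / h))
    rows = np.arange(height) * h // height
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def stackable(faces, height=128):
    """Resize all face crops to a common height so np.hstack succeeds."""
    return [resize_to_height(f, height) for f in faces]

# Two crops of different sizes stack cleanly once their heights match
faces = [np.zeros((100, 80, 3), np.uint8), np.zeros((150, 90, 3), np.uint8)]
composite = np.hstack(stackable(faces))
print(composite.shape)  # (128, 179, 3)
```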
utils/video_processing.py: Extracting Frames from Videos
This module handles video processing tasks like extracting frames from a video file.
import os
from moviepy.editor import VideoFileClip
from PIL import Image

def extract_frames(video_path, output_dir, num_frames=10):
    """Save num_frames evenly spaced frames from the video as JPEGs."""
    clip = VideoFileClip(video_path)
    interval = clip.duration / num_frames
    os.makedirs(output_dir, exist_ok=True)
    for i in range(num_frames):
        frame = clip.get_frame(i * interval)
        Image.fromarray(frame).save(f"{output_dir}/frame_{i}.jpg")
    clip.close()
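The sampling arithmetic in extract_frames is easy to sanity-check in isolation. The sketch below (with a hypothetical frame_times helper, not part of the module above) reproduces it: frame i is taken at i * (duration / num_frames), so samples are evenly spaced and the last one lands just before the end of the clip:

```python
def frame_times(duration: float, num_frames: int = 10) -> list:
    """Evenly spaced sample times, mirroring the loop in extract_frames."""
    interval = duration / num_frames
    return [i * interval for i in range(num_frames)]

print(frame_times(5.0, num_frames=5))  # [0.0, 1.0, 2.0, 3.0, 4.0]
```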
utils/animation.py: Creating Animations from Images
This module creates animations from the processed images.
import cv2
import numpy as np

def animate_avatar(images, loops=50):
    """Cycle through the face images in an OpenCV window; press 'q' to quit."""
    cv2.namedWindow('Avatar Animation', cv2.WINDOW_NORMAL)
    stopped = False
    for _ in range(loops):
        for image in images:
            cv2.imshow('Avatar Animation', image)
            # waitKey doubles as the frame delay (100 ms per frame)
            if cv2.waitKey(100) & 0xFF == ord('q'):
                stopped = True
                break
        if stopped:
            break
    cv2.destroyAllWindows()

def prepare_images_for_animation(face_images):
    """Convert PIL (RGB) images to the BGR arrays OpenCV expects."""
    return [cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR) for img in face_images]
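The RGB-to-BGR conversion in prepare_images_for_animation amounts to reversing the channel axis. A minimal NumPy-only check (no OpenCV required; the rgb_to_bgr name is hypothetical) makes the effect concrete:

```python
import numpy as np

def rgb_to_bgr(img: np.ndarray) -> np.ndarray:
    """Reverse the channel axis: equivalent to cv2.cvtColor(img, cv2.COLOR_RGB2BGR)."""
    return img[..., ::-1]

rgb = np.zeros((2, 2, 3), np.uint8)
rgb[..., 0] = 255  # pure red in RGB ordering
bgr = rgb_to_bgr(rgb)
print(bgr[0, 0].tolist())  # [0, 0, 255] -- red lands in the last (R) slot of BGR
```

Applying the conversion twice round-trips back to the original ordering, which is a handy property when debugging color channels.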
utils/tts.py: Generating Text-to-Speech Output
This module handles converting text to speech.
import pyttsx3

def text_to_speech(text, save_path=None):
    """Speak the text aloud, or write it to an audio file if save_path is given."""
    engine = pyttsx3.init()
    if save_path:
        engine.save_to_file(text, save_path)
    else:
        engine.say(text)
    engine.runAndWait()
4. Main Streamlit App
The main.py file is where the Streamlit app is developed. This file creates the user interface and ties together all the utility functions.
main.py: The Primary Interface for Users
import os
import streamlit as st
from PIL import Image
from utils.image_processing import create_composite_image
from utils.video_processing import extract_frames
from utils.animation import animate_avatar, prepare_images_for_animation
from utils.tts import text_to_speech

st.title("AI Avatar Creator")

# Temporary storage for uploaded files
os.makedirs("temp", exist_ok=True)

# Step 1: Image Upload
st.header("Step 1: Upload Images")
uploaded_images = st.file_uploader("Choose images", type=["jpg", "jpeg", "png"], accept_multiple_files=True)

if uploaded_images:
    image_paths = []
    for uploaded_image in uploaded_images:
        image_path = f"temp/{uploaded_image.name}"
        with open(image_path, "wb") as f:
            f.write(uploaded_image.getbuffer())
        image_paths.append(image_path)

    composite_image = create_composite_image(image_paths)
    if composite_image:
        st.image(composite_image, caption="Composite Avatar")
        composite_image.save('composite_avatar.jpg')

# Step 2: Video Upload
st.header("Step 2: Upload Video")
uploaded_video = st.file_uploader("Choose a video", type=["mp4", "avi"])

if uploaded_video:
    video_path = f"temp/{uploaded_video.name}"
    with open(video_path, "wb") as f:
        f.write(uploaded_video.getbuffer())
    frame_dir = "frames"
    extract_frames(video_path, frame_dir)
    st.success(f"Frames extracted to {frame_dir}")

# Step 3: Animate Avatar
st.header("Step 3: Animate Avatar")
if st.button("Animate"):
    if uploaded_images:
        face_images = [Image.open(f'temp/{img.name}') for img in uploaded_images]
        opencv_images = prepare_images_for_animation(face_images)
        animate_avatar(opencv_images)
    else:
        st.warning("Upload images in Step 1 first.")

# Step 4: Add Voiceover
st.header("Step 4: Add Voiceover")
text_input = st.text_area("Enter text for the avatar to speak", value="Hello! I am your AI avatar.")
if st.button("Generate Voiceover"):
    text_to_speech(text_input, 'avatar_voiceover.mp3')
    st.audio('avatar_voiceover.mp3')
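One pattern worth isolating: saving each upload with open(...) assumes the temp/ directory exists. The stdlib-only sketch below (the save_upload name is hypothetical) captures the pattern safely, creating the directory on first use:

```python
import os

def save_upload(name: str, data: bytes, upload_dir: str = "temp") -> str:
    """Persist an uploaded file's bytes to disk, creating the target
    directory on first use (mirrors the per-upload loop in main.py)."""
    os.makedirs(upload_dir, exist_ok=True)
    path = os.path.join(upload_dir, name)
    with open(path, "wb") as f:
        f.write(data)
    return path

path = save_upload("avatar.jpg", b"\xff\xd8\xff")  # fake JPEG header bytes
print(os.path.exists(path))  # True
```

In the Streamlit app, data would come from uploaded_image.getbuffer(); here plain bytes keep the example self-contained.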
5. Running the App
To run the Streamlit app, use the following command:
streamlit run main.py
Conclusion
With this step-by-step guide, you can create your own AI avatar from images and videos. From processing images to animating your avatar with a voiceover, you now have the tools and knowledge to bring your digital creations to life. Whether for fun or professional use, you can build a dynamic, interactive avatar that speaks and animates just like those you see in the media.