Image Processing using OpenCV: A Step-by-Step Guide

6 min read · Sep 5, 2024

Image processing is a crucial part of modern fields like AI, computer vision, and robotics. OpenCV, a powerful open-source library, allows developers to handle complex image tasks with ease. This blog walks through essential image processing techniques in Python: reading and displaying images, converting color spaces, resizing, rotating, and blurring, followed by more advanced tasks such as edge detection, thresholding, and contour detection. Each section pairs a short code example with an explanation of the functions it uses.

1. Introduction to OpenCV
2. Reading and Displaying Images
3. Converting Between Color Spaces
4. Resizing an Image
5. Rotating an Image
6. Blurring an Image
7. Edge Detection Using Canny Algorithm
8. Drawing Shapes and Adding Text
9. Thresholding
10. Contours Detection
Conclusion

1. Introduction to OpenCV

OpenCV is a widely used open-source computer vision library that allows developers to manipulate images and video streams with minimal effort. It’s the go-to solution for many tasks such as image recognition, filtering, edge detection, and more. OpenCV is also cross-platform, with interfaces for Python, C++, and Java, which makes it a practical choice for large-scale AI and machine-learning applications.

1.1 Installing OpenCV

Before we start, let’s make sure OpenCV is installed. You can easily install it using pip:

pip install opencv-python

2. Reading and Displaying Images

To get started with image processing, you first need to load and display an image.

2.1 Code Example:

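import cv2

# Read the image from file (imread returns None if the file cannot be read)
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError('Could not read image.jpg')
# Display the image in a window
cv2.imshow('Image', image)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()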

2.2 Explanation:

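cv2.imread(): Reads the image from the specified file and returns it as a NumPy array of shape (height, width, channels), with the channels in BGR order. If the file is missing or cannot be decoded, it returns None instead of raising an error, so check the result before using it.
cv2.imshow(): Displays the image in a window. The first argument is the window name, and the second is the image array.
cv2.waitKey(0): Waits indefinitely for a key press. A positive argument would instead wait that many milliseconds.
cv2.destroyAllWindows(): Closes all windows that OpenCV opened, so nothing is left on screen when the program exits.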

3. Converting Between Color Spaces

OpenCV loads images in BGR format (Blue, Green, Red). However, other color spaces, such as grayscale, are often useful. Converting between these color spaces is one of the most common operations in image processing.

3.1 Code Example: Convert an image to grayscale

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Convert to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Save the grayscale image
cv2.imwrite('gray_image.jpg', gray_image)

3.2 Explanation:

cv2.cvtColor(): Converts an image from one color space to another. Here, cv2.COLOR_BGR2GRAY converts the three-channel BGR image into a single-channel grayscale image.

4. Resizing an Image

Sometimes, it’s essential to resize an image, either for storage reasons or to standardize dimensions for machine learning models.

4.1 Code Example: Resize the image

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Resize the image to 300x200
resized_image = cv2.resize(image, (300, 200))
# Save the resized image
cv2.imwrite('resized_image.jpg', resized_image)

4.2 Explanation:

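cv2.resize(): Resizes the input image to the given size, passed as (width, height). Note that this is the reverse of NumPy's image.shape, which is ordered (height, width). In this example, the image is resized to 300 pixels wide by 200 pixels high, regardless of its original aspect ratio.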

5. Rotating an Image

Rotation is another crucial aspect of image manipulation, especially in tasks such as object detection and image alignment.

5.1 Code Example: Rotate the image by 90 degrees

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Get the image's height and width
(h, w) = image.shape[:2]
# Define the center of the image
center = (w // 2, h // 2)
# Rotation matrix
matrix = cv2.getRotationMatrix2D(center, 90, 1.0)
# Perform the rotation
rotated_image = cv2.warpAffine(image, matrix, (w, h))
# Save the rotated image
cv2.imwrite('rotated_image.jpg', rotated_image)

5.2 Explanation:

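cv2.getRotationMatrix2D(): Creates a 2x3 matrix that describes the rotation. It takes the center, the angle of rotation in degrees (positive values rotate counterclockwise), and the scaling factor.
cv2.warpAffine(): Applies the transformation matrix to the image. The third argument is the output size as (width, height). Because the example keeps the original size, rotating a non-square image by 90 degrees crops whatever falls outside the original frame.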

6. Blurring an Image

Blurring smooths an image and reduces noise, which makes it a common preprocessing step before object detection and feature extraction.

6.1 Code Example: Applying Gaussian Blur

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Apply Gaussian Blur
blurred_image = cv2.GaussianBlur(image, (15, 15), 0)
# Save the blurred image
cv2.imwrite('blurred_image.jpg', blurred_image)

6.2 Explanation:

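cv2.GaussianBlur(): Blurs the image using a Gaussian filter. The second argument is the kernel size as (width, height), and both values must be odd. The third argument is the standard deviation in the X direction; passing 0 tells OpenCV to compute it from the kernel size.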

7. Edge Detection Using Canny Algorithm

Edge detection is a key feature in many computer vision applications, such as object detection and recognition. OpenCV provides the Canny edge detection algorithm for this purpose.

7.1 Code Example: Detect edges using the Canny algorithm

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Convert to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply Canny edge detector
edges = cv2.Canny(gray_image, 100, 200)
# Save the edges image
cv2.imwrite('edges.jpg', edges)

7.2 Explanation:

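cv2.Canny(): Detects edges in the input image and returns a single-channel image in which edge pixels are 255 and all others are 0. The two numeric parameters are the lower and upper thresholds for hysteresis: gradients above the upper threshold are accepted as edges, gradients below the lower threshold are rejected, and those in between are kept only if they connect to an accepted edge.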

8. Drawing Shapes and Adding Text

Sometimes, you may want to draw shapes or add text to an image for annotation purposes.

8.1 Code Example: Draw a rectangle and add text

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Draw a rectangle
cv2.rectangle(image, (50, 50), (200, 200), (255, 0, 0), 3)
# Add text to the image
cv2.putText(image, 'OpenCV', (60, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
# Save the output image
cv2.imwrite('output_image.jpg', image)

8.2 Explanation:

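cv2.rectangle(): Draws a rectangle directly on the image. You specify the top-left and bottom-right points, the color in BGR order (so (255, 0, 0) is blue), and the line thickness in pixels.
cv2.putText(): Adds text to the image. The arguments are the text, the position of its bottom-left corner, the font, the font scale, the color, and the thickness.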

9. Thresholding

Thresholding is a method of binarizing images. It’s commonly used in image segmentation, where we want to separate objects from the background.

9.1 Code Example: Apply binary thresholding

import cv2

# Load the image directly as grayscale
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
# Apply thresholding
_, thresh_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# Save the thresholded image
cv2.imwrite('threshold_image.jpg', thresh_image)

9.2 Explanation:

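cv2.threshold(): Converts a grayscale image into a binary one, based on a threshold value. With cv2.THRESH_BINARY, pixels greater than the threshold (127 here) are set to the maximum value (255) and all others to 0. The function returns two values: the threshold that was used and the thresholded image.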

10. Contours Detection

Contours represent the boundaries of objects in an image. This technique is often used in object detection and shape analysis.

10.1 Code Example: Detect contours and draw them

import cv2

# Load the image
image = cv2.imread('image.jpg')
# Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Find edges
edges = cv2.Canny(gray_image, 100, 200)
# Find contours
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Draw contours on the original image
cv2.drawContours(image, contours, -1, (0, 255, 0), 2)
# Save the output image with contours
cv2.imwrite('contours_image.jpg', image)

10.2 Explanation:

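cv2.findContours(): Finds contours in a binary image, here the output of the Canny detector. cv2.RETR_EXTERNAL retrieves only the outermost contours, and cv2.CHAIN_APPROX_SIMPLE stores only the end points of straight segments to save memory. In OpenCV 4.x the function returns the contours and their hierarchy; OpenCV 3.x returns three values, so the unpacking in the example would need adjusting there.
cv2.drawContours(): Draws the contours on the original image. Passing -1 as the contour index draws all of them.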

Conclusion

OpenCV is an incredibly versatile library that offers numerous tools for image processing. These tools range from basic tasks such as image resizing and rotation to more advanced operations like edge detection and contour analysis. Whether you’re creating an AI model or simply exploring computer vision, OpenCV is a strong first choice for manipulating and analyzing images.