Classical Computer Vision for Sudoku Grid Extraction

With all the AI hype, it sounds like everyone else is using large language models and large vision transformers for every problem in computer vision. Many people see these tools as one-size-fits-all solutions and immediately reach for the latest foundation model instead of understanding the underlying problem they want to solve. But oftentimes, less is more. It is one of the most important lessons I learned as an engineer: do not over-engineer solutions to simple problems.

Let me show you the practical use of some simple computer vision techniques to extract a Sudoku grid from an image, the same kind of techniques that are widely used, for example, in document scanning and receipt processing applications.
Along the way, you will learn some interesting ideas from classical computer vision, different strategies to order polygon points, and why this is related to the combinatorial assignment problem.
Overview
- Approach
- Grayscale
- Edge Detection
- Dilation
- Contour Detection
- Perspective Transformation
- Variant A: Sum/Diff-Based Ordering
- Variant B: The Assignment Problem
- Variant C: Cyclic Sorting with an Anchor
- Applying the Perspective Transformation
- Conclusion
Approach
To find Sudoku grids, I looked at many different approaches, from simple thresholding and line detection to deep-learning-based segmentation or keypoint detection.
Let's state some assumptions to constrain the problem:
- The Sudoku grid is fully visible in the frame with a clear quadrilateral border and solid contrast to the background.
- The surface the Sudoku grid is printed on is assumed to be flat, but it can be captured at an angle and be perspectively distorted or rotated.

I'll show you a simple pipeline of a few processing steps to extract the Sudoku grid boundary. At a high level, the pipeline looks like this:


Grayscale
In this first step we simply convert the input image from its three color channels to a single grayscale image, because we do not need color information to process these images.
import cv2
import numpy as np


def find_sudoku_grid(
    image: np.ndarray,
) -> np.ndarray | None:
    """
    Finds the largest square-like contour in an image, likely the Sudoku grid.
    Returns:
        The contour of the found grid as a numpy array, or None if not found.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Edge Detection
After converting the image to grayscale, we can use the Canny edge detection algorithm to extract edges. This algorithm has two thresholds to choose that determine which pixels are accepted as edges:

In our case of finding a Sudoku grid, we expect very strong edges along the grid lines. We can choose a high upper threshold to reject noise that would otherwise appear in our mask, and a lower threshold that is not too low, to avoid weak edges being connected to our mask.
A blur filter is usually applied to images before edge detection to reduce noise, but in this case the edges are very strong and the noise low, which is why the blurring step is omitted here.
def find_sudoku_grid(
    image: np.ndarray,
    canny_threshold_1: int = 100,
    canny_threshold_2: int = 255,
) -> np.ndarray | None:
    """
    Finds the largest square-like contour in an image, likely the Sudoku grid.
    Args:
        image: The input image.
        canny_threshold_1: Lower threshold for the Canny edge detector.
        canny_threshold_2: Upper threshold for the Canny edge detector.
    Returns:
        The contour of the found grid as a numpy array, or None if not found.
    """
    ...
    canny = cv2.Canny(gray, threshold1=canny_threshold_1, threshold2=canny_threshold_2)

Dilation
In the next step, we post-process the edge detection mask with a morphological dilation to close small gaps in the mask.
def find_sudoku_grid(
    image: np.ndarray,
    canny_threshold_1: int = 100,
    canny_threshold_2: int = 255,
    morph_kernel_size: int = 3,
) -> np.ndarray | None:
    """
    Finds the largest square-like contour in an image, likely the Sudoku grid.
    Args:
        image: The input image.
        canny_threshold_1: First threshold for the Canny edge detector.
        canny_threshold_2: Second threshold for the Canny edge detector.
        morph_kernel_size: Size of the morphological operation kernel.
    Returns:
        The contour of the found grid as a numpy array, or None if not found.
    """
    ...
    kernel = cv2.getStructuringElement(
        shape=cv2.MORPH_RECT, ksize=(morph_kernel_size, morph_kernel_size)
    )
    mask = cv2.morphologyEx(canny, op=cv2.MORPH_DILATE, kernel=kernel, iterations=1)

Contour Detection
Now that the binary mask is ready, we can run a contour finding algorithm to detect the connected blobs and then filter for a single four-point contour.
contours, _ = cv2.findContours(
    mask, mode=cv2.RETR_EXTERNAL, method=cv2.CHAIN_APPROX_SIMPLE
)

This first contour detection step returns a list of polygons containing every pixel that is part of a contour. We can use the Douglas-Peucker algorithm to iteratively reduce the number of points in each contour, approximating the contour with a simpler polygon. We can choose the maximum allowed distance between the approximated polygon and the original contour as a parameter of the algorithm.


If we assume that even for strongly distorted grids the shortest side is at least 10% of the contour's perimeter, we can use this as the approximation distance and then filter for polygons with exactly four corner points.
contour_candidates: list[np.ndarray] = []
for cnt in contours:
    # Approximate the contour to a polygon
    epsilon = 0.1 * cv2.arcLength(curve=cnt, closed=True)
    approx = cv2.approxPolyDP(curve=cnt, epsilon=epsilon, closed=True)
    # Keep only polygons with 4 vertices
    if len(approx) == 4:
        contour_candidates.append(approx)
Finally, we take the largest contour found, which is most likely the Sudoku grid. We sort the candidates by contour area in descending order and take the first element, which corresponds to the largest contour.
best_contour = sorted(contour_candidates, key=cv2.contourArea, reverse=True)[0]

Perspective Transformation
We finally need to warp the detected grid back to a square. To achieve this, we can use a perspective transformation. The transformation matrix can be calculated by defining where the four grid points need to end up: the four corners of the target image.
rect_dst = np.array(
    [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
    dtype=np.float32,  # cv2.getPerspectiveTransform expects float32
)
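The snippet above assumes that `width` and `height` of the target image are already defined. A minimal sketch of one possible choice, a fixed square output size (the value 450 is a hypothetical pick, 50 px per Sudoku cell):

```python
import numpy as np

# Hypothetical output size: a square of 450 x 450 px (9 cells of 50 px each)
side_len = 450
width = height = side_len

rect_dst = np.array(
    [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
    dtype=np.float32,  # cv2.getPerspectiveTransform expects float32 input
)
```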

To match the contour points to these corners, they need to be ordered first, so they can be assigned correctly. Let's define the following order for our contour points:

Variant A: Sum/Diff-Based Ordering
To sort the extracted corners and assign them to their intended target points, a simple algorithm can look at the sums and differences of the x and y coordinates of each corner.
p_sum = p_x + p_y
p_diff = p_x - p_y
Based on these values, the corners can now be distinguished:
- The top-left corner has small x and y values and therefore the smallest sum: argmin(p_sum)
- The bottom-right corner has the largest sum: argmax(p_sum)
- The top-right corner has the largest difference of all corners: argmax(p_diff)
- The bottom-left corner has the smallest difference: argmin(p_diff)
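A quick numeric check of this rule on an upright rectangle (the coordinates are made up for illustration; y grows downwards, as in image coordinates):

```python
import numpy as np

# Corners of an upright rectangle in arbitrary order (x, y), y pointing down
pts = np.array([[90, 10], [10, 10], [90, 70], [10, 70]])

p_sum = pts.sum(axis=1)         # x + y
p_diff = pts[:, 0] - pts[:, 1]  # x - y

top_left = pts[np.argmin(p_sum)]      # [10, 10]
bottom_right = pts[np.argmax(p_sum)]  # [90, 70]
top_right = pts[np.argmax(p_diff)]    # [90, 10]
bottom_left = pts[np.argmin(p_diff)]  # [10, 70]
```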
In the following images, I tried to visualize this for the four corners of a rotating square. The colored lines represent the target image corner assigned to each square corner.

def order_points(pts: np.ndarray) -> np.ndarray:
    """
    Orders the four corner points of a contour in a consistent
    top-left, top-right, bottom-right, bottom-left sequence.
    Args:
        pts: A numpy array of shape (4, 2) representing the four corners.
    Returns:
        A numpy array of shape (4, 2) with the points ordered.
    """
    # Reshape from (4, 1, 2) to (4, 2) if needed
    pts = pts.reshape(4, 2)
    rect = np.zeros((4, 2), dtype=np.float32)
    # The top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    # np.diff computes y - x (the negative of p_diff above), so the
    # top-right point will have the smallest difference, whereas
    # the bottom-left will have the largest difference
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect
This works well unless the rectangle is rotated too far, as in the following case. There you can clearly see that the method breaks down: the same contour corner is assigned to multiple corners of the image.
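To make the failure concrete, here is a small sketch with a square rotated by roughly 45 degrees (the coordinates are invented for illustration). One point wins both the bottom-right slot (largest sum) and the top-right slot (largest difference), so another point is never assigned at all:

```python
import numpy as np

# A square rotated by ~45 degrees (hypothetical coordinates)
pts = np.array([[50, 0], [101, 50], [50, 100], [0, 49]])

p_sum = pts.sum(axis=1)         # [ 50, 151, 150,  49]
p_diff = pts[:, 0] - pts[:, 1]  # [ 50,  51, -50, -49]

# The same point [101, 50] has both the largest sum and the largest
# difference, so the sum/diff ordering is broken for this shape
assert np.argmax(p_sum) == np.argmax(p_diff)
```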

Variant B: The Assignment Problem
Another approach is to minimize the total distance between each point and its assigned corner. This can be implemented using the pairwise_distances function from scikit-learn to compute the distance between each point and each target corner, and the linear_sum_assignment function from SciPy, which solves the assignment problem by minimizing the total cost.
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import pairwise_distances


def order_points_simplified(
    pts: np.ndarray, target_corners: np.ndarray
) -> np.ndarray:
    """
    Orders a set of points to best match a target set of corner points.
    Args:
        pts: A numpy array of shape (N, 2) representing the points to order.
        target_corners: A numpy array of shape (N, 2) with the target corners.
    Returns:
        A numpy array of shape (N, 2) with the points ordered.
    """
    # Reshape from (N, 1, 2) to (N, 2) if needed
    pts = pts.reshape(-1, 2)
    # Calculate the distance between each point and each target corner
    D = pairwise_distances(pts, target_corners)
    # Find the optimal one-to-one assignment
    # row_ind[i] should be matched with col_ind[i]
    row_ind, col_ind = linear_sum_assignment(D)
    # Create an empty array to hold the sorted points
    ordered_pts = np.zeros_like(pts)
    # Place each point in the correct slot based on the corner it was matched to.
    # For example, the point matched to target_corners[0] goes into ordered_pts[0].
    ordered_pts[col_ind] = pts[row_ind]
    return ordered_pts

While this solution works, it has drawbacks: the result depends on the absolute distances between the points and the corners, and therefore on the image size, and it costs more because the full distance matrix has to be built. Here, in the case of only four points, this can be ignored, but this solution would not scale well for a polygon with many points!
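On the rotated square that broke variant A, the assignment formulation recovers a valid one-to-one ordering. A self-contained sketch (using scipy.optimize.linear_sum_assignment and a NumPy-based distance matrix in place of scikit-learn's pairwise_distances; the coordinates and target corners are invented for illustration):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# The ~45-degree rotated square that breaks the sum/diff heuristic
pts = np.array([[50, 0], [101, 50], [50, 100], [0, 49]], dtype=float)
# Target corners: top-left, top-right, bottom-right, bottom-left
target_corners = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=float)

# Pairwise Euclidean distances: D[i, j] = |pts[i] - target_corners[j]|
D = np.linalg.norm(pts[:, None, :] - target_corners[None, :, :], axis=2)

# Minimize the total distance of the one-to-one assignment
row_ind, col_ind = linear_sum_assignment(D)

ordered_pts = np.zeros_like(pts)
ordered_pts[col_ind] = pts[row_ind]
# Each point now occupies exactly one corner slot:
# top-left=[0, 49], top-right=[50, 0], bottom-right=[101, 50], bottom-left=[50, 100]
```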
Variant C: Cyclic Sorting with an Anchor
This third variant is a very elegant and efficient way to order the contour points. The idea is to calculate the angle of each point relative to the position of the centroid of the polygon.

As angles are cyclic, we need to choose an anchor point to guarantee a unique order of the points. We simply choose the point with the lowest sum of x and y coordinates.
def order_points(pts: np.ndarray) -> np.ndarray:
    """
    Orders points by angle around the centroid, then rotates to start from top-left.
    Args:
        pts: A numpy array of shape (4, 2).
    Returns:
        A numpy array of shape (4, 2) with points ordered.
    """
    pts = pts.reshape(4, 2)
    center = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - center[1], pts[:, 0] - center[0])
    pts_cyclic = pts[np.argsort(angles)]
    sum_of_coords = pts_cyclic.sum(axis=1)
    top_left_idx = np.argmin(sum_of_coords)
    return np.roll(pts_cyclic, -top_left_idx, axis=0)

Now we can use this function to order our contour points:
rect_src = order_points(grid_contour)
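A quick self-contained check of the centroid-angle ordering on the same rotated square used in the earlier examples (the function is re-implemented inline so the snippet runs on its own):

```python
import numpy as np

def order_points_cyclic(pts: np.ndarray) -> np.ndarray:
    # Sort points by angle around the centroid, then rotate the
    # cyclic order so it starts at the top-left point
    pts = pts.reshape(4, 2)
    center = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - center[1], pts[:, 0] - center[0])
    pts_cyclic = pts[np.argsort(angles)]
    top_left_idx = np.argmin(pts_cyclic.sum(axis=1))
    return np.roll(pts_cyclic, -top_left_idx, axis=0)

pts = np.array([[50, 0], [101, 50], [50, 100], [0, 49]])
ordered = order_points_cyclic(pts)
# -> [[0, 49], [50, 0], [101, 50], [50, 100]]
#    top-left, top-right, bottom-right, bottom-left
```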
Applying the Perspective Transformation
Now that we know which point needs to go where, we can finally move on to the most interesting part: computing and applying the perspective transformation to the image.

As we already have our ordered quadrilateral points in rect_src, and our target corner points in rect_dst, we can use OpenCV to calculate the transformation matrix:
warp_mat = cv2.getPerspectiveTransform(rect_src, rect_dst)
The resulting 3×3 warp matrix describes how the perspective view of the grid is mapped to the top-down 2D view. Having found this transformation for our Sudoku grid, we can apply it to extract the grid from our original image:
warped = cv2.warpPerspective(img, warp_mat, (side_len, side_len))
And voilà, we have our perfectly square grid!

Conclusion
In this project we walked through a simple pipeline of classical computer vision techniques to detect Sudoku grids in images. These methods provide a lightweight way to extract Sudoku grids. Of course, due to its simplicity, this approach has limitations: it is tuned to a specific setting and may fail under low contrast or difficult shadows. Using a method based on deep learning can make sense if the detection needs to perform robustly in diverse settings.
Next, a perspective transformation was applied to obtain a top-down view of the grid. This image can now be used for downstream tasks, such as extracting the digits and actually solving the Sudoku. A follow-up article will cover some of these natural next steps of this project.
Check out the project source code below and let me know if you have any questions or thoughts on this work. Until then, happy coding!
For more information and the full implementation, including the code for all figures and visualizations, check out the project code on my GitHub:
All images in this post were created by the author.



