August 4, 2023, 20:49:11


Recoloring an image

Question

I want to recolor a PNG image (handwritten text written on a tablet) according to a dictionary (see color_change_dict in the code below) that specifies which color should be replaced by which.

It was easy to write code that reads the color of each pixel and recolors the pixel according to the dictionary.

But the output image ended up pixelated (that is, the text has jagged boundaries).

Upon some googling I found that if one changes the colors using a linear function based on the RGB values then jaggedness can be avoided.
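A rough sketch of why a linear (affine) map keeps edges smooth: an anti-aliased edge pixel is a blend of ink and paper color, and an affine map sends that blend to the same blend of the recolored endpoints, so the soft edge survives. The name recolor_affine and the inversion map used here are made up for illustration and are not part of the original code.

```python
import numpy as np

def recolor_affine(pixels, matrix, offset):
    """Apply new = matrix @ old + offset to every RGB pixel at once.

    pixels: uint8 array of shape (height, width, 3).
    """
    out = pixels.astype(np.float32) @ matrix.T + offset
    return out.clip(0, 255).astype(np.uint8)

# Hypothetical example map: invert all three channels (new = 255 - old).
invert_matrix = -np.eye(3, dtype=np.float32)
invert_offset = np.full(3, 255.0, dtype=np.float32)

# One row of pixels: black ink, an anti-aliased edge pixel, white paper.
row = np.array([[[0, 0, 0], [128, 128, 128], [255, 255, 255]]], dtype=np.uint8)
recolored = recolor_affine(row, invert_matrix, invert_offset)
print(recolored)
```

The middle pixel stays halfway between the two recolored endpoints. An exact-match dictionary lookup has no entry for such in-between colors, which is where the jagged boundaries come from.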

A linear function ended up being inadequate for my purpose.

So I resorted to a quadratic function of the RGB values (one degree-2 polynomial in three variables for each color channel), as given in the following code.
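Written out, the polynomial that rgb_out in the code below evaluates for each output channel is the following, where a through i are the nine fitted coefficients (a separate set per channel) and C_new is just a label chosen here for the new channel value:

```latex
C_{\text{new}} = aR^2 + bG^2 + cB^2 + dRG + eGB + fBR + gR + hG + iB
```

There is no constant term, so an input of (0, 0, 0) always maps to 0 in every channel.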

The problem is that it takes about 7 seconds to process one image, which is too slow for my purpose.

Even with multiprocessing the run time is too high for my needs (the goal is to recolor a video; I tried the geq filter in ffmpeg, but that also led to jaggedness in the output, even when just inverting colors).

Is there some other, faster way to recolor the image (keeping the same recoloring recipe)?

Is there an advantage in using non-command line tools for this purpose?

from PIL import Image
import numpy as np
def return_row(r, g, b):
    r_inv = 255 - r
    g_inv = 255 - g
    b_inv = 255 - b
    return [r_inv**2, g_inv**2, b_inv**2, r_inv * g_inv, g_inv * b_inv, b_inv * r_inv, r_inv, g_inv, b_inv]
def solve_mat(dictionary):
    A = []
    B = []
    for key, color_code in dictionary.items():
        r = color_code[0]
        g = color_code[1]
        b = color_code[2]
        value = color_code[3]
        row = return_row(r, g, b)
        A.append(row)
        B.append(value)
    X = np.linalg.lstsq(A, B, rcond=None)[0]
    return X
def get_individual_channgel_dict(color_change_dict):
    r_dict = {}
    g_dict = {}
    b_dict = {}
    for color, array in color_change_dict.items():
        r_dict[color] = array[:3]
        r_dict[color].append(array[3])
        g_dict[color] = array[:3]
        g_dict[color].append(array[4])
        b_dict[color] = array[:3]
        b_dict[color].append(array[5])
    return r_dict, g_dict, b_dict
def get_coeff_mat(r_dict, g_dict, b_dict):
    r_mat = solve_mat(r_dict)
    g_mat = solve_mat(g_dict)
    b_mat = solve_mat(b_dict)
    return r_mat, g_mat, b_mat
def rgb_out(R, G, B, param_mat):
    a = param_mat[0]
    b = param_mat[1]
    c = param_mat[2]
    d = param_mat[3]
    e = param_mat[4]
    f = param_mat[5]
    g = param_mat[6]
    h = param_mat[7]
    i = param_mat[8]
    return int((a * (R**2)) + (b * (G**2)) + (c * (B**2)) + (d * R * G) + (e * G * B) + (f * B * R) + (g * R) + (h * G) + (i * B))
def change_colors_in_one_image(image_path):
    with Image.open(image_path) as img:
        pixels = img.load()
        # Iterate over each pixel
        width, height = img.size
        for x in range(width):
            for y in range(height):
                r, g, b = pixels[x, y]
                new_r = rgb_out(r, g, b, r_mat)
                new_g = rgb_out(r, g, b, g_mat)
                new_b = rgb_out(r, g, b, b_mat)
                pixels[x, y] = (new_r, new_g, new_b)
        # Save the modified image
        img.save(image_path)
color_change_dict = {
    "white": [5, 98, 255, 240, 240, 240],
    "black": [255, 255, 255, 81, 92, 93],
    "red": [207, 54, 108, 52, 152, 219],
    "mud": [203, 103, 14, 203, 103, 14],
}
r_dict, g_dict, b_dict = get_individual_channgel_dict(color_change_dict)
r_mat, g_mat, b_mat = get_coeff_mat(r_dict, g_dict, b_dict)
# image_path: path to the PNG file to recolor (not defined in the original post)
change_colors_in_one_image(image_path)

As an example, following is an input image.

Following is the output after recoloring.

Answer 1

Score: 2

I'd suggest using vectorization to do this image transformation, rather than doing it pixel-by-pixel.

def rgb_out(R, G, B, param_mat):
    a = param_mat[0]
    b = param_mat[1]
    c = param_mat[2]
    d = param_mat[3]
    e = param_mat[4]
    f = param_mat[5]
    g = param_mat[6]
    h = param_mat[7]
    i = param_mat[8]
    val = ((a * (R**2)) + (b * (G**2)) + (c * (B**2)) + (d * R * G) + (e * G * B) + (f * B * R) + (g * R) + (h * G) + (i * B))
    # Prevent overflow - if larger than 255 or smaller than 0, clip to those values
    return val.clip(0, 255).astype('uint8')

def change_colors_in_one_image(image_path):
    with Image.open(image_path) as img:
        # Convert array to numpy
        pixels = np.array(img)
        # Put channel axis first
        pixels = np.moveaxis(pixels, -1, 0)
        # Avoid overflow in intermediate calculations
        pixels = pixels.astype('float32')
        r, g, b = pixels
        new_r = rgb_out(r, g, b, r_mat)
        new_g = rgb_out(r, g, b, g_mat)
        new_b = rgb_out(r, g, b, b_mat)
        pixels = np.stack([new_r, new_g, new_b])
        # Move channel axis back to last
        pixels = np.moveaxis(pixels, 0, -1)
        img = Image.fromarray(pixels)
        return img

Timing this, it takes 372ms per frame, which is about 100x faster.
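As a variation on the same idea, the nine monomials can be stacked into one array and all three channels evaluated with a single matrix product. This is only a sketch under a few assumptions: recolor_quadratic is a made-up name, the coefficients are assumed to be stacked as np.stack([r_mat, g_mat, b_mat]), and the image is converted to RGB first (which drops any alpha channel). The demo uses made-up identity coefficients rather than the fitted ones.

```python
import numpy as np
from PIL import Image

def recolor_quadratic(img, coeffs):
    """Recolor an image with one quadratic polynomial per output channel.

    coeffs has shape (3, 9): one row per output channel (r, g, b), with
    columns in the term order used by rgb_out:
    R^2, G^2, B^2, R*G, G*B, B*R, R, G, B.
    """
    # float32 avoids uint8 overflow in the squared terms; convert() drops alpha.
    rgb = np.asarray(img.convert("RGB"), dtype=np.float32)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Shape (height, width, 9): the nine monomials for every pixel.
    terms = np.stack([R * R, G * G, B * B, R * G, G * B, B * R, R, G, B], axis=-1)
    # One matrix product evaluates all three polynomials at once.
    out = terms @ np.asarray(coeffs, dtype=np.float32).T
    return Image.fromarray(out.clip(0, 255).astype(np.uint8))

# Demo with made-up coefficients: identity map (only the linear terms are set).
identity = np.zeros((3, 9), dtype=np.float32)
identity[0, 6] = identity[1, 7] = identity[2, 8] = 1.0
demo = Image.fromarray(np.array([[[10, 20, 30], [200, 100, 50]]], dtype=np.uint8))
result = np.asarray(recolor_quadratic(demo, identity))
print(result)
```

With the fitted coefficients from the question, the call would look like recolor_quadratic(img, np.stack([r_mat, g_mat, b_mat])). Whether this is faster than the version above depends on the image size and machine, so it is worth timing both (for example with time.perf_counter) before choosing one.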
