Extract specific PCB colors in R for defect classification
Question
Assume we have the following picture, which contains a PCB defect (a missing-hole defect):
The defects I need to identify in my project are:
For this purpose, I need to extract the colors related to the defect categories.
I know that in R we can do:
library(colorfindr)
img_path="C:/Users/Rayane_2/Desktop/Data/PCB1/PCB/images/Mouse_bite/01_mouse_bite_04.jpg"
colorfindr::get_colors(img_path,top_n=20)
# A tibble: 20 × 3
col_hex col_freq col_share
<chr> <int> <dbl>
1 #005B0C 31106 0.00646
2 #005B0E 29117 0.00605
3 #01590B 27768 0.00577
4 #005A0B 24135 0.00502
5 #015C0D 23771 0.00494
6 #00580A 22397 0.00465
7 #005B0B 21529 0.00447
8 #00560F 21476 0.00446
9 #005A0D 21324 0.00443
10 #025A0C 21191 0.00440
11 #01590D 21026 0.00437
12 #005709 20009 0.00416
13 #00580E 19063 0.00396
14 #005909 18666 0.00388
15 #015C0F 18450 0.00383
16 #025A0E 17979 0.00374
17 #015710 16621 0.00345
18 #01590F 16619 0.00345
19 #00580C 16546 0.00344
20 #005A0A 15614 0.00324
From the picture of the defect types, I see that 3 colors distinguish these defects.
I need to identify those 3 colors and extract them from the tibble.
Answer 1
Score: 1
The problem is interesting, and there are several kinds of features we can try, from manual, handcrafted features (simple to complex, paired with different machine learning models) to automatically extracted features (e.g., learned by deep neural networks).
Let's try a very simple feature based on colors only: the color cluster proportion.
We shall first cluster the image RGB color values into k groups (e.g., k=3) using the kmeans clustering algorithm and obtain k color cluster centers using the function get.color.clusters(), as shown below (we need to extract red, green, blue values from the hex color values).
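As a small illustration of the hex-to-RGB step and the clustering, here is a minimal base-R sketch. The four green values are taken from the tibble in the question; the copper and dark values are made up purely so that three clusters exist, and are not from the dataset.

```r
# First four hex colors come from the question's tibble; the other four are
# hypothetical values added only so that three distinct clusters exist
hex <- c("#005B0C", "#005B0E", "#01590B", "#005A0B",  # PCB green (from the question)
         "#C87533", "#B87333",                        # hypothetical copper tones
         "#101010", "#181818")                        # hypothetical dark pixels

# col2rgb() returns a 3 x n matrix with rows red, green, blue;
# transpose it to get one row per color
rgb_mat <- t(col2rgb(hex))

set.seed(12)  # kmeans() starts from randomly chosen centers
cl <- kmeans(rgb_mat, centers = 3, nstart = 25)

cl$centers              # one RGB center per color cluster
split(hex, cl$cluster)  # hex colors grouped by the cluster they fall in
```

The cluster numbering is arbitrary and can change with the seed, so it is the grouping that matters, not the label.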
Then we shall use the kmeans model to predict the color cluster each pixel of an image belongs to, and compute the proportion of pixels in the image belonging to each color cluster as features (hence we shall have k features). The code below approximates this with the image's 20 most frequent colors and their shares. Our data frame will then look like the following for k=3 clusters:
        cluster1  cluster2  cluster3  class (label)
image1  0.6       0.3       0.1       missing holes
which means that 60%, 30% and 10% of the pixels belong to clusters 1, 2 and 3, respectively, for the missing-hole image1.
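To make the feature computation concrete, here is a minimal sketch of turning per-color cluster assignments into k proportions for one image. The `col_cluster` and `col_share` values are hypothetical, and `tapply()` over a factor with fixed levels is just one possible way to keep empty clusters in place.

```r
# Hypothetical data for one image: the cluster each dominant color was
# assigned to, and that color's share of the image's pixels
col_cluster <- c(1, 1, 2, 1, 2)
col_share   <- c(0.30, 0.20, 0.25, 0.10, 0.05)
k <- 3

# factor(levels = 1:k) keeps clusters with no colors, so every image yields
# k features in the same order
share_by_cluster <- tapply(col_share, factor(col_cluster, levels = 1:k), sum)
share_by_cluster[is.na(share_by_cluster)] <- 0  # empty clusters come back as NA

# normalize so the features of one image sum to 1
features <- as.numeric(share_by_cluster / sum(share_by_cluster))
names(features) <- paste0("cluster", 1:k)
features
```

Fixing the factor levels guarantees that feature j always refers to cluster j, which matters once rows from different images are stacked into one data frame.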
Now this dataset will be used to train a (binary) classifier, and the classifier will do a decent job if our assumption holds that the color cluster proportions for the same defect class follow a similar pattern.
Here are the two sets of images we shall use for only 2 classes:
missing-holes
mouse-bites
Now, let's extract the color cluster proportion features and try an SVM classifier with an RBF kernel for the classification and prediction of the defect classes.
# Predict the color cluster a single RGB triple x belongs to:
# the index of the nearest k-means center (squared Euclidean distance)
find_cluster_kmeans <- function(cl, x) {
  which.min(apply(cl$centers, 1, function(y) sum((y - x)^2)))
}
# Build the k color-cluster proportion features for one image
extract.color.features <- function(img_path, cl) {
  col_df <- colorfindr::get_colors(img_path, top_n = 20)
  # col2rgb() gives a 3 x n matrix; transpose to one row (red, green, blue) per color
  cols <- as.data.frame(t(col2rgb(col_df$col_hex)))
  col_cluster <- apply(cols, 1, function(x) find_cluster_kmeans(cl, x))
  col_df <- cbind(col_df, cols, col_cluster = col_cluster)
  col_df <- col_df[c('col_cluster', 'col_share')]
  # group by color cluster and sum the shares
  df_feat <- aggregate(col_df$col_share, list(col_df$col_cluster), FUN = sum)
  names(df_feat) <- c('col_clust', 'prop')
  for (i in 1:nrow(cl$centers)) { # ensure that all color clusters are present
    if (!(i %in% df_feat$col_clust)) {
      df_feat <- rbind(df_feat, data.frame(col_clust = i, prop = 0))
    }
  }
  # sort by cluster id so that feature j always refers to cluster j,
  # even when a missing cluster was appended at the end
  df_feat <- df_feat[order(df_feat$col_clust), ]
  df_feat$prop <- df_feat$prop / sum(df_feat$prop) # normalize
  return(df_feat)
}
# Run k-means on the dominant colors of all training images and return the model
get.color.clusters <- function(k = 3, top_n = 50) {
  col_df <- NULL
  for (folder in c('missing_hole', 'Mouse_bite')) {
    img_path <- list.files(folder, "\\.png$", full.names = TRUE)
    cdf <- do.call(rbind, lapply(img_path, function(p) colorfindr::get_colors(p, top_n = top_n)))
    col_df <- rbind(col_df, cdf)
  }
  cols <- as.data.frame(t(col2rgb(col_df$col_hex)))
  cl <- kmeans(cols, k)
  #print(cl$centers)
  return(cl)
}
library(colorfindr)
set.seed(12)
k <- 3 # 3 color clusters
cl <- get.color.clusters(k)
df <- NULL
for (cls in c('missing_hole', 'Mouse_bite')) {
  img_path <- list.files(cls, "\\.png$", full.names = TRUE)
  df_feat <- NULL
  for (img in img_path) {
    #print(img)
    df_feat <- rbind(df_feat, extract.color.features(img, cl)$prop)
  }
  df_feat <- as.data.frame(df_feat)
  df_feat$class <- cls
  df <- rbind(df, df_feat)
}
names(df)[1:k] <- paste0('cluster', 1:k)
df$class <- as.factor(df$class)
df # each row corresponds to an image and each column to a color cluster
# cluster1 cluster2 cluster3 class
#1 0.318473896 0.68152610 0.00000000 missing_hole
#2 0.984514797 0.01548520 0.00000000 missing_hole
#3 0.967479675 0.03252033 0.00000000 missing_hole
#4 0.010911326 0.80282772 0.18626095 Mouse_bite
#5 0.008364049 0.96257443 0.02906153 Mouse_bite
#6 0.446066380 0.55393362 0.00000000 Mouse_bite
library(e1071)
svmfit <- svm(class ~ ., data = df, kernel = "radial", cost = 1, scale = FALSE, type = 'C-classification')
#print(svmfit)
plot(svmfit, df, cluster1 ~ cluster2, fill = TRUE, alpha = 0.2)
df$predicted <- predict(svmfit, df) # predictions on the training images themselves
df
# cluster1 cluster2 cluster3 class predicted
#1 0.318473896 0.68152610 0.00000000 missing_hole Mouse_bite
#2 0.984514797 0.01548520 0.00000000 missing_hole missing_hole
#3 0.967479675 0.03252033 0.00000000 missing_hole missing_hole
#4 0.010911326 0.80282772 0.18626095 Mouse_bite Mouse_bite
#5 0.008364049 0.96257443 0.02906153 Mouse_bite Mouse_bite
#6 0.446066380 0.55393362 0.00000000 Mouse_bite Mouse_bite
Ideally, we should train on a portion of the dataset and evaluate the classifier on a held-out portion to measure how well it generalizes.
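A hold-out evaluation could be sketched as follows. The feature table is copied from the output above (rounded to three decimals); `holdout_rows()` is a hypothetical helper, not part of any package, and with only six images the resulting accuracy is purely illustrative.

```r
# Feature table copied from the output above (rounded to 3 decimals)
df <- data.frame(
  cluster1 = c(0.318, 0.985, 0.967, 0.011, 0.008, 0.446),
  cluster2 = c(0.682, 0.015, 0.033, 0.803, 0.963, 0.554),
  cluster3 = c(0.000, 0.000, 0.000, 0.186, 0.029, 0.000),
  class    = factor(rep(c("missing_hole", "Mouse_bite"), each = 3))
)

# Hypothetical helper: pick `n_test` random row indices from every class,
# so that each class is represented in the held-out set
holdout_rows <- function(labels, n_test = 1) {
  unlist(lapply(split(seq_along(labels), labels),
                function(idx) idx[sample.int(length(idx), n_test)]))
}

set.seed(12)
test_idx <- holdout_rows(df$class, n_test = 1)
train <- df[-test_idx, ]
test  <- df[test_idx, ]

# Fit on the training rows only and score on the unseen rows
# (skipped when the e1071 package is not installed)
if (requireNamespace("e1071", quietly = TRUE)) {
  fit  <- e1071::svm(class ~ ., data = train, kernel = "radial",
                     cost = 1, scale = FALSE, type = "C-classification")
  pred <- predict(fit, test)
  print(table(actual = test$class, predicted = pred))
  cat("held-out accuracy:", mean(pred == test$class), "\n")
}
```

On a real dataset, a larger held-out fraction or k-fold cross-validation would give a more stable estimate than a single split.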
Now, the color cluster proportion feature is quite naive and is unlikely to perform that well. You can then try extracting shape features and features such as HOG, SIFT, SURF, BRISK, or BRIEF, and use the corresponding descriptors as feature vectors for the ML classifiers.
Finally, to get the best performance, we can use deep neural nets, which generate features automatically at different layers. In that case we need either a reasonably large number of training images (data augmentation can increase the training set size) or transfer learning on top of a standard pretrained network (e.g., VGG-16 or ResNet-152).