Function conv_unit() not working inside an ifelse() statement


Question


I want to convert geographic coordinates in a table. Some measurements are in decimal degrees (dec_deg) and some are in degrees and decimal minutes (deg_dec_min). I want to convert the latter to decimal degrees. When I use conv_unit() inside mutate() and ifelse(), I get warning messages and incorrect values.

Here's a reproducible example:

library(dplyr)
library(measurements)

data_latlon <- tibble(latitude = c(8.726088, -16.365242, -19.888074, 
                                   '1 40.232', '0 2.308', '2 2.356'),
                      longitude = c(-83.180764, -62.015502, -40.549983, 
                                    '75 54.301', '70 56.693', '72 41.143'), 
                      unit = c('dec_deg', 'dec_deg', 'dec_deg', 
                               'deg_dec_min', 'deg_dec_min', 'deg_dec_min'))

Case 1: using ifelse()

data_latlon %>% 
  mutate(latitude = ifelse(unit == 'deg_dec_min', 
                           conv_unit(latitude, from = 'deg_dec_min', to = 'dec_deg'), 
                           latitude))

# A tibble: 6 × 3
  latitude         longitude  unit       
  <chr>            <chr>      <chr>      
1 8.726088         -83.180764 dec_deg    
2 -16.365242       -62.015502 dec_deg    
3 -19.888074       -40.549983 dec_deg    
4 2.34133333333333 75 54.301  deg_dec_min
5 2.356            70 56.693  deg_dec_min
6 0                72 41.143  deg_dec_min
Warning messages:
1: Problem while computing `latitude = ifelse(...)`.
ℹ longer object length is not a multiple of shorter object length 
2: Problem while computing `latitude = ifelse(...)`.
ℹ data length is not a multiple of split variable 

Case 2: doing it separately

data_latlon %>% 
  filter(unit == 'deg_dec_min') %>% 
  mutate(latitude = ifelse(unit == 'deg_dec_min', 
                           conv_unit(latitude, from = 'deg_dec_min', to = 'dec_deg'), 
                           latitude))

# A tibble: 3 × 3
  latitude           longitude unit       
  <chr>              <chr>     <chr>      
1 1.67053333333333   75 54.301 deg_dec_min
2 0.0384666666666667 70 56.693 deg_dec_min
3 2.03926666666667   72 41.143 deg_dec_min

Answer 1

Score: 1


I believe the issue is that measurements::conv_unit() expects a vector whose elements all share the same format. Your latitude column is a character vector that mixes decimal degrees with degrees and decimal minutes, which becomes a problem when the entire column is passed at once.

For example, running conv_unit() on the whole column returns warnings and incorrect results:

conv_unit(data_latlon$latitude, from = 'deg_dec_min', to = 'dec_deg')

[1] "8.99884203333333"  "-19.9047406666667" "-40.232"          
[4] "2.34133333333333"  "2.356"             "0"                
Warning messages:
1: In as.numeric(unlist(strsplit(x_na_free, " "))) * c(3600, 60) :
  longer object length is not a multiple of shorter object length
2: In split.default(as.numeric(unlist(strsplit(x_na_free, " "))) *  :
  data length is not a multiple of split variable
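
The reason the whole column reaches conv_unit() even inside ifelse() can be illustrated with a small base-R sketch. The helper double_and_log() and the variable seen_lengths are hypothetical names made up for this illustration; the point is only that ifelse() evaluates its yes argument on the full vector and picks elements afterwards.

```r
# Records the length of every vector the helper is called with.
seen_lengths <- integer(0)

# Hypothetical helper: doubles its input and logs how many elements it saw.
double_and_log <- function(x) {
  seen_lengths <<- c(seen_lengths, length(x))
  x * 2
}

values <- c(1, 2, 3, 4)
flag   <- c(FALSE, FALSE, TRUE, TRUE)

# Only two elements are flagged, but the helper receives all four values.
result <- ifelse(flag, double_and_log(values), values)

print(seen_lengths)
print(result)
```

So the test condition does not protect the function from the rows it should not see; it only decides which results are kept.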

Solution:

The easiest solutions are to vectorize the function or use rowwise():

conv_unit_vec <- Vectorize(conv_unit)

data_latlon %>% 
  mutate(latitude = ifelse(unit == 'deg_dec_min', 
                           conv_unit_vec(latitude, from = 'deg_dec_min', to = 'dec_deg'), 
                           latitude))

# A tibble: 6 x 3
  latitude           longitude  unit       
  <chr>              <chr>      <chr>      
1 8.726088           -83.180764 dec_deg    
2 -16.365242         -62.015502 dec_deg    
3 -19.888074         -40.549983 dec_deg    
4 1.67053333333333   75 54.301  deg_dec_min
5 0.0384666666666667 70 56.693  deg_dec_min
6 2.03926666666667   72 41.143  deg_dec_min

or:

data_latlon %>% 
  rowwise() %>%
  mutate(latitude = ifelse(unit == 'deg_dec_min', 
                           conv_unit_vec(latitude, from = 'deg_dec_min', to = 'dec_deg'), 
                           latitude)) %>%
  ungroup()

# A tibble: 6 x 3
  latitude           longitude  unit       
  <chr>              <chr>      <chr>      
1 8.726088           -83.180764 dec_deg    
2 -16.365242         -62.015502 dec_deg    
3 -19.888074         -40.549983 dec_deg    
4 1.67053333333333   75 54.301  deg_dec_min
5 0.0384666666666667 70 56.693  deg_dec_min
6 2.03926666666667   72 41.143  deg_dec_min
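
Both variants work because the function then sees one element at a time. A minimal base-R sketch of what Vectorize() changes, using a hypothetical stand-in function count_parts() rather than conv_unit() itself:

```r
# Hypothetical stand-in: counts the space-separated pieces in its input,
# pooling all elements together (as unlist() does).
count_parts <- function(x) length(unlist(strsplit(x, " ")))

inputs <- c("1 40.232", "8.726088")

# Called on the whole vector, the pieces of all elements are pooled.
pooled <- count_parts(inputs)

# Vectorize() wraps the function so it is called once per element.
count_parts_vec <- Vectorize(count_parts)
per_element <- count_parts_vec(inputs)

print(pooled)
print(unname(per_element))
```

With element-wise calls, a value in the wrong format can no longer shift the pieces that belong to its neighbours.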

Answer 2

Score: 1


A few things to note here:

  • from ?ifelse : "yes will be evaluated if and only if any element of test is true, and analogously for no"; so both yes and no are fully evaluated here.

  • conv_unit() does not really check whether the values in its x argument make sense.
    Snippet from the function source:

    if (from == "deg_dec_min") 
      secs = lapply(split(as.numeric(unlist(strsplit(x_na_free, 
        " "))) * c(3600, 60), f = rep(1:length(x_na_free), 
        each = 2)), sum)

Note how it uses unlist(), c(3600, 60) and rep(..., each = 2): it relies on the assumption that each element of the input vector x splits into exactly 2 numbers, no more and no fewer.
If you take your input vector, c("8.726088", "-16.365242", "-19.888074", "1 40.232", "0 2.308", "2 2.356"), split the strings by " ", and unlist, you get 9 numbers instead of 12. This is the reason for the warnings and the messed-up result.

Besides rowwise(), you could also handle this with e.g. split() and map_at(), processing only the rows where unit is deg_dec_min:
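
The count can be checked with base R alone. This is only a sketch of the splitting step, not the package's code; the variable names parts and grouping are chosen here for illustration.

```r
x <- c("8.726088", "-16.365242", "-19.888074",
       "1 40.232", "0 2.308", "2 2.356")

# Split every string on a space and pool the pieces into one vector.
parts <- unlist(strsplit(x, " "))

# The grouping factor built with rep(..., each = 2) assumes two pieces per element.
grouping <- rep(seq_along(x), each = 2)

print(length(parts))     # pieces actually produced
print(length(grouping))  # pieces the grouping expects
```

Three elements contribute one piece each and three contribute two, so the pooled vector and the grouping factor no longer line up.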

library(measurements)
library(dplyr, warn.conflicts = FALSE)
library(purrr)

# split to list of tibbles by "unit" column, 
# apply mutate only on "deg_dec_min" part,
# rbind both parts back together
data_latlon %>% 
  split(~ unit) %>% 
  map_at("deg_dec_min", 
         \(x) x %>% mutate(across(ends_with("itude"),
                                  \(coord_col) conv_unit(coord_col, 
                                                         from = 'deg_dec_min', 
                                                         to = 'dec_deg')))) %>% 
  list_rbind()
#> # A tibble: 6 × 3
#>   latitude           longitude        unit       
#>   <chr>              <chr>            <chr>      
#> 1 8.726088           -83.180764       dec_deg    
#> 2 -16.365242         -62.015502       dec_deg    
#> 3 -19.888074         -40.549983       dec_deg    
#> 4 1.67053333333333   75.9050166666667 deg_dec_min
#> 5 0.0384666666666667 70.9448833333333 deg_dec_min
#> 6 2.03926666666667   72.6857166666667 deg_dec_min
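
The same idea of touching only the deg_dec_min rows can also be sketched in base R with logical indexing. The helper dec_min_to_dec_deg() below is hypothetical and is not part of the measurements package; it assumes each input is exactly "degrees minutes" and that a leading minus sign applies to the whole coordinate. Unlike the outputs above, this version returns a numeric vector rather than character.

```r
# Hypothetical helper (not from the measurements package): converts
# "degrees decimal-minutes" strings such as "1 40.232" to decimal degrees.
# Assumption: a leading "-" applies to the whole coordinate.
dec_min_to_dec_deg <- function(x) {
  parts <- strsplit(trimws(x), " +")
  vapply(parts, function(p) {
    stopifnot(length(p) == 2)  # exactly degrees and minutes
    sign <- if (startsWith(p[1], "-")) -1 else 1
    sign * (abs(as.numeric(p[1])) + as.numeric(p[2]) / 60)
  }, numeric(1))
}

lat  <- c("8.726088", "-16.365242", "-19.888074",
          "1 40.232", "0 2.308", "2 2.356")
unit <- c("dec_deg", "dec_deg", "dec_deg",
          "deg_dec_min", "deg_dec_min", "deg_dec_min")

is_dm  <- unit == "deg_dec_min"
lat_dd <- numeric(length(lat))

# Each format is parsed only on the rows that actually use it.
lat_dd[!is_dm] <- as.numeric(lat[!is_dm])
lat_dd[is_dm]  <- dec_min_to_dec_deg(lat[is_dm])

print(lat_dd)
```

Because the conversion function only ever receives rows in the format it expects, the pairing of degrees and minutes cannot drift.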

Input:


data_latlon <- tibble(latitude = c(8.726088, -16.365242, -19.888074, 
                                   '1 40.232', '0 2.308', '2 2.356'),
                      longitude = c(-83.180764, -62.015502, -40.549983, 
                                    '75 54.301', '70 56.693', '72 41.143'), 
                      unit = c('dec_deg', 'dec_deg', 'dec_deg', 
                               'deg_dec_min', 'deg_dec_min', 'deg_dec_min'))

Created on 2023-05-10 with reprex v2.0.2

huangapple
  • Posted on 2023-05-10 20:27:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76218417.html