Translating Stata to R yields different results.


Question

I am trying to translate Stata code from a paper into R.

The Stata code looks like this:

    g tau = year - temp2 if temp2 > temp3 & (bod<. | do<. | lnfcoli<.)

My R translation looks like this:

    data <- data %>%
      mutate(tau = if_else((temp2 > temp3) &
                             (is.na(bod) | is.na(do) | is.na(lnfcoli)),
                           year - temp2,
                           NA_integer_))

The problem is that the two versions give different results.

This is the result I get when I run the code in Stata:

    Year | temp2 | temp3 | bod | do  | lnfcoli | tau
    1986 | 1995  | 1986  | 3.2 | 7.2 | 2.1     | -9

This is the result I get when I run the code in R:

    Year | temp2 | temp3 | bod | do  | lnfcoli | tau
    1986 | 1995  | 1986  | 3.2 | 7.2 | 2.1     | NA

Do you know what might be wrong with my R code, or what I should modify to get the same output?

Answer 1

Score: 2

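None of bod, do or lnfcoli is missing (NA) in that row, so your condition evaluates to FALSE and if_else returns NA_integer_ (its false= argument). Stata treats . (a missing value) as larger than any number, so a comparison such as bod<. is true exactly when bod is not missing. The Stata check is therefore looking for non-missing values, which is the opposite of what is.na() tests.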
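As a small illustration of that difference, the base-R snippet below uses a made-up vector (the values are hypothetical, with NA standing in for Stata's .) to contrast the test used in the question with the R counterpart of Stata's bod<. check.

```r
# Hypothetical values; NA plays the role of Stata's "." here
bod <- c(3.2, NA)

# What the question's R code tests: is the value missing?
is_missing <- is.na(bod)

# R counterpart of Stata's `bod < .`: is the value present?
is_present <- !is.na(bod)

print(data.frame(bod, is_missing, is_present))
```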

So the equivalent in R/dplyr is probably:

    library(dplyr)  # provides %>%, mutate() and if_else()

    data %>%
      mutate(
        tau = if_else(
          # keep rows where temp2 > temp3 and none of the three values is NA
          (temp2 > temp3) & (!(is.na(bod) | is.na(do) | is.na(lnfcoli))),
          year - temp2,
          NA_integer_
        )
      )
    #   year temp2 temp3 bod  do lnfcoli tau
    # 1 1986  1995  1986 3.2 7.2     2.1  -9
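
One caveat, offered as a sketch rather than a definitive translation: the Stata condition joins its three checks with |, so it holds when at least one of bod, do and lnfcoli is non-missing, whereas !(is.na(bod) | is.na(do) | is.na(lnfcoli)) requires all three to be non-missing. The two agree on the example row but can differ when only some of the measurements are missing. The snippet below assumes a made-up two-row data frame (the second row is hypothetical, not from the question) and uses base R's ifelse instead of dplyr so that it runs without extra packages.

```r
# Made-up data: row 1 is the row from the question, row 2 is hypothetical
data <- data.frame(
  year    = c(1986L, 1990L),
  temp2   = c(1995L, 1995L),
  temp3   = c(1986L, 1986L),
  bod     = c(3.2, NA),
  do      = c(7.2, 6.8),
  lnfcoli = c(2.1, NA)
)

# Stata's (bod<. | do<. | lnfcoli<.): at least one measurement is present
any_present <- !is.na(data$bod) | !is.na(data$do) | !is.na(data$lnfcoli)

# Stricter check: all three measurements are present
all_present <- !(is.na(data$bod) | is.na(data$do) | is.na(data$lnfcoli))

data$tau_any <- ifelse(data$temp2 > data$temp3 & any_present,
                       data$year - data$temp2, NA_integer_)
data$tau_all <- ifelse(data$temp2 > data$temp3 & all_present,
                       data$year - data$temp2, NA_integer_)

print(data[, c("year", "temp2", "tau_any", "tau_all")])
```

In this sketch the two columns agree on the first row and differ on the second, where only do is present: tau_any is filled in and tau_all is NA. Which behaviour is wanted depends on what the paper's Stata code intended, so it is worth checking against rows with partially missing measurements.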

  • Posted by huangapple on June 1, 2023, 12:38:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76378708.html