Logic error of cumulative statement and lag function
Question
The task is to derive a time-series variable. The rule is:
> X(t) = 1.5 * X(t-1) - 0.5 * X(t-2) + e;
> X(1) = e;
> X(2) = 1.5 * X(1) + e;
where e ~ N(0,1). Here is how I generate this sequence:
data have;
  call streaminit(42);
  do i=1 to 200;
    e=rand("normal",0,1);
    output;
  end;
run;
data want;
  set have;
  *Method 1;
  x+sum(0.5*x,-0.5*lag(x),e);
  *Method 2;
  y+0.5*y-0.5*lag(y)+e;
  *Method 3;
  retain z;
  if _n_=1 then z=e;
  if _n_=2 then z=1.5*z+e;
  lag2_z=lag2(z);
  if _n_>2 then z=1.5*z-0.5*lag2_z+e;
run;
When I run this code, only the result of Method 1 is right. Why are Method 2 and Method 3 wrong, and how can I fix them?
TIA.
Answer 1
Score: 1
Data:
I would start with slightly simpler data, e.g. a constant e:
data have;
  call streaminit(42);
  do i=1 to 20;
    e=1; /*rand("normal",0,1);*/
    output;
  end;
run;
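Method 1:
At first glance x+sum(0.5*x,-0.5*lag(x),e); looks different from X(t) = 1.5 * X(t-1) - 0.5 * X(t-2) + e; because the coefficient is 0.5 rather than 1.5. The missing 1 comes from the sum statement itself, which adds the expression to the retained x, so the result still matches the rule.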
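To make the role of the sum statement concrete, here is a minimal sketch that writes Method 1 out as an ordinary assignment. The names check1 and x_chk are made up for illustration, and the six-row input with e=1 is only an assumption to keep the numbers easy to follow.

```sas
/* Small constant-noise input (assumption: e = 1 on every row) */
data have;
  do i = 1 to 6;
    e = 1;
    output;
  end;
run;

data check1;
  set have;
  /* Method 1 as posted: x is retained automatically and starts at 0 */
  x + sum(0.5*x, -0.5*lag(x), e);
  /* Hypothetical explicit form: x_chk plays the role of x.
     sum() skips the missing lag value on the first row. */
  retain x_chk 0;
  x_chk = sum(x_chk, 0.5*x_chk, -0.5*lag(x_chk), e);
  put _n_= x= x_chk=;
run;
```

If the two columns agree on every row, the implicit "x +" in the sum statement is what turns the coefficient 0.5 into 1.5.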
Method 2:
In the first iteration the expression 0.5*y-0.5*lag(y)+e evaluates to missing (.) because lag(y) is missing, so the sum statement is in effect y+.; and y stays at 0, which gives the wrong result.
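One possible repair for Method 2, sketched under the assumption that the only problem is the missing lag value on the first row, is to wrap the terms in sum() so the missing value is skipped. The names check2, y_bad and y_fix are hypothetical, and the input again uses e=1.

```sas
/* Small constant-noise input (assumption: e = 1 on every row) */
data have;
  do i = 1 to 6;
    e = 1;
    output;
  end;
run;

data check2;
  set have;
  /* Method 2 as posted: plain arithmetic, so a missing lag value
     makes the whole expression missing on row 1 */
  y_bad + 0.5*y_bad - 0.5*lag(y_bad) + e;
  /* Possible fix: sum() ignores the missing lag value */
  y_fix + sum(0.5*y_fix, -0.5*lag(y_fix), e);
  put _n_= y_bad= y_fix=;
run;
```

With this input, y_bad should start at 0 on the first row and trail y_fix from then on, because the first e is never added. The repaired line is the same expression as Method 1.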
Method 3:
Add some put statements to see how it calculates, e.g.
  *Method 3;
  retain z;
  put z= lag2_z= e=;
  if _n_=1 then z=e;
  if _n_=2 then z=1.5*z+e;
  lag2_z=lag2(z);
  put z= lag2_z= e=;
  if _n_>2 then z=1.5*z-0.5*lag2_z+e;
  put _N_ z= / /;
I would do it like this:
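%let n = 100;
%let e = 1; /* change 1 to a rand() call for random noise */
/*
%let e = rand("normal",0,1);
*/
data want;
  call streaminit(42);
  /* X(1) = e */
  e = &e.;
  a = e;
  x = a;
  output;
  /* X(2) = 1.5 * X(1) + e, using a fresh e */
  e = &e.;
  b = 1.5*a + e;
  x = b;
  output;
  /* X(t) = 1.5 * X(t-1) - 0.5 * X(t-2) + e for t = 3 to n */
  do i = 1 to &n. - 2;
    e = &e.;
    c = 1.5*b - 0.5*a + e;
    a = b;
    b = c;
    x = c;
    output; /* write each later term too, not just the first two */
    put _all_;
  end;
run;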
Answer 2
Score: 0
The sum() function ignores missing values in its summation.
The sum statement also ignores missing values. However, the summand in Method 2 is an ordinary arithmetic expression that includes lag(y), which is missing on the first row. The whole expression 0.5*y-0.5*lag(y)+e is therefore missing at t=1, and the e from that row never makes it into y.
Method 3 goes wrong because, for _n_>2, lag2(z) is called before z is recalculated, so the LAG2 queue receives the not-yet-updated z.
Adding pre_z to the output might help you see what is happening:
  * method 3;
  retain z;
  if _n_=1 then z=e;
  if _n_=2 then z=1.5*z+e;
  lag2_z=lag2(z);
  pre_z = z;
  if _n_>2 then z=1.5*z-0.5*lag2_z+e;
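One way to repair Method 3, sketched here as an assumption rather than as either answerer's solution, is to call lag(z) once per row before z is updated. Because z is retained, it still holds X(t-1) at that point, so the value returned from the queue is X(t-2). The names want3 and z_lag are hypothetical, and the input uses e=1.

```sas
/* Small constant-noise input (assumption: e = 1 on every row) */
data have;
  do i = 1 to 6;
    e = 1;
    output;
  end;
run;

data want3;
  set have;
  retain z;
  /* z still holds X(t-1) here; lag(z) returns the value queued on the
     previous row, which is X(t-2). The call runs on every row. */
  z_lag = lag(z);
  if _n_ = 1 then z = e;
  else if _n_ = 2 then z = 1.5*z + e;
  else z = 1.5*z - 0.5*z_lag + e;
  put _n_= z= z_lag=;
run;
```

Working the rule by hand with e=1 gives 1, 2.5, 4.25, 6.125, 8.0625, 10.03125, which is what z should show in the log. The key design choice is that the lag call is unconditional and sits in one fixed place, so the queue gets exactly one value per row.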