Why I cannot read and update the register array at the same time in clocked always block with non-blocking statements? (Conwaylife example)

Question

I am working on the Conwaylife problem on HDLBits (https://hdlbits.01xz.net/wiki/Conwaylife). The problem behaves much like a finite state machine: the next value of the output has to be computed from its current value.

Following the earlier Rule90 and Rule110 exercises (also on HDLBits), I wrote the read and the update of the output register q together in a single clocked always block. The code is below:

module top_module(
    input clk,
    input load,
    input [255:0] data,
    output [255:0] q ); 
    
    reg [3:0] neighbor_cnt;
    reg [3:0] a, b, c, d;
    reg [7:0] N;
    
    always @ (posedge clk) begin
        if (load) q <= data;
        else begin
            for (int i=0; i<16; i++) begin:row
                for (int j=0; j<16; j++) begin:column
                    a <= i-1;
                    b <= i+1;
                    c <= j-1;
                    d <= j+1;  //overflow handling, wrap around naturally

                    N <= {q[a*16+c], q[a*16+d], q[b*16+c], q[b*16+d], q[i*16+c], q[i*16+d], q[a*16+j], q[b*16+j]};
                    neighbor_cnt <= N[0]+N[1]+N[2]+N[3]+N[4]+N[5]+N[6]+N[7];
                    case (neighbor_cnt)
                        2: q[i*16 + j] <= q[i*16 + j];
                        3: q[i*16 + j] <= 1;
                        default: q[i*16 + j] <= 0;
                    endcase
                end
            end
        end
    end

endmodule

The output is wrong: in simulation, q becomes all 0 at cycle 2 (cycle 1 loads the input).

After some debugging (e.g. running just one cycle and writing intermediate variables such as N and neighbor_cnt into q), I think my q is somehow not synchronous. It is updated too early or too late, as of which I cannot tell. Then I tried converting all the statements to blocking ones (I know blocking statements are not appropriate in a clocked always block, but I was just experimenting). The result shows that q is delayed: the output expected at cycle 2 shows up at cycle 3, which corrupts the following calculations.

Only after looking at others' attempts at this question did I write code that runs successfully:

module top_module(
    input clk,
    input load,
    input [255:0] data,
    output [255:0] q ); 
    
    reg [3:0] neighbor_cnt;
    reg [3:0] a, b, c, d;
    reg [7:0] N;
    wire [255:0] q_next;
    
    always @ (*) begin 
        for (int i=0; i<16; i++) begin:row
            for (int j=0; j<16; j++) begin:column
                a = i-1;
                b = i+1;
                c = j-1;
                d = j+1;  //overflow handling, wrap around naturally

                N = {q[a*16+c], q[a*16+d], q[b*16+c], q[b*16+d], q[i*16+c], q[i*16+d], q[a*16+j], q[b*16+j]};
                neighbor_cnt = N[0]+N[1]+N[2]+N[3]+N[4]+N[5]+N[6]+N[7];
                case (neighbor_cnt)
                    2: q_next[i*16 + j] = q[i*16 + j];
                    3: q_next[i*16 + j] = 1;
                    default: q_next[i*16 + j] = 0;
                endcase
            end
        end
    end
    
    always @(posedge clk) begin
        if (load) q <= data;
        else q <= q_next;
    end

endmodule

The difference is that I used an intermediate wire q_next (not a register) and placed all the calculations in a combinational always block with blocking statements. However, I am still confused about why this change makes the code work. If anyone can help, I would deeply appreciate it.

Answer 1

Score: 1

> It is updated too early or too late, as of which I cannot tell.

The HDLBits interface is fine for very simple Verilog code, but it is inadequate for code like yours:

  • You have no control over how you drive inputs
  • You have limited visibility, with just whatever text output you are provided and maybe minimal waveforms
  • It does offer limited support for running your own simulation with iverilog, but doing so is cumbersome, and the site recommends that you use your own simulator
  • The tool does not make its testbench code available to you

To debug your issue, you need to:

  • Use another simulator
  • Create Verilog testbench code
  • Run a simulation and dump waveforms, for example to a VCD file
  • View the waveforms of internal signals to check whether the data signal changes too early or too late
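
The steps above can be sketched as a small testbench. This is only a minimal sketch under stated assumptions: the names tb, dut and conwaylife.vcd, the clock period, and the three-cell "blinker" start pattern are illustrative choices, not anything taken from HDLBits, and top_module is assumed to be the module from the question, compiled alongside this file.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench: tb, dut, conwaylife.vcd and the start pattern
// are illustrative assumptions, not the (unpublished) HDLBits testbench.
module tb;
    reg          clk  = 1'b0;
    reg          load = 1'b0;
    reg  [255:0] data = 256'd0;
    wire [255:0] q;
    integer      cycle, r;

    // Device under test: the top_module from the question
    top_module dut (.clk(clk), .load(load), .data(data), .q(q));

    // Free-running clock, 10 ns period
    always #5 clk = ~clk;

    initial begin
        // Dump every signal below tb, including the internals of dut
        $dumpfile("conwaylife.vcd");
        $dumpvars(0, tb);

        // Three live cells in row 2, columns 1..3 (bits 33..35)
        data[35:33] = 3'b111;

        // Change inputs on the falling edge so they are stable
        // when the design samples them on the rising edge
        @(negedge clk) load = 1'b1;
        @(negedge clk) load = 1'b0;

        // Print the 16x16 grid once per clock cycle, one row per line
        for (cycle = 0; cycle < 4; cycle = cycle + 1) begin
            $display("cycle %0d", cycle);
            for (r = 0; r < 16; r = r + 1)
                $display("%b", q[r*16 +: 16]);
            @(negedge clk);
        end
        $finish;
    end
endmodule
```

With Icarus Verilog, something like `iverilog -g2012 -o sim tb.v top_module.v` followed by `vvp sim` should run it (the file names are assumptions; `-g2012` is there because the question's code uses `int` loop variables). The resulting VCD file can then be opened in a waveform viewer to inspect a, b, c, d, N and neighbor_cnt next to q.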

The HDLBits simulator is too forgiving. Every other simulator I tried your first code example on generated compile errors because q must be declared as a reg. One way to do that is:

output reg [255:0] q ); 

Or, it can be declared on a separate line:

output [255:0] q ); 

reg [255:0] q;

You can see this on multiple simulators on EDA Playground. The same is true for your 2nd code example.
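
On the "too early or too late" question itself, a small standalone experiment may help. It is a sketch, not a diagnosis confirmed against the HDLBits testbench: the module nba_demo and the signal seen are hypothetical names, and a only plays the role of the a in the question. With non-blocking assignments, every right-hand side is read using the values from before the clock edge, and a register that is assigned in each loop iteration keeps only the last scheduled value.

```verilog
// Hypothetical demo of non-blocking assignments inside a loop.
module nba_demo;
    reg       clk  = 1'b0;
    reg [3:0] a    = 4'd0;   // plays the role of `a` in the question
    reg [3:0] seen = 4'd0;   // what a later statement reads from `a`
    integer   i;

    always @(posedge clk) begin
        for (i = 0; i < 4; i = i + 1) begin
            a    <= i;   // only scheduled; the last iteration's value wins
            seen <= a;   // reads `a` as it was before this clock edge
        end
    end

    initial begin
        repeat (3) begin
            #5 clk = 1'b1;
            #5 clk = 1'b0;
            $display("a=%0d seen=%0d", a, seen);
        end
        $finish;
    end
endmodule
```

This should print `a=3 seen=0` after the first edge and `a=3 seen=3` after the next two: seen never observes the per-iteration values 0, 1, 2, and it lags a by one clock. If the same thing happens in the first code example, N is built from the previous cycle's a, b, c and d, the case statement uses the previous cycle's neighbor_cnt, and all 256 cells share that one stale count, which would be consistent with q collapsing to all 0. In the second example the blocking assignments take effect immediately within each iteration, so each cell gets its own neighbor count.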

huangapple
  • Posted on May 24, 2023, 17:23:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76321980.html