How to compile Rust for use with WASM's Shared Memory?
When I run a loop in different Web Workers, the loop shares the counter variable across threads, even though that variable should be thread-local. It should **not** do this, but I don't know how to fix it.

The offending loop is in the `run` function, shown below in the Rust code being compiled to WASM:
```rust
#![no_main]
#![no_std]
use core::panic::PanicInfo;
use js::*;

mod js {
    #[link(wasm_import_module = "imports")]
    extern "C" {
        pub fn abort(msgPtr: usize, filePtr: usize, line: u32, column: u32) -> !;
        pub fn _log_num(number: usize);
    }
}

#[no_mangle]
pub unsafe extern "C" fn run(worker_id: i32) {
    let worker_index = worker_id as u32 - 1;
    let chunk_start = 100 * worker_index;
    let chunk_end = chunk_start + 100; //Total pixels may not divide evenly into number of worker cores.
    for n in chunk_start as usize..chunk_end as usize {
        _log_num(n);
    }
}

#[panic_handler]
unsafe fn panic(_: &PanicInfo) -> ! { abort(0, 0, 0, 0) }
```
`run` is passed the thread ID, ranging from 1 to 3 inclusive, and prints out a hundred numbers - so between them the three threads should log the numbers 0 to 299, albeit in mixed order. I expect to see 1, 2, 3... from thread 1, 101, 102, 103... from thread 2, and 201, 202, 203... from thread 3. If I run the functions sequentially, that is indeed what I see. But if I run them in parallel, each thread "helps" the others, so they'll log something like 1, 4, 7... on the first thread, 2, 6, 9... on the second, and 3, 5, 8... on the third, up to 99, where all three threads stop. Each thread behaves as if it shares `chunk_start`, `chunk_end`, and `n` with the other threads.
It should not do this, because `.cargo/config.toml` specifies `--shared-memory`, so the compiler should use the appropriate locking mechanisms when allocating memory.
```toml
[target.wasm32-unknown-unknown]
rustflags = [
    "-C", "target-feature=+atomics,+mutable-globals,+bulk-memory",
    "-C", "link-args=--no-entry --shared-memory --import-memory --max-memory=2130706432",
]
```
I know this is being picked up, because if I change the `--shared-memory` flag to something else, `rust-lld` complains that it does not know what it is.
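One way to double-check the link step (a debugging sketch, not part of the original setup; whether internal symbols show up by name in the export list depends on the toolchain) is to list the compiled module's exports:

```javascript
//Sketch: dump what sim.wasm actually exports, to confirm the link flags
//took effect. Works in any async context that can fetch the binary.
const module = await WebAssembly.compileStreaming(fetch("sim.wasm"))
console.log(WebAssembly.Module.exports(module).map(e => `${e.kind} ${e.name}`))
//This build's exports include __wasm_init_tls (it's called in the worker
//code below), alongside the exported run function.
```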
wasm-bindgen's parallel demo works fine, so I know it's possible to do this. I just can't spot what they've set to make theirs work.
Perhaps it is something in the way I load my module in the web worker?
```javascript
const wasmSource = fetch("sim.wasm") //kick off the request now, we're going to need it

//See message sending code for why we use multiple messages.
let messageArgQueue = [];
addEventListener("message", ({data}) => {
    messageArgQueue.push(data)
    if (messageArgQueue.length === 4) {
        self[messageArgQueue[0]].apply(0, messageArgQueue.slice(1))
    }
})

self.start = async (workerID, worldBackingBuffer, world) => {
    const wasm = await WebAssembly.instantiateStreaming(wasmSource, {
        env: { memory: worldBackingBuffer },
        imports: {
            abort: (messagePtr, locationPtr, row, column) => {
                throw new Error(`? (?:${row}:${column}, thread ${workerID})`)
            },
            _log_num: num => console.log(`thread ${workerID}: n is ${num}`),
        },
    })

    //Initialise thread-local storage, so we get separate stacks for our local variables.
    wasm.instance.exports.__wasm_init_tls(workerID-1)

    //Loop, running the Rust logging loop when the "tick" advances.
    let lastProcessedTick = 0
    while (1) {
        Atomics.wait(world.globalTick, 0, lastProcessedTick)
        lastProcessedTick = world.globalTick[0]
        wasm.instance.exports.run(workerID)
    }
}
```
`worldBackingBuffer` here is the shared memory for the WASM module, and it's created in the main thread:
```javascript
//Let's count to 300. We'll have three web workers, each taking ⅓rd of the task. 0-100, 100-200, 200-300...

//First, allocate some shared memory. (The original task wants to share some values around.)
const memory = new WebAssembly.Memory({
    initial: 23,
    maximum: 23,
    shared: true,
})

//Then, allocate the data views into the memory.
//This is shared memory which will get updated by the worker threads, off the main thread.
const world = {
    globalTick: new Int32Array(memory.buffer, 1200000, 1), //Current global tick. Increment to tell the workers to count up in scratchA!
}

//Load a core and send the "start" event to it.
const startAWorkerCore = coreIndex => {
    const worker = new Worker('worker/sim.mjs', {type:'module'})
    //Marshal the "start" message across multiple postMessages because of the following bugs:
    //1. Must transfer memory BEFORE world. https://bugs.chromium.org/p/chromium/issues/detail?id=1421524
    //2. Must transfer world BEFORE memory. https://bugzilla.mozilla.org/show_bug.cgi?id=1821582
    ;['start', coreIndex+1, memory, world].forEach(arg => worker.postMessage(arg))
}

//Now, let's start some worker threads! They will work on different memory locations, so they don't conflict.
startAWorkerCore(0) //works fine
startAWorkerCore(1) //breaks counting - COMMENT THIS OUT TO FIX COUNTING
startAWorkerCore(2) //breaks counting - COMMENT THIS OUT TO FIX COUNTING

//Run the simulation thrice. Each thread should print a hundred numbers in order, thrice.
//For thread 1, it should print 0, then 1, then 2, etc. up to 99.
//Thread 2 should run from 100 to 199, and thread 3 200 to 299.
//But when they're run simultaneously, all three threads seem to use the same counter.
setTimeout(tick, 500)
setTimeout(tick, 700)
setTimeout(tick, 900)

function tick() {
    Atomics.add(world.globalTick, 0, 1)
    Atomics.notify(world.globalTick, 0)
}
```
But this looks pretty normal. Why am I seeing memory corruption in my Rust for-loop?
Answer 1

Score: 1

There is some magic being done in wasm-bindgen - the start section is replaced/injected with code that fixes up memory. Although there seem to be issues with it:

https://github.com/rustwasm/wasm-bindgen/discussions/3474
https://github.com/rustwasm/wasm-bindgen/discussions/3487
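For reference, here is a minimal sketch of what that injected per-thread setup conceptually does, adapted to the names in the question. It is not the actual wasm-bindgen transform, and the export names are assumptions: it presumes the stack pointer is exported as a mutable global `__stack_pointer` and the TLS block size as `__tls_size` (a stock rust-lld build may not export the stack pointer by name, which is a large part of why wasm-bindgen rewrites the binary instead), and that the memory from `scratchBase` upward is otherwise unused.

```javascript
//A sketch only, not the real wasm-bindgen transform. Assumed exports:
//__stack_pointer (mutable i32 global - exporting it is only legal with
//+mutable-globals) and __tls_size (immutable i32 global).
const STACK_SIZE = 1 << 20 //1 MiB of stack per worker - an arbitrary choice.

function initWorkerThread(exports, workerIndex, scratchBase) {
    //Give this worker its own stack region. The shadow stack grows downward,
    //so point the stack pointer at the region's high end. Globals are
    //per-instance, but every instance starts from the same initial value, so
    //without this step all threads spill their locals into the same memory.
    exports.__stack_pointer.value = scratchBase + (workerIndex + 1) * STACK_SIZE

    //__wasm_init_tls expects the *address* of a free block of __tls_size
    //bytes, not a thread index like the workerID-1 in the question.
    //Here the TLS blocks are placed above the three stack regions.
    const tlsSize = exports.__tls_size.value
    const tlsBase = scratchBase + 3 * STACK_SIZE + workerIndex * tlsSize
    exports.__wasm_init_tls(tlsBase)
}
```

Under those assumptions, each worker would call `initWorkerThread(wasm.instance.exports, workerID - 1, scratchBase)` right after instantiation, in place of the bare `__wasm_init_tls(workerID - 1)` call.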