

Simple data stream: Go being super slow compared to Java










As a Java dev, I'm currently looking at Go because I think it's an interesting language.

To get started, I decided to take a simple Java project I wrote months ago and rewrite it in Go, to compare performance and (mainly, actually) code readability/complexity.

The Java code sample is the following:

public static void main(String[] args) {
    long start = System.currentTimeMillis();

    Stream<Container> s = Stream.from(new Iterator<Container>() {
        int i = 0;

        public boolean hasNext() {
            return i < 10000000;
        }

        public Container next() {
            return new Container(i++);
        }
    });

    s = s.map((Container _source) -> new Container(_source.value * 2));

    // Drain the stream and count the items.
    int j = 0;
    while (s.hasNext()) {
        s.next();
        j++;
    }

    System.out.println(System.currentTimeMillis() - start);
    System.out.println("j:" + j);
}

public static class Container {
    int value;

    public Container(int v) {
        value = v;
    }
}

Where the map function is:

return new Stream<R>() {
    public boolean hasNext() {
        return Stream.this.hasNext();
    }

    public R next() {
        return _f.apply(Stream.this.next());
    }
};

The Stream class is just an extension of java.util.Iterator that adds custom methods to it. Methods other than map differ from the standard Java Stream API.

Anyway, to reproduce this, I wrote the following Go code:

package main

import (
	"fmt"
	"time"
)

type Iterator interface {
	HasNext() bool
	Next() interface{}
}

type Stream interface {
	HasNext() bool
	Next() interface{}
	Map(transformer func(interface{}) interface{}) Stream
}

type incremetingIterator struct {
	i int
}

type SampleEntry struct {
	value int
}

func (s *SampleEntry) Value() int {
	return s.value
}

func (s *incremetingIterator) HasNext() bool {
	return s.i < 10000000
}
func (s *incremetingIterator) Next() interface{} {
	s.i = s.i + 1
	return &SampleEntry{
		value: s.i,
	}
}

func CreateIterator() Iterator {
	return &incremetingIterator{
		i: 0,
	}
}

type stream struct {
	source Iterator
}

func (s *stream) HasNext() bool {
	return s.source.HasNext()
}
func (s *stream) Next() interface{} {
	return s.source.Next()
}
func (s *stream) Map(tr func(interface{}) interface{}) Stream {
	return &stream{
		source: &mapIterator{
			source:      s,
			transformer: tr,
		},
	}
}

func FromIterator(it Iterator) Stream {
	return &stream{
		source: it,
	}
}

type mapIterator struct {
	source      Iterator
	transformer func(interface{}) interface{}
}

func (s *mapIterator) HasNext() bool {
	return s.source.HasNext()
}
func (s *mapIterator) Next() interface{} {
	return s.transformer(s.source.Next())
}

func main() {
	it := CreateIterator()
	ss := FromIterator(it)
	ss = ss.Map(func(in interface{}) interface{} {
		return &SampleEntry{
			value: 2 * in.(*SampleEntry).value,
		}
	})
	// Assumption: the loop body and timing were lost from the post;
	// they are reconstructed here to mirror the Java version.
	start := time.Now()
	j := 0
	for ss.HasNext() {
		ss.Next()
		j++
	}
	fmt.Println(time.Since(start))
	fmt.Println("j:", j)
}

Both produce the same result, but where Java takes about 20 ms, Go takes 1050 ms (with 10M items, test run several times).

I'm very new to Go (I started a couple of hours ago), so please be indulgent if I did something really bad.

Thank you!


Score: 6

























The other answer changed the original task quite "dramatically", and reverted to a simple loop. I consider it to be different code, and as such, it cannot be used to compare execution times (that loop could be written in Java as well, which would give smaller execution time).

Now let's try to keep the "streaming manner" of the problem at hand.

Note beforehand:

In Java, the granularity of System.currentTimeMillis() can be around 10 ms (!!), which is the same order of magnitude as the result! This means the error in Java's 20 ms could be huge! So use System.nanoTime() instead to measure code execution times. For details, see https://stackoverflow.com/questions/13062345/measuring-time-differences-using-system-currenttimemillis/13062443#13062443.

Also this is not the correct way to measure execution times, as running things for the first time might run several times slower. For details, see https://stackoverflow.com/questions/41608578/order-of-the-code-and-performance/41608707#41608707.
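The same measurement caveats apply on the Go side. As a minimal sketch (not part of the original answer), the standard testing package can repeat a workload until the timing stabilizes instead of timing a single cold run. Here doubleAll, benchDoubleAll and sink are hypothetical names, and the workload is only a stand-in for the stream pipeline:

```go
package main

import (
	"fmt"
	"testing"
)

// doubleAll is a hypothetical workload standing in for the stream pipeline.
func doubleAll(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += 2 * i
	}
	return sum
}

// sink keeps the result alive so the compiler cannot discard the work.
var sink int

// benchDoubleAll runs the workload b.N times, as the testing package expects.
func benchDoubleAll(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = doubleAll(1000000)
	}
}

func main() {
	// testing.Benchmark chooses b.N itself and repeats the workload,
	// which smooths out one-shot measurement noise.
	res := testing.Benchmark(benchDoubleAll)
	fmt.Println("iterations:", res.N, "ns/op:", res.NsPerOp())
}
```

In a real project the same function would normally live in a _test.go file and be run with go test -bench; calling testing.Benchmark directly just keeps the sketch self-contained.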


Your original Go proposal runs on my computer roughly for 1.1 seconds, which is about the same as yours.

Removing interface{} item type

At the time of writing, Go doesn't have generics (type parameters only arrived later, in Go 1.18). Trying to mimic them with interface{} is not the same thing, and it has a serious performance impact if the value you want to work with is a primitive type (e.g. int) or a simple struct (like the Go equivalent of your Java Container type). See: The Laws of Reflection #The representation of an interface. Wrapping an int (or any other concrete type) in an interface requires creating a (type; value) pair holding the dynamic type and the value to be wrapped (creating this pair also involves copying the value being wrapped; see an analysis of this in the answer https://stackoverflow.com/questions/36077566/how-can-a-slice-contain-itself/36078970#36078970). Moreover, when you want to access the value, you have to use a type assertion, which is a runtime check, so the compiler can't help by optimizing it away (and the check adds to the execution time)!
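To make the wrapping and the type assertion concrete, here is a small sketch that does the same work twice, once through interface{} and once with the concrete type. The names sumBoxed and sumConcrete are made up for illustration, and since the compiler may optimize such a tiny loop, the printed timings are only indicative:

```go
package main

import (
	"fmt"
	"time"
)

type Container struct {
	value int
}

// sumBoxed passes every item through interface{}: each *Container is
// wrapped in an interface value and recovered with a type assertion.
func sumBoxed(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		var boxed interface{} = &Container{value: i}
		sum += boxed.(*Container).value // runtime type check
	}
	return sum
}

// sumConcrete does the same work using only the concrete type.
func sumConcrete(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		c := Container{value: i}
		sum += c.value
	}
	return sum
}

func main() {
	const n = 10000000

	start := time.Now()
	a := sumBoxed(n)
	fmt.Println("interface{}:", time.Since(start))

	start = time.Now()
	b := sumConcrete(n)
	fmt.Println("concrete:   ", time.Since(start))

	fmt.Println("same result:", a == b)
}
```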

So let's not use interface{} for our items, but instead use a concrete type for our case:

type Container struct {
	value int
}

We will use this in the iterator's and stream's next method: Next() Container, and in the mapper function:

type Mapper func(Container) Container

Also we may utilize embedding, as the method set of Iterator is a subset of that of Stream.

Without further ado, here is the complete, runnable example:

package main

import (
	"fmt"
	"time"
)

type Container struct {
	value int
}

type Iterator interface {
	HasNext() bool
	Next() Container
}

type incIter struct {
	i int
}

func (it *incIter) HasNext() bool {
	return it.i < 10000000
}
func (it *incIter) Next() Container {
	it.i++
	return Container{value: it.i}
}

type Mapper func(Container) Container

type Stream interface {
	Iterator // embedded: a Stream is also an Iterator
	Map(Mapper) Stream
}

type iterStream struct {
	Iterator
}

func NewStreamFromIter(it Iterator) Stream {
	return iterStream{Iterator: it}
}
func (is iterStream) Map(f Mapper) Stream {
	return mapperStream{Stream: is, f: f}
}

type mapperStream struct {
	Stream
	f Mapper
}

func (ms mapperStream) Next() Container {
	return ms.f(ms.Stream.Next())
}
func (ms mapperStream) Map(f Mapper) Stream {
	return nil // Not implemented / needed
}

func main() {
	s := NewStreamFromIter(&incIter{})
	s = s.Map(func(in Container) Container {
		return Container{value: in.value * 2}
	})
	start := time.Now()
	j := 0
	for s.HasNext() {
		s.Next()
		j++
	}
	fmt.Println(time.Since(start))
	fmt.Println("j:", j)
}

Execution time: 210 ms. Nice, we've already sped it up 5 times, yet we're still far from Java's Stream performance.

"Removing" Iterator and Stream types

Since we can't use generics, the interface types Iterator and Stream don't really need to be interfaces: we would need new versions of them anyway if we wanted to define iterators and streams of other types.

So the next thing we do is we remove Stream and Iterator, and we use their concrete types, their implementations above. This will not hurt readability at all, in fact the solution is shorter:

package main

import (
	"fmt"
	"time"
)

type Container struct {
	value int
}

type incIter struct {
	i int
}

func (it *incIter) HasNext() bool {
	return it.i < 10000000
}
func (it *incIter) Next() Container {
	it.i++
	return Container{value: it.i}
}

type Mapper func(Container) Container

type iterStream struct {
	*incIter
}

func NewStreamFromIter(it *incIter) iterStream {
	return iterStream{incIter: it}
}
func (is iterStream) Map(f Mapper) mapperStream {
	return mapperStream{iterStream: is, f: f}
}

type mapperStream struct {
	iterStream
	f Mapper
}

func (ms mapperStream) Next() Container {
	return ms.f(ms.iterStream.Next())
}

func main() {
	s0 := NewStreamFromIter(&incIter{})
	s := s0.Map(func(in Container) Container {
		return Container{value: in.value * 2}
	})
	start := time.Now()
	j := 0
	for s.HasNext() {
		s.Next()
		j++
	}
	fmt.Println(time.Since(start))
	fmt.Println("j:", j)
}

Execution time: 50 ms; we've again sped it up 4 times compared to our previous solution! That is now the same order of magnitude as the Java solution, and we've lost nothing of the "streaming manner". Overall gain from the asker's proposal: 22 times faster.
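For readers on Go 1.18 or later, type parameters offer another way to keep the stream reusable across item types without interface{}. The following is only a sketch under that assumption and is not part of the original answer: Stream, Map and Range are hypothetical names, the closures add indirect calls, and no timing is claimed for it.

```go
package main

import "fmt"

// Stream is a hypothetical generic pull-based stream (requires Go 1.18+).
type Stream[T any] struct {
	hasNext func() bool
	next    func() T
}

func (s Stream[T]) HasNext() bool { return s.hasNext() }
func (s Stream[T]) Next() T       { return s.next() }

// Map is a free function because Go methods cannot introduce
// additional type parameters.
func Map[T, R any](s Stream[T], f func(T) R) Stream[R] {
	return Stream[R]{
		hasNext: s.hasNext,
		next:    func() R { return f(s.next()) },
	}
}

// Range yields the integers 0, 1, ..., n-1.
func Range(n int) Stream[int] {
	i := 0
	return Stream[int]{
		hasNext: func() bool { return i < n },
		next:    func() int { v := i; i++; return v },
	}
}

func main() {
	s := Map(Range(5), func(v int) int { return v * 2 })
	sum := 0
	for s.HasNext() {
		sum += s.Next()
	}
	fmt.Println("sum:", sum) // 0+2+4+6+8
}
```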

Given the fact that in Java you used System.currentTimeMillis() to measure execution, this may even be the same as Java's performance. Asker confirmed: it's the same!

Regarding the same performance

Now we're talking about roughly the "same" code which does pretty simple, basic tasks, in different languages. If they're doing basic tasks, there is not much one language could do better than the other.

Also keep in mind that Java is a mature adult (over 21 years old) and has had an enormous amount of time to evolve and be optimized; Java's JIT (just-in-time compilation) actually does a pretty good job for long-running processes such as yours. Go is much younger, still just a kid (it will be 5 years old 11 days from now), and will probably see bigger performance improvements in the foreseeable future than Java.

Further improvements

This "streamy" way may not be the "Go" way to approach the problem you're trying to solve. It is merely the "mirror" code of your Java solution, using more idiomatic constructs of Go.

Instead you should take advantage of Go's excellent support for concurrency, namely goroutines (see go statement), which are much more efficient than Java's threads, and other language constructs such as channels (see answer https://stackoverflow.com/questions/39826692/what-are-golang-channels-used-for/39826883#39826883) and the select statement.

Properly chunking / partitioning your originally big task to smaller ones, a goroutine worker pool might be quite powerful to process big amount of data. See
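A minimal sketch of such a chunked worker pool, under the assumption that the items can be processed independently. The function doubleInChunks and its parameters are hypothetical, and for a task as cheap as doubling integers the coordination overhead may well outweigh the gain:

```go
package main

import (
	"fmt"
	"sync"
)

// doubleInChunks doubles every element of values in place, splitting the
// slice into chunks that a fixed number of worker goroutines consume.
// chunkSize and workers are assumed to be positive.
func doubleInChunks(values []int, chunkSize, workers int) {
	chunks := make(chan []int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for chunk := range chunks {
				for i := range chunk {
					chunk[i] *= 2
				}
			}
		}()
	}

	// Chunks are disjoint sub-slices, so workers never touch the same element.
	for lo := 0; lo < len(values); lo += chunkSize {
		hi := lo + chunkSize
		if hi > len(values) {
			hi = len(values)
		}
		chunks <- values[lo:hi]
	}
	close(chunks) // lets the workers' range loops finish
	wg.Wait()
}

func main() {
	values := make([]int, 1000)
	for i := range values {
		values[i] = i
	}
	doubleInChunks(values, 100, 4)
	fmt.Println(values[0], values[1], values[999])
}
```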

Also you claimed in your comment that "I don't have 10M items to process but more 10G which won't fit in memory". If this is the case, think about IO time and the delay of the external system you're fetching the data from. If that takes significant time, it might outweigh the processing time in the app, and the app's execution time might not matter (at all).

Go is not about squeezing every nanosecond out of execution time, but rather providing you a simple, minimalist language and tools, by which you can easily (by writing simple code) take control of and utilize your available resources (e.g. goroutines and multi-core CPU).

(Try to compare the Go language spec and the Java language spec. Personally I've read Go's lang spec multiple times, but could never get to the end of Java's.)


Score: 5






I think this is an interesting question, as it gets to the heart of the differences between Java and Go and highlights the difficulties of porting code. Here is the same thing in Go minus all the machinery (time ~50 ms here):

values := make([]int64, 10000000)
start := time.Now()
for i := int64(0); i < 10000000; i++ {
	values[i] = 2 * i
}
fmt.Println("Over after:", time.Since(start))

More seriously, here is the same thing with a map over a slice of entries, which is a more idiomatic version of what you have above and could work with any sort of Entry struct. On my machine this actually runs faster (30 ms) than the for loop above (anyone care to explain why?), so it is probably similar to your Java version:

package main

import (
	"fmt"
	"time"
)

type Entry struct {
	Value int64
}

type EntrySlice []*Entry

// New builds a slice of l entries with values 0..l-1.
func New(l int64) EntrySlice {
	entries := make(EntrySlice, l)
	for i := int64(0); i < l; i++ {
		entries[i] = &Entry{Value: i}
	}
	return entries
}

// Map applies fn to every entry in place.
func (entries EntrySlice) Map(fn func(i int64) int64) {
	for _, e := range entries {
		e.Value = fn(e.Value)
	}
}

func main() {
	entries := New(10000000)
	start := time.Now()
	entries.Map(func(v int64) int64 {
		return 2 * v
	})
	fmt.Println("Over after:", time.Since(start))
}

Things that will make operations more expensive:

  • Passing around interface{}: don't do this
  • Building a separate iterator type: use range or for loops
  • Allocations: instead of building new values to store the answers, transform in place

Regarding interface{}, I would avoid it. This means you have to write a separate map (say) for each type, which is not a great hardship. Instead of building an iterator, a range loop is probably more appropriate. Regarding transforming in place: if you allocate a new struct for each result, it puts pressure on the garbage collector. Using a Map func like this is an order of magnitude slower:

entries.Map(func(e *Entry) *Entry {
	return &Entry{Value: 2 * e.Value}
})
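One way to see the difference in allocations is to count them. The sketch below uses testing.AllocsPerRun from the standard library; mapInPlace, mapAlloc and newEntries are illustrative names, not part of the code above. The in-place version is expected to report no allocations, while the allocating version should report roughly one per entry:

```go
package main

import (
	"fmt"
	"testing"
)

type Entry struct {
	Value int64
}

// mapInPlace overwrites each entry's value, so it needs no new memory.
func mapInPlace(entries []*Entry, fn func(int64) int64) {
	for _, e := range entries {
		e.Value = fn(e.Value)
	}
}

// mapAlloc replaces each pointer with whatever fn returns.
func mapAlloc(entries []*Entry, fn func(*Entry) *Entry) {
	for i, e := range entries {
		entries[i] = fn(e)
	}
}

// newEntries builds n entries with values 0..n-1.
func newEntries(n int) []*Entry {
	entries := make([]*Entry, n)
	for i := range entries {
		entries[i] = &Entry{Value: int64(i)}
	}
	return entries
}

func main() {
	entries := newEntries(1000)

	inPlace := testing.AllocsPerRun(10, func() {
		mapInPlace(entries, func(v int64) int64 { return 2 * v })
	})
	alloc := testing.AllocsPerRun(10, func() {
		mapAlloc(entries, func(e *Entry) *Entry {
			return &Entry{Value: 2 * e.Value}
		})
	})

	fmt.Println("allocations per run, in place:  ", inPlace)
	fmt.Println("allocations per run, allocating:", alloc)
}
```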

To stream, split the data into chunks and do the same as above (keeping a memo of the last object if you depend on previous calculations). If you have independent calculations (not as here), you could also fan out to a bunch of goroutines doing the work and get it done faster if there is a lot of it (this has overhead; in simple examples it won't be faster).
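As a rough sketch of the chunked approach with a memo carried between chunks: processChunk and streamChunks are hypothetical names, the inner fill loop stands in for reading from an external source, and the running total plays the role of the state that depends on previous calculations.

```go
package main

import "fmt"

// processChunk doubles a chunk in place and returns the updated running
// total, which is the state carried over from one chunk to the next.
func processChunk(chunk []int64, carry int64) int64 {
	for i, v := range chunk {
		chunk[i] = 2 * v
		carry += chunk[i]
	}
	return carry
}

// streamChunks feeds total items to processChunk in chunks of at most
// chunkSize, reusing one buffer so memory use stays bounded.
// chunkSize is assumed to be positive.
func streamChunks(total, chunkSize int64) int64 {
	buf := make([]int64, chunkSize)
	var carry int64
	for lo := int64(0); lo < total; lo += chunkSize {
		n := chunkSize
		if total-lo < n {
			n = total - lo
		}
		// Stand-in for reading the next n items from an external source.
		for i := int64(0); i < n; i++ {
			buf[i] = lo + i
		}
		carry = processChunk(buf[:n], carry)
	}
	return carry
}

func main() {
	fmt.Println("sum of doubled values:", streamChunks(10000000, 4096))
}
```

Because only one buffer of chunkSize items is alive at a time, the same shape works for inputs that do not fit in memory.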

Finally, if you're interested in data processing with go, I'd recommend visiting this new site: http://gopherdata.io/


Score: 0







Just as a complement to the previous comments, I changed the code of both Java and Go implementations to run the test 100 times.

What's interesting here is that Go takes a constant time between 69 and 72ms.

However, Java takes 71 ms the first time (71 ms, 19 ms, 12 ms for the first runs) and then between 5 and 7 ms.

From my test and understanding, this comes from the fact that the JVM takes a bit of time to properly load the classes and do some optimization.
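The test harness itself is not shown in this answer; a repeated-run measurement on the Go side could look roughly like the sketch below. The names run and timeRuns are hypothetical, and run is only a stand-in for one pass over the 10M items:

```go
package main

import (
	"fmt"
	"time"
)

// run is a hypothetical stand-in for one pass of the 10M-item test.
func run(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += 2 * i
	}
	return sum
}

// timeRuns executes fn the given number of times and records each duration.
func timeRuns(rounds int, fn func()) []time.Duration {
	durations := make([]time.Duration, 0, rounds)
	for r := 0; r < rounds; r++ {
		start := time.Now()
		fn()
		durations = append(durations, time.Since(start))
	}
	return durations
}

func main() {
	result := 0
	durations := timeRuns(100, func() { result = run(10000000) })
	fmt.Println("first run:", durations[0])
	fmt.Println("last run: ", durations[len(durations)-1])
	fmt.Println("result:", result)
}
```

Go compiles ahead of time, so a flat profile across runs is what one would expect here, in contrast to the JVM's warm-up curve.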

In the end I still have this 10 times performance difference, but I'm not giving up, and I'll try to get a better understanding of how Go works to make it faster.

  • Published on March 17, 2017, 01:54:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/42841501.html



