String array - needless synchronization?

Question

I'm studying ChronicleHFT libraries. I found a class StringInterner posted below:

public class StringInterner {
    @NotNull
    protected final String[] interner;
    protected final int mask;
    protected final int shift;
    protected boolean toggle = false;

    public StringInterner(int capacity) throws IllegalArgumentException {
        int n = Maths.nextPower2(capacity, 128);
        this.shift = Maths.intLog2((long)n);
        this.interner = new String[n];
        this.mask = n - 1;
    }

    @Nullable
    public String intern(@Nullable CharSequence cs) {
        if (cs == null) {
            return null;
        } else if (cs.length() > this.interner.length) {
            return cs.toString();
        } else {
            int hash = Maths.hash32(cs);
            int h = hash & this.mask;
            String s = this.interner[h];
            if (StringUtils.isEqual(cs, s)) {
                return s;
            } else {
                int h2 = hash >> this.shift & this.mask;
                String s2 = this.interner[h2];
                if (StringUtils.isEqual(cs, s2)) {
                    return s2;
                } else {
                    String s3 = cs.toString();
                    this.interner[s != null && (s2 == null || !this.toggle()) ? h2 : h] = s3;
                    return s3;
                }
            }
        }
    }
}

I found a YouTube video by Peter Lawrey in which he explains (or, to be more precise, just states) that this class is thread-safe and doesn't need any additional synchronization to work in a multithreaded environment. Video link: https://www.youtube.com/watch?v=sNSD6AUG5a0&t=1200

My question is why this class doesn't need any synchronization?

  1. How about visibility - if one thread puts something into interner[n], are other threads guaranteed to see it?
  2. What happens if the scheduler yields a thread in the middle of the method? Does it lead to putting the same value in the same index twice?

Answer 1

Score: 3

The Javadoc for StringInterner explains that it's not technically thread-safe:

> StringInterner only guarantees it will behave in a correct manner. When you ask it for a String for a given input, it must return a String which matches the toString() of that CharSequence.
>
> It doesn't guarantee that all threads see the same data, nor that multiple threads will return the same String object for the same string. It is designed to be a best-effort basis so it can be as lightweight as possible.
>
> So while technically not thread safe, it doesn't prevent it operating correctly when used from multiple threads, but it is faster than added explicit locking or thread safety. NOTE: It does rely on String being thread safe, something which was guaranteed from Java 5.0 onwards.

Incidentally, I'm curious about the claim that String was not thread-safe prior to Java 5; I'd love to see a citation.
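
To make the "best-effort" guarantee concrete, here is a minimal sketch of the same idea. It is not the Chronicle implementation: `BestEffortInterner` and `InternerRaceDemo` are hypothetical names, and the cache uses a single slot per hash with a simple polynomial hash instead of the two probes and `Maths.hash32` shown in the question. Several threads race on an unsynchronized `String[]`. A thread may miss another thread's write (question 1) or overwrite a slot with an equal string (question 2), but in both cases the worst outcome is an extra `String` allocation. Because `String` is immutable, any reference read from the array compares correctly against the input, so the mismatch count printed at the end is expected to be zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for StringInterner: one slot per hash, no locks. */
final class BestEffortInterner {
    private final String[] slots;
    private final int mask;

    BestEffortInterner(int capacity) {
        // Round the capacity up to a power of two so that (hash & mask) is a valid index.
        int n = Integer.highestOneBit(Math.max(capacity, 2) - 1) << 1;
        this.slots = new String[n];
        this.mask = n - 1;
    }

    String intern(CharSequence cs) {
        if (cs == null) {
            return null;
        }
        int h = hash(cs) & mask;
        String cached = slots[h];      // racy read: may be null or stale
        if (cached != null && cached.contentEquals(cs)) {
            return cached;             // hit: content was verified, so the result is correct
        }
        String fresh = cs.toString();
        slots[h] = fresh;              // racy write: last writer wins, earlier entries are lost
        return fresh;                  // miss: still correct, just a new object
    }

    // Simple polynomial hash, used here only for illustration.
    private static int hash(CharSequence cs) {
        int h = 0;
        for (int i = 0; i < cs.length(); i++) {
            h = 31 * h + cs.charAt(i);
        }
        return h;
    }
}

final class InternerRaceDemo {
    public static void main(String[] args) throws InterruptedException {
        BestEffortInterner interner = new BestEffortInterner(128);
        String[] keys = {"EURUSD", "GBPUSD", "USDJPY"};
        AtomicInteger mismatches = new AtomicInteger();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    String key = keys[i % keys.length];
                    // A fresh StringBuilder forces a real lookup instead of passing the same String back.
                    String result = interner.intern(new StringBuilder(key));
                    if (!result.equals(key)) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println("content mismatches: " + mismatches.get());
    }
}
```

What this sketch does not guarantee is identity: two threads interning the same text concurrently may receive different `String` objects, which is exactly the limitation the Javadoc describes. Code that needs `==` to hold across threads would need a synchronized or concurrent structure instead.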

huangapple
  • Posted on August 13, 2020 at 03:47:23