Study notes on 《Java高并发程序设计》 (Java High-Concurrency Programming) -- 6.7 Enhancements to the atomic classes

xiaoxiao 2021-03-25

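Lock-free atomic classes rely on the processor's CAS instruction and perform far better than locks. Java 8 introduced the LongAdder class, which also lives in the java.util.concurrent.atomic package and is likewise built on CAS.

1) A faster atomic class: LongAdder

AtomicInteger and its siblings share one basic mechanism: they spin in a loop, retrying the update of the target value until it succeeds. Under light contention an attempt is very likely to succeed; under heavy contention it is very likely to fail. When many attempts fail, the operation has to loop many times, and performance suffers.

A basic way to improve performance under heavy contention is hot-spot separation: split the contended data into parts. That idea leads to an improvement over traditional atomic classes such as AtomicInteger. Although CAS involves no lock, the idea behind reducing lock granularity still applies. One workable scheme is to imitate ConcurrentHashMap and split up the hot data. For example, the core value inside AtomicInteger can be split into an array; each thread is mapped by a hash (or similar) to one slot of the array and updates only that slot, and the final result is the sum over the whole array. The hot value is thus divided into several cells, each cell maintains its own value, and the actual value of the object is the total of all cells. The hot spot is effectively separated and parallelism improves. LongAdder uses exactly this idea.

In practice LongAdder does not use the array from the start. It first records everything in a variable called base. If threads update base without conflict, there is no need to expand into a cell array. Once an update to base conflicts, the cell array is initialised and the new strategy takes over. If an update to a particular cell still conflicts, the system tries to create a new cell or doubles the number of cells to make conflicts less likely.

Here is a brief look at the implementation of increment(), which adds 1 to the LongAdder (JDK 8 source):

    public void increment() {
        add(1L);
    }

    public void add(long x) {
        Cell[] as; long b, v; int m; Cell a;
        // If the cell table is null, first try to add x to base.
        if ((as = cells) != null || !casBase(b = base, b + x)) {
            // Reached when the cell table is non-null or the CAS on base failed.
            // If the table is non-null and the slot selected by the current
            // thread's probe value holds a Cell, try to add x to that Cell.
            boolean uncontended = true;
            if (as == null || (m = as.length - 1) < 0 ||
                (a = as[getProbe() & m]) == null ||
                !(uncontended = a.cas(v = a.value, v + x)))
                // The table is null, the slot is empty, or the CAS failed:
                // fall back to longAccumulate(), defined in Striped64.
                longAccumulate(x, null, uncontended);
        }
    }

The core is the add() method. Initially cells is null, so updates go to base. If the CAS on base fails (or cells already exists), the method enters the branch and initialises the flag uncontended to true. If the cells array is unavailable, or the cell for the current thread is null, it goes straight to longAccumulate(). Otherwise it tries to update the corresponding cell with CAS: on success it returns; on failure uncontended becomes false and longAccumulate() is called. Roughly speaking, longAccumulate() creates a new cell or expands the cell array as needed to reduce conflicts.

Next comes a simple performance test of LongAdder, an ordinary atomic class, and a synchronized lock. Several threads accumulate into the same integer, and we observe the time taken by each of the 3 approaches. First, define some helper variables:

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.atomic.LongAdder;

    private static final int MAX_THREADS = 3;    // number of threads
    private static final int TASK_COUNT = 3;     // number of tasks
    private static final int TARGET_COUNT = 3;   // target total (raise it, e.g. to 10_000_000, for measurable timings)

    private AtomicLong acount = new AtomicLong(0L);   // lock-free atomic operation
    private LongAdder lacount = new LongAdder();
    private long count = 0;

    static CountDownLatch cdlsync = new CountDownLatch(TASK_COUNT);
    static CountDownLatch cdlatomic = new CountDownLatch(TASK_COUNT);
    static CountDownLatch cdladdr = new CountDownLatch(TASK_COUNT);

The code above sets the number of test threads and the target total, and declares three integer variables initialised to 0: acount, lacount and count. They are the objects operated on when synchronising with AtomicLong, LongAdder and a lock respectively. The test code for the synchronized lock is:

    protected synchronized long inc() {
        return ++count;
    }

    protected synchronized long getCount() {
        return count;
    }

    public class SyncThread implements Runnable {
        protected String name;
        protected long starttime;
        LongAdderDemo out;

        public SyncThread(LongAdderDemo o, long starttime) {
            out = o;
            this.starttime = starttime;
        }

        @Override
        public void run() {
            long v = out.getCount();
            while (v < TARGET_COUNT) {   // keep incrementing until the target is reached
                v = out.inc();
            }
            long endtime = System.currentTimeMillis();
            System.out.println("SyncThread spend:" + (endtime - starttime) + "ms" + " v=" + v);
            cdlsync.countDown();
        }
    }

    public void testSync() throws InterruptedException {
        ExecutorService exe = Executors.newFixedThreadPool(MAX_THREADS);
        long starttime = System.currentTimeMillis();
        SyncThread sync = new SyncThread(this, starttime);
        for (int i = 0; i < TASK_COUNT; i++) {
            exe.submit(sync);   // submit the task
        }
        cdlsync.await();        // wait for every task to finish
        exe.shutdown();
    }

This defines the task SyncThread, which increases count under a lock. testSync() uses a thread pool to run the accumulation on several threads. The timing test for the atomic class is implemented in the same way:

    public class AtomicThread implements Runnable {
        protected String name;
        protected long starttime;

        public AtomicThread(long starttime) {
            this.starttime = starttime;
        }

        @Override
        public void run() {
            long v = acount.get();
            while (v < TARGET_COUNT) {
                v = acount.incrementAndGet();   // lock-free increment
            }
            long endtime = System.currentTimeMillis();
            System.out.println("AtomicThread spend:" + (endtime - starttime) + "ms" + " v=" + v);
            cdlatomic.countDown();
        }
    }

    public void testAtomic() throws InterruptedException {
        ExecutorService exe = Executors.newFixedThreadPool(MAX_THREADS);
        long starttime = System.currentTimeMillis();
        AtomicThread atomic = new AtomicThread(starttime);
        for (int i = 0; i < TASK_COUNT; i++) {
            exe.submit(atomic);
        }
        cdlatomic.await();
        exe.shutdown();
    }

Likewise, the following code does the same with LongAdder:

    public class LongAdderThread implements Runnable {
        protected String name;
        protected long starttime;

        public LongAdderThread(long starttime) {
            this.starttime = starttime;
        }

        @Override
        public void run() {
            long v = lacount.sum();
            while (v < TARGET_COUNT) {
                lacount.increment();   // returns nothing
                v = lacount.sum();     // recompute the current total
            }
            long endtime = System.currentTimeMillis();
            System.out.println("LongAdderThread spend:" + (endtime - starttime) + "ms" + " v=" + v);
            cdladdr.countDown();
        }
    }

    public void testLongAdder() throws InterruptedException {
        ExecutorService exe = Executors.newFixedThreadPool(MAX_THREADS);
        long starttime = System.currentTimeMillis();
        LongAdderThread adder = new LongAdderThread(starttime);
        for (int i = 0; i < TASK_COUNT; i++) {
            exe.submit(adder);
        }
        cdladdr.await();
        exe.shutdown();
    }

Note that LongAdder splits a single number into several segments, so increment() cannot return the current value after adding. To obtain the actual value, sum() has to recompute it. That computation has an extra cost, but even with it included, LongAdder still outperforms AtomicLong. As far as counting performance goes, LongAdder has overtaken the ordinary atomic operation.

Another optimisation in LongAdder is avoiding false sharing. LongAdder does not use explicit padding, which is rather unsightly; instead it relies on a new annotation, @sun.misc.Contended. Each Cell in LongAdder is defined as follows (JDK 8):

    @sun.misc.Contended
    static final class Cell {
        volatile long value;

        Cell(long x) {
            value = x;
        }

        final boolean cas(long cmp, long val) {
            return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
        }
    }

The first line declares the Cell class as sun.misc.Contended, which makes the Java virtual machine handle false sharing for Cell automatically. We can also use sun.misc.Contended in our own code to address false sharing, but that requires the extra JVM option -XX:-RestrictContended; otherwise the annotation is ignored.

2) An enhanced LongAdder: LongAccumulator

LongAccumulator is a close sibling of LongAdder; both extend the common parent class Striped64. LongAccumulator therefore uses the same internal optimisations as LongAdder: both split one long value across several variables to reduce contention between threads. Their main logic is similar, but LongAccumulator extends what LongAdder can do. LongAdder only ever adds the given number, whereas LongAccumulator can apply an arbitrary function. An instance is created with the following constructor:

    public LongAccumulator(LongBinaryOperator accumulatorFunction, long identity)

The first parameter, accumulatorFunction, is the binary function to apply (it takes two long arguments and returns a long); the second parameter is the initial value. The following example shows LongAccumulator in use: many threads each supply a number, and the largest number seen is returned.

    import java.util.Random;
    import java.util.concurrent.atomic.LongAccumulator;

    public static void main(String[] args) throws Exception {
        LongAccumulator accumulator = new LongAccumulator(Long::max, Long.MIN_VALUE);
        Thread[] ts = new Thread[1000];

        for (int i = 0; i < 1000; i++) {
            ts[i] = new Thread(() -> {
                Random random = new Random();
                long value = random.nextLong();
                accumulator.accumulate(value);   // fold the value in with Long::max
            });
            ts[i].start();
        }
        for (int i = 0; i < 1000; i++) {
            ts[i].join();                        // wait for every thread
        }
        System.out.println(accumulator.longValue());
    }

The code above constructs a LongAccumulator instance. Because we want the maximum, the method reference Long::max is passed in. When a value arrives through accumulate(), LongAccumulator uses Long::max to pick the larger value and stores it internally (most likely in the cell array, possibly in base). longValue() then applies Long::max across base and all the cells to obtain the maximum.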
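As a deterministic companion to the random-number example, the sketch below feeds the known values 1..n into a LongAccumulator from several threads, once with Long::max and once with Long::sum. The class name AccumulatorSketch and the helper reduce() are made up for illustration; only LongAccumulator, accumulate() and get() come from the JDK. It assumes the supplied function is side-effect-free and insensitive to the order of application, since the order in which threads accumulate is not guaranteed.

```java
import java.util.concurrent.atomic.LongAccumulator;
import java.util.function.LongBinaryOperator;

public class AccumulatorSketch {

    // Hypothetical helper: accumulates the values 1..n with `op`,
    // splitting the work across `threads` worker threads.
    static long reduce(LongBinaryOperator op, long identity, int threads, int n) {
        LongAccumulator acc = new LongAccumulator(op, identity);
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            ts[t] = new Thread(() -> {
                // Worker t handles t+1, t+1+threads, t+1+2*threads, ...
                for (long v = offset + 1; v <= n; v += threads) {
                    acc.accumulate(v);
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // get() combines base and all cells using `op`.
        return acc.get();
    }

    public static void main(String[] args) {
        System.out.println(reduce(Long::max, Long.MIN_VALUE, 4, 1000)); // 1000
        System.out.println(reduce(Long::sum, 0L, 4, 1000));             // 500500
    }
}
```

With Long::sum and an identity of 0 the accumulator behaves like a LongAdder, which is one way to see LongAdder as a special case of LongAccumulator.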
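The hot-spot separation idea from section 1) can also be sketched in a few lines. This is a simplified illustration, not how LongAdder is implemented: the class StripedCounterSketch is hypothetical, it uses a fixed-size AtomicLongArray instead of a base field plus a growable cell array, it picks a cell from the thread ID rather than a probe value, and it does nothing about false sharing between neighbouring cells.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class StripedCounterSketch {
    private final AtomicLongArray cells;
    private final int mask;

    // cellCount is assumed to be a power of two so that `& mask` selects a slot.
    StripedCounterSketch(int cellCount) {
        if (cellCount <= 0 || Integer.bitCount(cellCount) != 1) {
            throw new IllegalArgumentException("cellCount must be a power of two");
        }
        this.cells = new AtomicLongArray(cellCount);
        this.mask = cellCount - 1;
    }

    // Each thread updates only the cell its ID maps to, so threads that land
    // on different cells do not compete on the same memory location.
    void increment() {
        int idx = (int) (Thread.currentThread().getId() & mask);
        cells.incrementAndGet(idx);
    }

    // The logical value is the sum of all cells.
    long sum() {
        long s = 0;
        for (int i = 0; i < cells.length(); i++) {
            s += cells.get(i);
        }
        return s;
    }

    // Hypothetical driver: `threads` threads each increment `perThread` times.
    static long countWith(int threads, int perThread) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10000)); // 40000
    }
}
```

As with LongAdder, increment() returns nothing here; the current total is only available by walking the cells in sum().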

When reposting, please cite the original address: https://ju.6miu.com/read-16987.html
