Knowledge Base -- Limitations of STM (not suited to highly concurrent writes)


The advantages of STM: STM eliminates explicit synchronization. We no longer have to worry about whether we forgot to synchronize or synchronized at the wrong level. There are no issues of failing to cross the memory barrier or of race conditions. I can hear the shrewd programmer in you asking, "What's the catch?" Yes, STM has limitations; otherwise, this book would have ended right here. It is suitable only when write collisions are infrequent. If our application has a lot of write contention, we should look beyond STM.

Reads perform well, but writes may not: Let's discuss this limitation further. STM provides an explicit lock-free programming model. It allows transactions to run concurrently, and they all complete without a glitch when there are no conflicts between them. This provides greater concurrency and thread safety at the same time. When transactions collide on write access to the same object or data, one of them is allowed to complete and the others are automatically retried. The retries delay the execution of the colliding writers but provide greater speed for the readers and the winning writer. Performance does not take much of a hit when concurrent writes to the same object are infrequent. As collisions increase, however, things get worse. With a high rate of write collisions on the same data, in the best case our writes are slow; in the worst case they may fail outright because of too many retries. The examples we have seen so far in this chapter showed the benefits of STM. Although STM is easy to use, not all uses will yield good results, as we'll see in the next example.
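To make the retry behavior concrete, here is a minimal sketch using the same akka.stm Ref and Atomic API as the listing that follows: two threads repeatedly increment a single managed reference, so their commits collide, one transaction wins, and the other is retried automatically. The class name ContendedCounter and the iteration counts are ours, for illustration only.

    import akka.stm.Atomic;
    import akka.stm.Ref;

    // Sketch: two writers hammering one managed reference. Each increment is a
    // read-modify-write inside a transaction; colliding commits cause retries,
    // which is exactly the cost this section is discussing.
    public class ContendedCounter {
      private static final Ref<Long> counter = new Ref<Long>(0L);

      private static void increment() {
        new Atomic<Long>() {
          public Long atomically() {
            counter.swap(counter.get() + 1);   // transactional read-modify-write
            return counter.get();
          }
        }.execute();
      }

      public static void main(final String[] args) throws InterruptedException {
        final Runnable writer = new Runnable() {
          public void run() {
            for (int i = 0; i < 10000; i++) increment();
          }
        };
        final Thread t1 = new Thread(writer);
        final Thread t2 = new Thread(writer);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println("Final count: " + counter.get());   // expect 20000
      }
    }

The final count is always correct, but the time taken grows with the collision rate, since one of the two colliding transactions is thrown away and rerun each time.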

In Coordination Using CountDownLatch, we used AtomicLong to synchronize concurrent updates to the total file size while multiple threads explored directories. Furthermore, we would have had to resort to synchronization if we needed to change more than one variable at the same time. This looks like a nice candidate for STM, but the high contention does not favor it. Let's see whether that's true by modifying the file size program to use STM. Instead of AtomicLong, we'll use Akka managed references for the fields in our file size finder:

    import java.io.File;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;
    import akka.stm.Ref;
    import akka.stm.Atomic;

    public class FileSizeWSTM {
      private ExecutorService service;
      final private Ref<Long> pendingFileVisits = new Ref<Long>(0L);
      final private Ref<Long> totalSize = new Ref<Long>(0L);
      final private CountDownLatch latch = new CountDownLatch(1);

      /**
       * pendingFileVisits needs to be incremented or decremented within the
       * safe haven of a transaction. With AtomicLong we used a simple
       * incrementAndGet() or decrementAndGet(); since the managed reference is
       * generic and does not deal specifically with numbers, we need a bit more
       * effort, so we isolate that into a separate method.
       */
      private long updatePendingFileVisits(final int value) {
        return new Atomic<Long>() {
          public Long atomically() {
            pendingFileVisits.swap(pendingFileVisits.get() + value);
            return pendingFileVisits.get();
          }
        }.execute();
      }

      /**
       * The method to explore directories and find file sizes is now a simple
       * conversion from using AtomicLong to using the managed references.
       */
      private void findTotalSizeOfFilesInDir(final File file) {
        try {
          if (!file.isDirectory()) {
            new Atomic() {
              public Object atomically() {
                totalSize.swap(totalSize.get() + file.length());
                return null;
              }
            }.execute();
          } else {
            final File[] children = file.listFiles();
            if (children != null) {
              for (final File child : children) {
                updatePendingFileVisits(1);
                service.execute(new Runnable() {
                  public void run() { findTotalSizeOfFilesInDir(child); }
                });
              }
            }
          }

          if (updatePendingFileVisits(-1) == 0) latch.countDown();
        } catch(Exception ex) {
          System.out.println(ex.getMessage());
          System.exit(1);
        }
      }

      private long getTotalSizeOfFile(final String fileName) throws InterruptedException {
        service = Executors.newFixedThreadPool(100);
        updatePendingFileVisits(1);
        try {
          findTotalSizeOfFilesInDir(new File(fileName));
          latch.await(100, TimeUnit.SECONDS);
          return totalSize.get();
        } finally {
          service.shutdown();
        }
      }

      public static void main(final String[] args) throws InterruptedException {
        final long start = System.nanoTime();
        final long total = new FileSizeWSTM().getTotalSizeOfFile(args[0]);
        final long end = System.nanoTime();
        System.out.println("Total Size: " + total);
        System.out.println("Time taken: " + (end - start)/1.0e9);
      }
    }
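For comparison, here is a rough sketch of the AtomicLong-based counters that the STM version above replaces. The class name FileSizeCountersSketch is ours; this is not the book's original CountDownLatch listing, only an illustration of the operations the managed references are standing in for.

    import java.util.concurrent.atomic.AtomicLong;

    // Sketch of the AtomicLong updates the STM version replaces: each update is
    // a single compare-and-set with no transaction and no retry bookkeeping.
    public class FileSizeCountersSketch {
      private final AtomicLong pendingFileVisits = new AtomicLong(0);
      private final AtomicLong totalSize = new AtomicLong(0);

      long updatePendingFileVisits(final int value) {
        return pendingFileVisits.addAndGet(value);
      }

      void addToTotalSize(final long fileLength) {
        totalSize.addAndGet(fileLength);
      }
    }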

Analysis of the results: We expect the STM version to run into trouble, so at the first sign of an exception indicating the failure of a transaction, we terminate the application. If a value changes before a transaction commits, the transaction is retried automatically. Several threads compete to modify these two mutable variables, so the result may vary from slow-running code to outright failure. Go ahead and run the code on different directories. Here is the output on my system for the /etc and /usr directories:

Total file size for /etc:

    Total Size: 2266408
    Time taken: 0.537082

Total file size for /usr:

    Too many retries on transaction 'DefaultTransaction', maxRetries = 1000
    Too many retries on transaction 'DefaultTransaction', maxRetries = 1000
    Too many retries on transaction 'DefaultTransaction', maxRetries = 1000

The STM version reported the same file size for the /etc directory as the earlier version that used AtomicLong. However, the STM version was much slower, by about an order of magnitude, because of the many retries. Exploring the /usr directory turned out much worse: quite a few transactions exceeded the default maximum retry limit. Even though we ask the application to terminate at the first failure, several transactions are running concurrently, so we may see more failures before the first one gets a chance to shut the application down.

One of the reviewers asked whether using commute instead of alter helps. commute provides higher concurrency than alter, since it does not cause transactions to retry; instead, it performs the commits separately without holding up the calling transaction. For the file size program, however, commute helped only marginally, and for the large directory hierarchy it did not yield consistent results with good performance. We can also try an atom with the swap! method. Changes to an atom are uncoordinated and synchronous but don't require a transaction. An atom can be used only when we want to change a single variable (such as the total size in the file size example) and will not run into transactional retries; we will still see delays, though, because of the under-the-covers synchronization of changes to the atom.

If write conflicts are plentiful, use actors rather than STM. The file size program has a very high frequency of write conflicts, because many threads try to update the total size, so STM is not suitable for this problem. STM serves well, and removes the need to synchronize, when reads are highly frequent and write conflicts are infrequent to only reasonably frequent. If the problem involves enormous write collisions, which is rare given the other delays in typical applications, don't use STM; we can use actors instead and avoid synchronization altogether, as sketched below.
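As a rough illustration of why actors help here, the following single-writer sketch is built only on java.util.concurrent; the class SizeCollector and its methods are ours, not the book's actor solution. All updates to the running total are confined to one consumer thread that drains a queue of "messages", so writers never collide and the total needs neither a lock nor a transaction.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    // Actor-like single-writer sketch: explorer threads enqueue file sizes, and
    // only this consumer thread ever touches totalSize.
    public class SizeCollector implements Runnable {
      private final BlockingQueue<Long> fileSizes = new LinkedBlockingQueue<Long>();
      private long totalSize = 0;               // mutated only by the consumer thread
      private volatile boolean done = false;

      // Called by the directory-exploring threads; enqueuing is the only shared step.
      public void send(final long size) { fileSizes.offer(size); }

      // Called once all explorers have finished submitting sizes.
      public void finish() { done = true; }

      // Read after join()ing the consumer thread, which establishes happens-before.
      public long getTotalSize() { return totalSize; }

      public void run() {
        try {
          while (!done || !fileSizes.isEmpty()) {
            final Long size = fileSizes.poll(10, TimeUnit.MILLISECONDS);
            if (size != null) totalSize += size;   // plain, uncontended addition
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }

An explorer thread simply calls send(file.length()) instead of opening a transaction; the coordinating thread calls finish(), joins the collector thread, and then reads getTotalSize().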

Summary: STM is a very powerful model for concurrency, with quite a few benefits:

1. It provides the maximum concurrency dictated directly by the application's actual behavior; that is, instead of an overly conservative, predefined synchronization scheme, we let STM handle contention dynamically.
2. It provides an explicit lock-free programming model with good thread safety and high concurrent performance.
3. It ensures that identities are changed only within transactions.
4. The lack of explicit locks means we don't have to worry about lock ordering and related problems.
5. Having no explicit locks leads to deadlock-free concurrency.
6. It mitigates the need for up-front design decisions about who locks what; instead, we rely on dynamic, implicit lock composition.
7. The model is suitable for frequent concurrent reads with infrequent to reasonably frequent write collisions on the same data.

    STM provides an effective way to deal with shared mutability if the application data access fits that pattern. If we have huge write collisions, however, we may want to lean toward the actor-based model.

