Weinberg’s Second Law says that if builders built buildings the way programmers wrote programs, the first woodpecker that came along would destroy civilization. In practice this means that software has to be tested very carefully, so that bugs are found and fixed as early in its lifecycle as possible.
The rapid spread of multicore and multiprocessor systems pushes software developers to write parallel programs that use system resources as efficiently as possible. Another big hazard hides here: errors caused by incorrect use of multiple threads of execution. On big production servers the threat is much greater than it seems, because the code physically runs on different cores or processors, so concurrency issues show up in full force.
The best-known problems of this kind are deadlocks and data races. All concurrency issues are hard to detect manually or by testing, because they are nondeterministic by nature. Data races are particularly hard to detect, because their effects may become apparent much later. When a data race occurs, shared data is corrupted, but the application does not halt: it keeps working with incorrect data, and who knows when anyone will notice. Data races can be really dangerous. For example, races in the software of the Therac-25 radiation therapy machine led to massive radiation overdoses for six patients. A race was also one of the causes of the Northeast blackout of 2003, which affected more than 50 million people in the US and Canada.
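To make the problem concrete, here is a minimal sketch of a data race in Java. The class and method names (`RaceDemo`, `runUnsafe`, `runSafe`) are made up for this illustration and do not come from any of our projects. Several threads increment a plain `int` field without synchronization; the same work is then repeated with an `AtomicInteger` for comparison.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, written only for this illustration.
final class RaceDemo {
    // Shared counter with no synchronization: unsafeCount++ is a
    // read-modify-write, so concurrent increments can overwrite each other.
    static int unsafeCount;

    // The same counter made thread-safe with an atomic increment.
    static final AtomicInteger safeCount = new AtomicInteger();

    // Starts `threads` threads, each running `body` `perThread` times,
    // and waits for all of them to finish.
    static void runAll(int threads, int perThread, Runnable body) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    body.run();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Racy version: the result may be lower than threads * perThread.
    static int runUnsafe(int threads, int perThread) {
        unsafeCount = 0;
        runAll(threads, perThread, () -> unsafeCount++);
        return unsafeCount;
    }

    // Correct version: the result is always threads * perThread.
    static int runSafe(int threads, int perThread) {
        safeCount.set(0);
        runAll(threads, perThread, safeCount::incrementAndGet);
        return safeCount.get();
    }

    public static void main(String[] args) {
        System.out.println("expected: " + (4 * 1_000_000));
        System.out.println("unsafe:   " + runUnsafe(4, 1_000_000));
        System.out.println("safe:     " + runSafe(4, 1_000_000));
    }
}
```

The unsafe result varies from run to run and may even be correct now and then, which is exactly why such bugs slip through testing. Note that the program never crashes: it simply reports a wrong number.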
Now the good news: at Devexperts we make every effort to ensure the high quality of the industrial parallel systems we develop. In particular, we try to detect data races in our software during the development and testing phases. But how? A manual approach is impractical here, so we need automatic tools. There are two principal approaches to automatic race detection: static and dynamic. Static detectors analyze the application code without executing the application itself. They are good at finding certain inconsistencies, but the depth of their analysis is strictly limited for performance reasons. The best-known tool is FindBugs; the most powerful one, as our research has shown, is jChord, an open-source academic project that comes with a user guide and documentation and can be used to detect some concurrency issues at an early stage.
Dynamic detectors run alongside the application under test and analyze information about it on the fly. Unfortunately, there are very few industrial dynamic detectors for Java, because of the enormous overhead and the technical difficulty of developing such tools. We are aware of two attempts: ThreadSanitizer for Java and IBM MSDK. Neither of them can analyze even our small internal projects: they fail to launch, run out of memory or hang. At best they produce a large log file listing data races, the overwhelming majority of which are false alarms.
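To give a feel for what a dynamic detector does with the events it observes, here is a minimal sketch of the classic lockset idea: for every shared variable, track the set of locks that were held on every access so far, and report a suspected race once that set becomes empty and more than one thread has touched the variable. This is a textbook technique chosen for illustration, not the method used by DRD or by the other tools mentioned here. The `LocksetChecker` class and its string-based thread, lock and variable ids are assumptions of this sketch; a real detector would receive such events from bytecode instrumentation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical lockset checker fed with a simulated event stream.
final class LocksetChecker {
    // Locks currently held by each thread.
    private final Map<String, Set<String>> held = new HashMap<>();
    // Per variable: locks that were held on every access seen so far.
    private final Map<String, Set<String>> candidates = new HashMap<>();
    // Per variable: threads that have accessed it.
    private final Map<String, Set<String>> accessors = new HashMap<>();

    void acquire(String thread, String lock) {
        held.computeIfAbsent(thread, t -> new HashSet<>()).add(lock);
    }

    void release(String thread, String lock) {
        Set<String> locks = held.get(thread);
        if (locks != null) {
            locks.remove(lock);
        }
    }

    // Records an access and returns true if a race is suspected, i.e. no
    // single lock protected all accesses made by two or more threads.
    boolean access(String thread, String variable) {
        Set<String> current = held.getOrDefault(thread, Collections.emptySet());
        Set<String> candidate = candidates.get(variable);
        if (candidate == null) {
            // First access: every lock held right now is a candidate.
            candidate = new HashSet<>(current);
            candidates.put(variable, candidate);
        } else {
            // Later accesses: keep only locks held this time as well.
            candidate.retainAll(current);
        }
        Set<String> threads = accessors.computeIfAbsent(variable, v -> new HashSet<>());
        threads.add(thread);
        return threads.size() > 1 && candidate.isEmpty();
    }

    public static void main(String[] args) {
        LocksetChecker checker = new LocksetChecker();
        checker.acquire("T1", "L");
        checker.access("T1", "balance");   // protected by L
        checker.release("T1", "L");
        // T2 touches the same variable without holding any lock.
        System.out.println(checker.access("T2", "balance")); // prints "true"
    }
}
```

The sketch also hints at why false alarms are common: it flags any access pattern without a common lock, even when the threads are ordered by other means, such as starting one thread only after the other has finished.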
For these reasons we are developing our own dynamic Data Race Detector (DRD). This research has been going on for several years and lies at the intersection of programming and science. We have invented a new approach that reduces the overhead of dynamic analysis without losing accuracy or precision, and we have discussed it with the research community at several conferences in Russia. We have also developed the detector itself and tested it on synthetic and internal applications. A beta version is available now, and soon (approximately in Q1 2014) we are going to release a stable DRD 1.0 and open its source code to the community. The documentation, available here, will also be updated and published in English.
So, finally: what should I do to find data races in my software?
- Use static analyzers regularly. The easiest way is to use FindBugs, which is integrated into SonarQube, a well-known open-source quality management platform.
- Try IBM MSDK, ThreadSanitizer for Java or Data Race Detector in your Dev/QA environments. The last one is not yet stable, but it is very easy to use and may find some races.
- Follow our blog or send us an email: DRD 1.0 will be released soon, and we’ll let you know.
Any questions? We would be glad to answer!