Tech Transfer

Efficient Data Reduction Method with Locally Exchangeable Measures





Berkeley Lab researcher Alexander Sim and colleagues have developed a dynamic sampling algorithm that reduces the volume of large streaming data while still providing accurate information about the data for analysis. The Berkeley Lab technology could benefit network routers, for use in network monitoring mechanisms; facilities that generate large amounts of data, as a means to reduce data volume; and social networks, among other applications.

Large streaming data are an essential part of computational modeling and network communications. Yet such data are generally impractical to store, process, search, and retrieve in full. This dynamic data reduction algorithm detects redundant patterns and reduces data size by exploiting the exchangeability of measurements; it exploits redundancy both across time within a series and within the distribution of the data. The Berkeley Lab technology can be used for high-frequency streaming data as well as stored data.
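The general idea of exploiting exchangeability can be illustrated with a small sketch. This is not the Berkeley Lab algorithm, whose details are not given here; it is a simplified illustration under stated assumptions. It assumes that measurements within a short window are exchangeable (their order carries no information), so each window can be summarized by a histogram, and that consecutive windows with nearly the same histogram are redundant. The function names, the fixed window size, the bin width, and the use of total variation distance with a fixed threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import Iterable, List


def total_variation(p: Counter, q: Counter) -> float:
    """Total variation distance between two non-empty count histograms."""
    n_p, n_q = sum(p.values()), sum(q.values())
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in keys)


def reduce_stream(values: Iterable[float],
                  window: int = 100,
                  bin_width: float = 10.0,
                  threshold: float = 0.1) -> List[list]:
    """Summarize a stream as a list of [histogram, repeat_count] entries.

    Assumption: values inside one window are exchangeable, so only their
    binned counts are kept (redundancy in the data distribution). A window
    whose histogram is within `threshold` of the last stored histogram is
    recorded as a repeat instead of being stored again (redundancy in time).
    """
    stored: List[list] = []
    buf: List[float] = []

    def flush() -> None:
        # Order-independent summary of the current window.
        hist = Counter(int(v // bin_width) for v in buf)
        if stored and total_variation(stored[-1][0], hist) <= threshold:
            stored[-1][1] += 1          # redundant window: count it only
        else:
            stored.append([hist, 1])    # new pattern: store its histogram
        buf.clear()

    for v in values:
        buf.append(v)
        if len(buf) == window:
            flush()
    if buf:                             # summarize a trailing partial window
        flush()
    return stored
```

For example, a stream of 300 readings near 50, then 200 readings near 500, then 100 readings near 50 (with `window=100`) is stored as three histograms with repeat counts 3, 2, and 1, rather than 600 raw values. Comparing each window only with the most recently stored histogram keeps the sketch single-pass, at the cost of storing a pattern again when it recurs later.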

A common technique in network monitoring and other practices for reducing the size of collected measurements is to store a random sample, such as one out of every 1,000 network packets. The drawbacks of this approach are that it does not scale to high-frequency streaming data and that the sample is not guaranteed to reflect the underlying data distribution. Another method is to use exact or approximate data compression techniques, such as spectral analysis. However, current data compression methods require either the whole data set or data chunks of a designated size, which makes them impractical for large, high-frequency streaming data. Berkeley Lab’s algorithm resolves the drawbacks of both approaches.
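The random-sampling baseline described above can be sketched in a few lines. This is a minimal illustration, assuming each packet is kept independently with probability 1/n; the function name and parameters are hypothetical, and real monitoring tools may instead keep every n-th packet or use other sampling schemes.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def sample_one_in_n(stream: Iterable[T],
                    n: int = 1000,
                    seed: Optional[int] = None) -> Iterator[T]:
    """Yield each item of `stream` independently with probability 1/n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)           # private generator, reproducible with a seed
    for item in stream:
        if rng.randrange(n) == 0:       # true for 1 out of n draws on average
            yield item
```

Because every packet is kept or dropped independently of its content, a short burst of unusual packets (say, 50 large packets among 100,000 small ones) can easily be missed entirely by a 1-in-1,000 sample, which is one way to see why such a sample is not guaranteed to reflect the underlying distribution.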

DEVELOPMENT STAGE: Proven principle. Data reduction between 47% and 80% demonstrated in experiments. Potential for exponential-scale data reductions for streaming data. Prototype development is ongoing.

STATUS: Patent pending. Available for licensing or collaborative research.

Choi, J., Hu, K., and Sim, A. Relational Dynamic Bayesian Networks with Locally Exchangeable Measures. Computational Research Division, Lawrence Berkeley National Laboratory.


