Case Studies
Mapping Landforms on Mars
Challenge
The goal of this project was to develop a robust system for the automatic geomorphic mapping of land surfaces by fusing pattern recognition tools, including machine learning, computer vision, and data compression.
Solution
The project used topographic data as input for automatically identifying and mapping landforms on Mars. Relying on topography rather than the traditionally used imagery is what makes it possible to automate the mapping. Our approach yielded maps that closely resemble manually derived ones in both appearance and content. Machine learning techniques, specifically semi-supervised learning and meta-learning, were used to minimize expert intervention and maximize efficiency during map generation.




Failure Prediction - Oil and Gas
Challenge
In another project, we studied the problem of stuck pipes, which accounts for several million dollars' worth of non-productive time for oil & gas operators each year. Developing a method to predict these events in real time had become a high priority for the drilling industry. Modern sensor techniques and advanced data analysis tools now make it possible to predict stuck pipe events automatically.
Solution
We conducted a comprehensive study that used machine learning for the accurate prediction of stuck pipes. The variables used to monitor the drilling process included pore pressure, hook load, rate of penetration, weight on bit, and standpipe pressure, among others. Our experiments considered time-based data with warning windows of fifteen minutes, thirty minutes, and one hour. We built user interfaces that informed oil & gas operators of the likelihood of a stuck pipe.
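As a minimal sketch of how warning windows like these could be turned into training labels, the Python snippet below marks a sensor timestamp as positive when a stuck pipe event follows within the chosen window. The function name `label_warning_window`, the five-minute sampling interval, and the synthetic timestamps are assumptions made for illustration; they are not details of the study itself.

```python
from datetime import datetime, timedelta


def label_warning_window(timestamps, event_times, window):
    """Return 1 for each timestamp followed by an event within `window`, else 0."""
    labels = []
    for t in timestamps:
        # Positive if some event happens at or after t, no later than t + window.
        hit = any(timedelta(0) <= e - t <= window for e in event_times)
        labels.append(1 if hit else 0)
    return labels


# Hypothetical data: one reading every 5 minutes for an hour, event at minute 60.
start = datetime(2020, 1, 1, 0, 0)
timestamps = [start + timedelta(minutes=5 * i) for i in range(13)]
event_times = [start + timedelta(minutes=60)]

labels_15 = label_warning_window(timestamps, event_times, timedelta(minutes=15))
labels_30 = label_warning_window(timestamps, event_times, timedelta(minutes=30))
labels_60 = label_warning_window(timestamps, event_times, timedelta(hours=1))
```

A longer window labels more readings as positive, which gives operators earlier warnings but usually makes the prediction task harder.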


Identifying Hidden Patterns - Particle Physics
Challenge
Experiments in particle physics face the challenge of identifying elementary particles produced in high-energy colliders. Typical colliders have millions of electronic channels, producing vast amounts of data per second. These data are analyzed in real time and reduced to a few terabytes per day, which are stored for later analysis. Of the billion particle collisions occurring each second, only a few are interesting. Finding these interesting (but possibly unanticipated) collisions in such a massive data stream is a demanding test of cutting-edge technology and computational power.
Solution
We used machine learning techniques for classification. First, we trained a predictive model on simulated data (Monte Carlo simulations), which gave us a controlled setting for building the model. Then, we tested our model on real data using several classification techniques. Our final model can efficiently detect event signals with high confidence.
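The train-on-simulation, test-on-separate-data workflow can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: it uses a nearest-centroid classifier on two made-up features, and the "recorded" events are themselves randomly generated stand-ins with slightly shifted distributions. Neither the classifier nor the features are taken from the project.

```python
import random


def make_events(n, mean, rng):
    """Generate n hypothetical events with two Gaussian features around `mean`."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]


def centroid(points):
    """Average position of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def predict(point, signal_c, background_c):
    """Return 1 (signal) if the point is closer to the signal centroid, else 0."""
    return 1 if sq_dist(point, signal_c) < sq_dist(point, background_c) else 0


rng = random.Random(0)

# Step 1: fit the model on simulated (Monte Carlo style) events.
signal_c = centroid(make_events(500, 2.0, rng))
background_c = centroid(make_events(500, 0.0, rng))

# Step 2: evaluate on a separate set standing in for recorded data.
# The shifted means mimic a mismatch between simulation and measurement.
held_signal = make_events(500, 1.8, rng)
held_background = make_events(500, 0.2, rng)

correct = sum(predict(p, signal_c, background_c) == 1 for p in held_signal)
correct += sum(predict(p, signal_c, background_c) == 0 for p in held_background)
accuracy = correct / (len(held_signal) + len(held_background))
```

Evaluating on data the model never saw during training is what reveals how well a model built in the controlled setting carries over.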


Predicting Computer Performance - Time Series Analysis
Challenge
This project aimed to predict service problems in computer networks and respond to those predictions by applying corrective actions. This is important because detecting system failures on a few servers can prevent the spread of those failures across the entire network. Additionally, prediction can be used to ensure the continuous provision of network services through the automatic implementation of corrective actions.
Solution
We performed experiments using a central database covering thousands of computers. We generated predictions for six important parameters: response time, maximum response time, CPU utilization, memory utilization, disk utilization, and disk arm utilization. For all six performance variables under study, applying the learning approach yielded significant gains in accuracy.
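To illustrate the general idea of forecasting a performance variable and reacting to the forecast, here is a minimal Python sketch that uses simple exponential smoothing. This is a swapped-in textbook technique, not the learning approach used in the project. The function name `forecast_next`, the smoothing factor, the sample CPU readings, and the 60% threshold are all hypothetical.

```python
def forecast_next(series, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing."""
    level = series[0]
    for x in series[1:]:
        # Blend the newest observation with the running level.
        level = alpha * x + (1 - alpha) * level
    return level


# Hypothetical CPU utilization readings (percent) for one server.
cpu_utilization = [40.0, 60.0, 80.0]
predicted = forecast_next(cpu_utilization)

# Hypothetical rule: schedule a corrective action if the forecast exceeds 60%.
THRESHOLD = 60.0
needs_action = predicted > THRESHOLD
```

The same pattern could be applied to each monitored variable in turn, with a separate threshold per variable.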

