  Canada's magazine for data-driven, multi-channel interactive marketers.

The Art & Science
of Predictable Marketing T

Direct Marketing Current Issue

Geodemographics & Profiling

Marketing at the Edge
What's possible now in data mining. By Kurt Thearling

Over the past decade, the term “data mining” has entered our common vocabulary. Everyone seems to understand that large volumes of data can be analyzed or “mined” to find patterns in that data. The discovered patterns might reveal a link between a genetic marker and a disease, or a specific customer demographic with a propensity for responding to a new marketing campaign. The success of data mining hinges on finding persistent patterns that lead to valuable insights.

I’d like to discuss two trends that I see having an impact on data mining for cutting-edge marketing applications. The first trend focuses on the data part of data mining. The second trend involves the extraction of structure from unstructured data.

Trend 1: Collecting Better Data
Data mining works only when you have a rich and varied data set to feed into the analysis. Until recently, companies primarily collected data that was naturally generated during a transaction. For example, the purchase of a CD player in a department store would generate information about the transaction, such as price, promotions, and manufacturer, as well as information about the purchaser. This is a reasonable approach to data collection and it works well, but it only tells you about situations that have already occurred. What is missing is data about purchases that were not made, either because the customer wasn’t interested or because they did not know about the other product. The data for those potential transactions doesn’t exist and can’t be analyzed.

More recently, the process of data mining has focused on getting richer data into the analysis process. Enriching the data means including more data points that cover the range of possible variable combinations. To get that variation, data mining needs to experiment with the circumstances in which the data was generated to maximize the value of the analysis. Early forms of such data experimentation go back as far as the eighteenth century, when researchers collected and analyzed medical data. In 1935, the scientist Ronald Fisher developed the mathematical foundation for data analysis in his book The Design of Experiments, which led to significant discoveries in medicine, agriculture, and manufacturing.

But things changed in the late 1990s. The scale at which experimental data could be collected efficiently and inexpensively, in particular for marketing applications, grew dramatically. Prior to the late 1990s, a large-scale agricultural test might involve only tens (or possibly hundreds) of experimental variations. Medical clinical trials might be limited to several hundred patients. That is not to say that large-scale experiments were not run. Database marketing campaigns involving mailings to millions of prospective customers were conducted, but they took a great deal of effort to design and carry out, and their significant costs were proportional to the size of the experiment. Consider a test of a new cell phone marketing campaign that mails two different versions of an offer to one million prospects (not unreasonable if you assume a low response rate). The printing and postage costs alone would have been considerable.

Compare that with what a post-1990s e-commerce site can now do to experiment with a new marketing campaign. If the site is popular (think Amazon.com or Buy.com), it could get thousands of hits in a few minutes. The site could try out a new colour scheme or a new product promotion and very quickly collect data on customer responses. In fact, every interaction that the company has with a customer on its Web site could be an experiment in which product offers are constantly varied. Managing this process can be complicated, but a variety of software packages can now automate the experimentation. Such software allows you to simply choose the characteristics you would like to experiment with, and the system will automate the process of varying the Web site and collecting the data.

Once the experimental data is collected, it can be mined to find patterns. For example, you might want to know whether people who bought consumer electronics products in the past month respond differently to a promotion than people who have not bought electronics but have bought clothing. The point of experimentation is to collect more data across the space of possible patterns. For example, the experiment could vary the banner colour to see if it affects purchase behaviour, or adjust the number of related offers displayed to see if there is any effect.
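
As a rough illustration of the kind of analysis described above, the following Python sketch assigns visitors to banner variants and compares purchase rates by customer segment. The variant names, segment labels, and sample events are all made up for illustration, and hashing the visitor ID is just one assumed way to split traffic; commercial experimentation software may work quite differently.

```python
import hashlib
from collections import defaultdict

# Hypothetical banner variants for the experiment.
VARIANTS = ("blue_banner", "green_banner")


def assign_variant(visitor_id: str) -> str:
    """Map a visitor to a variant using a hash of the ID, so the same
    visitor always sees the same version of the page."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]


def response_rates(events):
    """Compute the purchase rate for each (segment, variant) pair.

    events: iterable of (segment, variant, purchased) tuples.
    Returns a dict mapping (segment, variant) to the share that purchased.
    """
    shown = defaultdict(int)
    bought = defaultdict(int)
    for segment, variant, purchased in events:
        key = (segment, variant)
        shown[key] += 1
        if purchased:
            bought[key] += 1
    return {key: bought[key] / shown[key] for key in shown}


# Invented observations, far too few to support any real conclusion.
sample_events = [
    ("electronics", "blue_banner", True),
    ("electronics", "blue_banner", False),
    ("electronics", "green_banner", False),
    ("electronics", "green_banner", False),
    ("clothing", "blue_banner", False),
    ("clothing", "green_banner", True),
]

rates = response_rates(sample_events)
for key in sorted(rates):
    print(key, rates[key])
```

With real traffic, each cell would need enough visitors for the difference between variants to be statistically meaningful before acting on it.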

Another example of using large-scale experiments to collect data for mining occurs in search advertising. Search engines such as Google, Yahoo, and MSN operate by letting advertisers bid on search terms, with the winning bids generating advertisement placements within the search results page. For example, if you are selling automobile tires, you might want to bid on the search terms “car tires.” The hope is that people who are searching for car tires might be looking to buy tires, and would click on the advertisement.

But what are the best terms to use, and what amount should you bid to optimize your value? Again, this is where experimentation can be used. An advertiser can build up a collection of words related to the topic (car, auto, automobile, tire, wheel, etc.) and test the numerous combinations of those words. These experiments can grow significantly, eventually covering millions of keywords. Here too, software systems step in to manage the complexity of the process.
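
To give a feel for how quickly the candidate list grows, here is a minimal Python sketch that builds search phrases from the seed words mentioned above. The function name and the choice to cap phrases at a maximum number of words are assumptions made for illustration, not a description of how any particular bidding tool works.

```python
from itertools import combinations


def candidate_phrases(words, max_words=2):
    """Build every phrase of 1..max_words distinct seed words,
    keeping the words in their original list order."""
    phrases = []
    for size in range(1, max_words + 1):
        for combo in combinations(words, size):
            phrases.append(" ".join(combo))
    return phrases


seed_words = ["car", "auto", "automobile", "tire", "wheel"]

two_word = candidate_phrases(seed_words, max_words=2)
three_word = candidate_phrases(seed_words, max_words=3)

print(len(two_word), "phrases of up to two words")
print(len(three_word), "phrases of up to three words")
```

Even this tiny seed list multiplies quickly as longer phrases are allowed; adding word order, plurals, and a larger vocabulary pushes the count up much faster, which is why the process is usually left to software.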

Trend 2: Finding Structure in Data
The second trend that I want to discuss relates to the use of unstructured data in data mining. In the past, data mining has most often been used to identify patterns in data that contains built-in structure, typically numbers and categories. In the case of numbers, there is an inherent order, and differences are well defined. Textual data is generally thought of as the prototypical form of unstructured data. While that isn’t strictly true (most languages have an established, but usually complex, structure), traditional data mining techniques for numbers are not a good match for text.

The trend of using unstructured data in data mining is not simply about being able to analyze data with complex or unknown structures; it is actually about being able to extract new structure from the data. Analyzing data about automobile insurance claims might identify patterns showing that certain types of claims are more likely to be fraudulent. It might be more interesting to discover patterns showing that certain people are more likely to be part of a network of individuals responsible for a disproportionate number of fraudulent claims. For example, a particular doctor or lawyer might be part of a fraud ring, or the same phone number might be used by a number of people filing fraudulent claims. Instead of simply being part of one data point, these patterns are derived by analyzing relationships between the data points. A relationship pattern can vary in strength depending on the distance of the relationship as well as the accuracy of the prediction. In the end, the pattern might be a complex network of relationships, such as organized crime and the networks of individuals that make up a criminal organization.

More relevant to marketing might be an analysis of data from an online social network. For example, a network of Star Wars movie fans might be interested in purchasing a DVD of the most recent series spin-off. Combine that with a network of parents with elementary-school-age children and those same people might be interested in purchasing Star Wars Lego™ kits (for their children, of course).
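
The insurance example, in which claims are tied together by a shared phone number or doctor, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the claim records, attribute names, and the find_rings helper are all hypothetical, and the grouping uses a simple union-find over shared attribute values rather than the methods of any commercial product.

```python
from collections import defaultdict


def find_rings(claims):
    """Group claims linked, directly or indirectly, by a shared attribute.

    claims: dict mapping claim ID to a dict of attribute name -> value.
    Returns a list of sets of claim IDs, one per group of two or more.
    """
    parent = {claim_id: claim_id for claim_id in claims}

    def find(x):
        # Follow parent links to the group's root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index claims by each (attribute, value) pair they contain.
    by_value = defaultdict(list)
    for claim_id, attrs in claims.items():
        for name, value in attrs.items():
            by_value[(name, value)].append(claim_id)

    # Claims that share a value belong to the same group.
    for linked in by_value.values():
        for other in linked[1:]:
            union(linked[0], other)

    groups = defaultdict(set)
    for claim_id in claims:
        groups[find(claim_id)].add(claim_id)
    return [members for members in groups.values() if len(members) > 1]


# Invented claims: C1 and C2 share a phone number, C2 and C3 share a doctor.
claims = {
    "C1": {"phone": "555-0101", "doctor": "Dr. A"},
    "C2": {"phone": "555-0101", "doctor": "Dr. B"},
    "C3": {"phone": "555-0199", "doctor": "Dr. B"},
    "C4": {"phone": "555-0142", "doctor": "Dr. C"},
}

rings = find_rings(claims)
print(rings)
```

Here C1 and C3 end up in the same group even though they share nothing directly, which is the kind of relationship that cannot be seen by looking at one data point at a time. A shared doctor or phone number is of course not evidence of fraud on its own; a real analysis would weigh the strength of each link.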

While the analysis to extract these structural relationships is more complex than the usual data mining models, there is now software on the market that enables this kind of analysis. This software can explore the multitude of possible relationships between data points, picking out the strongest relationships.

Once you have built a basic data mining capability, implementing a process for collecting better data and finding structure in data will allow you to take your analysis to the next level: to extract more useful and valuable patterns from your data. First, make sure that you have the richest data possible for your analysis. Constantly experiment, collecting data that might not naturally occur. Then, when you have your data, don’t limit yourself to looking for patterns with traditional data mining techniques. Instead, take your analysis further, and look for patterns in how the data is structured. These new trends and techniques will take your marketing strategies to the edge.

Kurt Thearling is vice president of Strategic Technology at Capital One. He has spent the past eighteen years developing data mining and data analysis systems, applying this work to problems in financial services, life sciences, insurance, and telecommunications. He has written numerous articles on the topic of data mining and is a co-author of the book “Building Data Mining Applications for CRM.” His extensive data mining and analytics Web site can be found at www.thearling.com.

© 2008-2009 Lloydmedia, Inc.
Formerly Direct Marketing News
302-137 Main Street North
Markham, ON L3P 1Y2 Canada
905-201-6600/1-800-668-1838