Mine the Predictive Power of Data
Learn how to use the built-in functionality of SQL Server to make your business processes more efficient with data mining.
by Sara Rea

February 10, 2005

Technology Toolbox: VB.NET, SQL Server 2000, Analysis Services

Editor’s Note: Portions of this article have been adapted from Chapter 5, “Data-Mining Predictions,” and Chapter 6, “Applying Data-Mining Predictions,” of Sara Rea’s book, Building Intelligent .NET Applications: Agents, Data Mining, Rule-Based Systems, and Speech Processing, with permission from Addison-Wesley Professional [2005, ISBN: 0321246268]. © 2005 Pearson Education Inc. Publishing as Addison-Wesley. For more information, go to www.awprofessional.com/title/0321246268. To read Chapter 5 in its entirety, go to ftponline.com/books/chapters/default.asp?isbn=0321246268.

Data mining is the process of extracting meaningful information from large quantities of data. It involves uncovering patterns in the data and is often tied to data warehousing because it attempts to make large amounts of data usable.

Data elements fall into distinct categories; these categories enable you to make predictions about other pieces of data. For example, a bank might wish to ascertain the characteristics that typify customers who pay back loans. Although this could be done with database queries, the bank would first have to know which customer attributes to query for. Banks can use data mining to identify what those attributes are and then make predictions about future customer behavior.

One of the more difficult aspects of data mining has always been translating the theory into practice. Practical examples can be hard to come by, especially the code that backs them up. I’ll show you how a fictional retailer named Savings Mart could use Microsoft’s Analysis Services, included with Microsoft SQL Server 2000, to improve operational efficiencies and reduce costs. I’ll explain the techniques required to implement and refine data mining, as well as provide the code you need to test this out.

You’ll be able to use the techniques described and the code provided to create a mining model that lets you predict the right number of shipments to each store. The mining model is the first step toward revising the way Savings Mart handles product orders and shipments, with the end goal of reducing operating costs.

You can implement data mining with Analysis Services using one of two popular mining algorithms: decision trees and clustering. You use these algorithms to find meaningful patterns in a group of data, then make predictions about the data. Decision trees are useful for predicting exact outcomes. For example, you can apply the decision trees algorithm to a training data set to create a tree that allows the user to map a path to a successful outcome. At every node along the tree, the user answers a question (or makes a “decision”), such as “years applicant has been at current job (0–1, 1–5, >5 years).” I’ll teach you how to use the decision trees algorithm to build a mining model based on shipment dates.
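To make the idea of walking a tree concrete, here is a minimal sketch in Python of a hand-built decision tree for the loan example. It is not the Analysis Services algorithm, which learns the tree from training data; the tree below is written by hand purely for illustration. The `owns_home` attribute, the outcome labels, and the bucket boundaries (up to 1 year, up to 5 years, more than 5) are assumptions made for this sketch and do not come from any real data set.

```python
# Illustrative only: a hand-written tree, not one learned by Analysis Services.
# "owns_home" and the outcome labels are hypothetical.

def years_bucket(applicant):
    """Answer the question 'years applicant has been at current job'."""
    years = applicant["years_at_job"]
    if years <= 1:
        return "0-1"
    if years <= 5:
        return "1-5"
    return ">5"


def home_owner(applicant):
    """Answer a second, hypothetical question about home ownership."""
    return "yes" if applicant["owns_home"] else "no"


# Each internal node asks a question; each branch leads to another node
# or to a leaf (a plain string holding the predicted outcome).
TREE = {
    "ask": years_bucket,
    "branches": {
        "0-1": "higher risk",
        "1-5": {
            "ask": home_owner,
            "branches": {"yes": "lower risk", "no": "higher risk"},
        },
        ">5": "lower risk",
    },
}


def classify(node, applicant):
    """Follow the answers from the root to a leaf; return (outcome, path)."""
    path = []
    while isinstance(node, dict):
        answer = node["ask"](applicant)
        path.append(answer)
        node = node["branches"][answer]
    return node, path


if __name__ == "__main__":
    outcome, path = classify(TREE, {"years_at_job": 3, "owns_home": True})
    print(outcome, path)
```

The returned path records the answer given at each node, which is the "map a path to an outcome" behavior described above. A mining algorithm differs in that it chooses the questions and branch points itself, based on the training data.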

Savings Mart is a fictitious discount retailer operating in a single American state. It has been in business since 2001 and hopes to open new stores by achieving greater operational efficiencies. Since its inception, Savings Mart has relied on a system of adjusting product inventory thresholds to determine when shipments will be made to stores. Every time someone purchases a product, the company’s business software updates the quantity for that product and store. When the quantity dips below the minimum threshold allowed for that product and store, the software automatically generates an order that is delivered three days later.
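The threshold rule just described can be sketched in a few lines of Python. This is only an approximation of the behavior described, not Savings Mart's actual software: the class and function names are hypothetical, and the sketch assumes that an order is generated once, at the moment the quantity crosses below the threshold, rather than on every later purchase.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Orders are delivered three days after they are generated.
LEAD_TIME = timedelta(days=3)


@dataclass
class StockLevel:
    """Quantity on hand for one product at one store (hypothetical shape)."""
    store_id: int
    product_id: int
    quantity: int
    min_threshold: int


@dataclass
class Order:
    """An automatically generated replenishment order (hypothetical shape)."""
    store_id: int
    product_id: int
    deliver_on: date


def record_purchase(stock: StockLevel, units: int, today: date) -> Optional[Order]:
    """Reduce the quantity and generate an order if it dips below the minimum.

    Assumption: an order is created only when the quantity crosses the
    threshold, so a product already below it does not trigger repeat orders.
    """
    was_at_or_above = stock.quantity >= stock.min_threshold
    stock.quantity -= units
    if was_at_or_above and stock.quantity < stock.min_threshold:
        return Order(stock.store_id, stock.product_id, today + LEAD_TIME)
    return None


if __name__ == "__main__":
    stock = StockLevel(store_id=1, product_id=100, quantity=10, min_threshold=8)
    print(record_purchase(stock, 1, date(2005, 2, 7)))  # still at or above 8
    print(record_purchase(stock, 2, date(2005, 2, 7)))  # dips to 7, order created
```

Because each product and store pair triggers its own order independently, a store carrying many products can cross some threshold on almost any given day.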

This process might seem like a good way to ensure that the stores are well stocked, but it results in shipments being made to each store almost every day. This means high overhead costs for each store. Management wants to replace the order/shipment strategy with a system designed to predict shipment dates rather than rely on adjustable thresholds.



