Archive | Data Mining

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)

Posted on 22 June 2009 by admin

Author : Trevor Hastie

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book.

This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression & path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates.

Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

Price & availability
List Price : $89.95 , Available from Amazon.com for $71.96

Amazon Link : The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)

High Performance MySQL: Optimization, Backups, Replication, and More

Posted on 22 June 2009 by admin

Author : Baron Schwartz
High Performance MySQL is the definitive guide to building fast, reliable systems with MySQL. Written by noted experts with years of real-world experience building very large systems, this book covers every aspect of MySQL performance in detail, and focuses on robustness, security, and data integrity. High Performance MySQL teaches you advanced techniques in depth so you can bring out MySQL’s full power. Learn how to design schemas, indexes, queries and advanced MySQL features for maximum performance, and get detailed guidance for tuning your MySQL server, operating system, and hardware to their fullest potential. You’ll also learn practical, safe, high-performance ways to scale your applications with replication, load balancing, high availability, and failover. This second edition is completely revised and greatly expanded, with deeper coverage in all areas. Major additions include:
  • Emphasis throughout on both performance and reliability
  • Thorough coverage of storage engines, including in-depth tuning and optimizations for the InnoDB storage engine
  • Effects of new features in MySQL 5.0 and 5.1, including stored procedures, partitioned databases, triggers, and views
  • A detailed discussion on how to build very large, highly scalable systems with MySQL
  • New options for backups and replication
  • Optimization of advanced querying features, such as full-text searches
  • Four new appendices

The book also includes chapters on benchmarking, profiling, backups, security, and tools and techniques to help you measure, monitor, and manage your MySQL installations.

Price & availability
List Price : $49.99 , Available from Amazon.com for $31.49

Amazon Link : High Performance MySQL: Optimization, Backups, Replication, and More

Hadoop: The Definitive Guide

Posted on 22 June 2009 by admin

Author : Tom White
Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:

  • Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce
  • Become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence
  • Discover common pitfalls and advanced features for writing real-world MapReduce programs
  • Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
  • Use Pig, a high-level query language for large-scale data processing
  • Take advantage of HBase, Hadoop’s database for structured and semi-structured data
  • Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems

If you have lots of data – whether it’s gigabytes or petabytes – Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. “Now you have the opportunity to learn about Hadoop from a master – not only of the technology, but also of common sense and plain talk.” – Doug Cutting, Hadoop Founder, Yahoo!

Price & availability
List Price : $44.99 , Available from Amazon.com for $29.69

Amazon Link : Hadoop: The Definitive Guide

Learning SQL

Posted on 22 June 2009 by admin

Author : Alan Beaulieu
Updated for the latest database management systems — including MySQL 6.0, Oracle 11g, and Microsoft’s SQL Server 2008 — this introductory guide will get you up and running with SQL quickly. Whether you need to write database applications, perform administrative tasks, or generate reports, Learning SQL, Second Edition, will help you easily master all the SQL fundamentals. Each chapter presents a self-contained lesson on a key SQL concept or technique, with numerous illustrations and annotated examples. Exercises at the end of each chapter let you practice the skills you learn. With this book, you will:

  • Move quickly through SQL basics and learn several advanced features
  • Use SQL data statements to generate, manipulate, and retrieve data
  • Create database objects, such as tables, indexes, and constraints, using SQL schema statements
  • Learn how data sets interact with queries, and understand the importance of subqueries
  • Convert and manipulate data with SQL’s built-in functions, and use conditional logic in data statements

Knowledge of SQL is a must for interacting with data. With Learning SQL, you’ll quickly learn how to put the power and flexibility of this language to work.

Price & availability
List Price : $39.99 , Available from Amazon.com for $26.39

Amazon Link : Learning SQL

Head First SQL: Your Brain on SQL — A Learner’s Guide

Posted on 22 June 2009 by admin

Author : Lynn Beighley
Is your data dragging you down? Are your tables all tangled up? Well, we’ve got the tools to teach you just how to wrangle your databases into submission. Using the latest research in neurobiology, cognitive science, and learning theory to craft a multi-sensory SQL learning experience, Head First SQL has a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

Maybe you’ve written some simple SQL queries to interact with databases. But now you want more: you want to really dig into those databases and work with your data. Head First SQL will show you the fundamentals of SQL and how to really take advantage of it. We’ll take you on a journey through the language, from basic INSERT statements and SELECT queries to hardcore database manipulation with indices, joins, and transactions. We all know “Data is Power” – but we’ll show you how to have “Power over your Data”. Expect to have fun, expect to learn, and expect to be querying, normalizing, and joining your data like a pro by the time you’re finished reading!

Price & availability
List Price : $44.99 , Available from Amazon.com for $29.69

Amazon Link : Head First SQL: Your Brain on SQL — A Learner’s Guide

Web Analytics: An Hour a Day

Posted on 22 June 2009 by admin

Author : Avinash Kaushik
Written by an in-the-trenches practitioner, this step-by-step guide shows you how to implement a successful Web analytics strategy. Web analytics expert Avinash Kaushik, in his thought-provoking style, debunks leading myths and leads you on a path to gaining actionable insights from your analytics efforts. Discover how to move beyond clickstream analysis, why qualitative data should be your focus, and more insights and techniques that will help you develop a customer-centric mindset without sacrificing your company’s bottom line.

Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.

Price & availability
List Price : $29.99 , Available from Amazon.com for $18.89

Amazon Link : Web Analytics: An Hour a Day

Advanced Web Metrics with Google Analytics

Posted on 22 June 2009 by admin

Author : Brian Clifton
Are you getting the most out of your website? Google insider and web metrics expert Brian Clifton reveals the information you need to get a true picture of your site’s impact and stay competitive using Google Analytics (GA) and the latest web metrics methodologies. Which marketing campaigns work best? How do you quantify their success? What indicators should you track? Packed with techniques and insider secrets not documented elsewhere, this book has the expert guidance you need to enhance your brand and increase your site’s ROI.

Price & availability
List Price : $39.99 , Available from Amazon.com for $25.19

Amazon Link : Advanced Web Metrics with Google Analytics

FileMaker Pro 9: The Missing Manual

Posted on 22 June 2009 by admin

Author : Geoff Coffey
FileMaker Pro 9: The Missing Manual is the clear, thorough and accessible guide to this popular desktop database program. FileMaker Pro lets you do almost anything with the information you give it — you can print corporate reports, plan your retirement, or run a small country. This Missing Manual helps non-technical folks like you get in, get your database built, and get the results you need. Pronto.

Price & availability
List Price : $27.99 , Available from Amazon.com for $15.39

Amazon Link : FileMaker Pro 9: The Missing Manual

Competing on Analytics: The New Science of Winning

Posted on 22 June 2009 by admin

Author : Thomas H. Davenport
You have more information at hand about your business environment than ever before. But are you using it to “out-think” your rivals? If not, you may be missing out on a potent competitive tool. In “Competing on Analytics: The New Science of Winning,” Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon is analytics: sophisticated quantitative and statistical analysis and predictive modeling. Exemplars of analytics are using new tools to identify their most profitable customers and offer them the right price, to accelerate product innovation, to optimize supply chains, and to identify the true drivers of financial performance. A wealth of examples – from organizations as diverse as Amazon, Barclays, Capital One, Harrah’s, Procter & Gamble, Wachovia, and the Boston Red Sox – illuminate how to leverage the power of analytics.

Price & availability
List Price : $29.95 , Available from Amazon.com for $19.77

Amazon Link : Competing on Analytics: The New Science of Winning

SQL Cookbook (Cookbooks (O’Reilly))

Posted on 22 June 2009 by admin

Author : Anthony Molinaro
You know the rudiments of the SQL query language, yet you feel you aren’t taking full advantage of SQL’s expressive power. You’d like to learn how to do more work with SQL inside the database before pushing data across the network to your applications. You’d like to take your SQL skills to the next level.

Let’s face it, SQL is a deceptively simple language to learn, and many database developers never go far beyond the simple statement: SELECT FROM WHERE. But there is so much more you can do with the language. In the “SQL Cookbook,” experienced SQL developer Anthony Molinaro shares his favorite SQL techniques and features. You’ll learn about:

  • Window functions, arguably the most significant enhancement to SQL in the past decade. If you’re not using these, you’re missing out
  • Powerful, database-specific features such as SQL Server’s PIVOT and UNPIVOT operators, Oracle’s MODEL clause, and PostgreSQL’s very useful GENERATE_SERIES function
  • Pivoting rows into columns, reverse-pivoting columns into rows, using pivoting to facilitate inter-row calculations, and double-pivoting a result set
  • “Bucketization,” and why you should never use that term in Brooklyn
  • How to create histograms, summarize data into buckets, perform aggregations over a moving range of values, generate running totals and subtotals, and other advanced data warehousing techniques
  • The technique of “walking a string,” which allows you to use SQL to parse through the characters, words, or delimited elements of a string

Written in O’Reilly’s popular Problem/Solution/Discussion style, the “SQL Cookbook” is sure to please. Anthony’s credo is: “When it comes down to it, we all go to work, we all have bills to pay, and we all want to go home at a reasonable time and enjoy what’s still available of our days.” The “SQL Cookbook” moves quickly from problem to solution, saving you time each step of the way.

Price & availability
List Price : $39.95 , Available from Amazon.com for $26.37

Amazon Link : SQL Cookbook (Cookbooks (O’Reilly))
