CITATION

Zikopoulos, Paul; deRoos, Dirk; Parasuraman, Krishnan; Deutsch, Thomas; Giles, James; and Corrigan, David. Harness the Power of Big Data: The IBM Big Data Platform. McGraw-Hill Osborne Media, 2012.

Harness the Power of Big Data: The IBM Big Data Platform

Published: November 2012

eISBN: 9780071808187 0071808183 | ISBN: 9780071808170
  • Cover
  • About the Authors
  • Title Page
  • Copyright Page
  • Contents
  • Foreword
  • Preface
  • Acknowledgments
  • About This Book
  • Part I: The Big Deal About Big Data
  • 1 What Is Big Data?
  • Why Is Big Data Important?
  • Now, the "What Is Big Data?" Part
  • Brought to You by the Letter V: How We Define Big Data
  • What About My Data Warehouse in a Big Data World?
  • Wrapping It Up
  • 2 Applying Big Data to Business Problems: A Sampling of Use Cases
  • When to Consider a Big Data Solution
  • Before We Start: Big Data, Jigsaw Puzzles, and Insight
  • Big Data Use Cases: Patterns for Big Data Deployment
  • You Spent the Money to Instrument It - Now Exploit It!
  • IT for IT: Data Center, Machine Data, and Log Analytics
  • What, Why, and Who? Social Media Analytics
  • Understanding Customer Sentiment
  • Social Media Techniques Make the World Your Oyster
  • Customer State: Or, Don't Try to Upsell Me When I Am Mad
  • Fraud Detection: "Who Buys an Engagement Ring at 4 A.M.?"
  • Liquidity and Risk: Moving from Aggregate to Individual
  • Wrapping It Up
  • 3 Boost Your Big Data IQ: The IBM Big Data Platform
  • The New Era of Analytics
  • Key Considerations for the Analytic Enterprise
  • The Big Data Platform Manifesto
  • IBM's Strategy for Big Data and Analytics
  • 1. Sustained Investments in Research and Acquisitions
  • 2. Strong Commitment to Open Source Efforts and a Fostering of Ecosystem Development
  • 3. Support Multiple Entry Points to Big Data
  • A Flexible, Platform-Based Approach to Big Data
  • Wrapping It Up
  • Part II: Analytics for Big Data at Rest
  • 4 A Big Data Platform for High-Performance Deep Analytics: IBM PureData Systems
  • Netezza's Design Principles
  • Appliance Simplicity: Minimize the Human Effort
  • Hardware Acceleration: Process Analytics Close to the Data Store
  • Balanced, Massively Parallel Architecture: Deliver Linear Scalability
  • Modular Design: Support Flexible Configurations and Extreme Scalability
  • What's in the Box? The Netezza Appliance Architecture Overview
  • A Look Inside the Netezza Appliance
  • The Secret Sauce: FPGA-Assisted Analytics
  • Query Orchestration in Netezza
  • Platform for Advanced Analytics
  • Extending the Netezza Analytics Platform with Hadoop
  • Customers' Success Stories: The Netezza Experience
  • T-Mobile: Delivering Extreme Performance with Simplicity at the Petabyte Scale
  • State University of New York: Using Analytics to Help Find a Cure for Multiple Sclerosis
  • NYSE Euronext: Reducing Data Latency and Enabling Rapid Ad-Hoc Searches
  • 5 IBM's Enterprise Hadoop: InfoSphere BigInsights
  • What the Hadoop!
  • Where Elephants Come From: The History of Hadoop
  • Components of Hadoop and Related Projects
  • Hadoop 2.0
  • What's in the Box: The Components of InfoSphere BigInsights
  • Hadoop Components Included in InfoSphere BigInsights 2.0
  • The BigInsights Web Console
  • The BigInsights Development Tools
  • BigInsights Editions: Basic and Advanced
  • Deploying BigInsights
  • Ease of Use: A Simple Installation Process
  • A Low-Cost Way to Get Started: Running BigInsights on the Cloud
  • Higher-Class Hardware: IBM PowerLinux Solution for Big Data
  • Cloudera Support
  • Analytics: Exploration, Development, and Deployment
  • Advanced Text Analytics Toolkit
  • Machine Learning for the Masses: Deep Statistical Analysis on BigInsights
  • Analytic Accelerators: Finding Needles in Haystacks of Needles?
  • Apps for the Masses: Easy Deployment and Execution of Custom Applications
  • Data Discovery and Visualization: BigSheets
  • The BigInsights Development Environment
  • The BigInsights Application Lifecycle
  • Data Integration
  • The Analytics-Based IBM PureData Systems and DB2
  • JDBC Module
  • InfoSphere Streams for Data in Motion
  • InfoSphere DataStage
  • Operational Excellence
  • Securing the Cluster
  • Monitoring All Aspects of Your Cluster
  • Compression
  • Improved Workload Scheduling: Intelligent Scheduler
  • Adaptive MapReduce
  • A Flexible File System for Hadoop: GPFS-FPO
  • Wrapping It Up
  • Part III: Analytics for Big Data in Motion
  • 6 Real-Time Analytical Processing with InfoSphere Streams
  • The Basics: InfoSphere Streams
  • How InfoSphere Streams Works
  • What's a Lowercase "stream"?
  • Programming Streams Made Easy
  • The Streams Processing Language
  • Source and Sink Adapters
  • Operators
  • Streams Toolkits
  • Enterprise Class
  • High Availability
  • Integration Is the Apex of Enterprise Class Analysis
  • Industry Use Cases for InfoSphere Streams
  • Telecommunications
  • Enforcement, Defense, Surveillance, and Cyber Security
  • Financial Services Sector
  • Health and Life Sciences
  • And the Rest We Can't Fit in This Book ...
  • Wrapping It Up
  • Part IV: Unlocking Big Data
  • 7 If Data Is the New Oil - You Need Data Exploration and Discovery
  • Indexing Data from Multiple Sources with InfoSphere Data Explorer
  • Connector Framework
  • The Data Explorer Processing Layer
  • User Management Layer
  • Beefing Up InfoSphere BigInsights
  • An App with a View: Creating Information Dashboards with InfoSphere Data Explorer Application Builder
  • Wrapping It Up: Data Explorer Unlocks Big Data
  • Part V: Big Data Analytic Accelerators
  • 8 Differentiate Yourself with Text Analytics
  • What Is Text Analysis?
  • The Annotated Query Language to the Rescue!
  • Productivity Tools That Make All the Difference
  • Wrapping It Up
  • 9 The IBM Big Data Analytic Accelerators
  • The IBM Accelerator for Machine Data Analytics
  • Ingesting Machine Data
  • Extract
  • Index
  • Transform
  • Statistical Modeling
  • Visualization
  • Faceted Search
  • The IBM Accelerator for Social Data Analytics
  • Feedback Extractors: What Are People Saying?
  • Profile Extractors: Who Are These People?
  • Workflow: Pulling It All Together
  • The IBM Accelerator for Telecommunications Event Data Analytics
  • Call Detail Record Enrichment
  • Network Quality Monitoring
  • Customer Experience Indicators
  • Wrapping It Up: Accelerating Your Productivity
  • Part VI: Integration and Governance in a Big Data World
  • 10 To Govern or Not to Govern: Governance in a Big Data World
  • Why Should Big Data Be Governed?
  • Competing on Information and Analytics
  • The Definition of Information Integration and Governance
  • An Information Governance Process
  • The IBM Information Integration and Governance Technology Platform
  • IBM InfoSphere Business Information Exchange
  • IBM InfoSphere Information Server
  • Data Quality
  • Master Data Management
  • Data Lifecycle Management
  • Privacy and Security
  • Wrapping It Up: Trust Is About Turning Big Data into Trusted Information
  • 11 Integrating Big Data in the Enterprise
  • Analytic Application Integration
  • IBM Cognos Software
  • IBM Content Analytics with Enterprise Search
  • SPSS
  • SAS
  • Unica
  • Q1 Labs: Security Solutions
  • IBM i2 Intelligence Analysis Platform
  • Platform Symphony MapReduce
  • Component Integration Within the IBM Big Data Platform
  • InfoSphere BigInsights
  • InfoSphere Streams
  • Data Warehouse Solutions
  • The Advanced Text Analytics Toolkit
  • InfoSphere Data Explorer
  • InfoSphere Information Server
  • InfoSphere Master Data Management
  • InfoSphere Guardium
  • InfoSphere Optim
  • WebSphere Front Office
  • WebSphere Decision Server: iLog Rules
  • Rational
  • Data Repository-Level Integration
  • Enterprise Platform Plug-ins
  • Development Tooling
  • Analytics
  • Visualization
  • Wrapping It Up