| | Preface | xi |
| | Acknowledgments | xiii |
| 1 | Introduction | 1 |
| 1.1 | Hybridization | 1 |
| 1.2 | Affymetrix GeneChip Technology | 3 |
| 1.3 | Spotted Arrays | 6 |
| 1.4 | Serial Analysis of Gene Expression (SAGE) | 8 |
| 1.5 | Example: Affymetrix vs. Spotted Arrays | 9 |
| 1.6 | Summary | 11 |
| 1.7 | Further Reading | 13 |
| 2 | Overview of Data Analysis | 15 |
| 3 | Basic Data Analysis | 17 |
| 3.1 | Absolute Measurements | 17 |
| 3.2 | Scaling | 18 |
| 3.2.1 | Example: Linear and Nonlinear Scaling | 20 |
| 3.3 | Detection of Outliers | 20 |
| 3.4 | Fold Change | 21 |
| 3.5 | Significance | 22 |
| 3.5.1 | Nonparametric Tests | 24 |
| 3.5.2 | Correction for Multiple Testing | 24 |
| 3.5.3 | Example I: t-Test and ANOVA | 25 |
| 3.5.4 | Example II: Number of Replicates | 26 |
| 3.6 | Summary | 28 |
| 3.7 | Further Reading | 29 |
| 4 | Visualization by Reduction of Dimensionality | 33 |
| 4.1 | Principal Component Analysis | 33 |
| 4.2 | Example 1: PCA on Small Data Matrix | 35 |
| 4.3 | Example 2: PCA on Real Data | 37 |
| 4.4 | Summary | 37 |
| 4.5 | Further Reading | 39 |
| 5 | Cluster Analysis | 41 |
| 5.1 | Hierarchical Clustering | 41 |
| 5.2 | K-means Clustering | 43 |
| 5.3 | Self-Organizing Maps | 44 |
| 5.4 | Distance Measures | 45 |
| 5.4.1 | Example: Comparison of Distance Measures | 47 |
| 5.5 | Normalization | 49 |
| 5.6 | Visualization of Clusters | 50 |
| 5.6.1 | Example: Visualization of Gene Clusters in Bladder Cancer | 50 |
| 5.7 | Summary | 50 |
| 5.8 | Further Reading | 52 |
| 6 | Beyond Cluster Analysis | 55 |
| 6.1 | Function Prediction | 55 |
| 6.2 | Discovery of Regulatory Elements in Promoter Regions | 56 |
| 6.2.1 | Example 1: Discovery of Proteasomal Element | 57 |
| 6.2.2 | Example 2: Rediscovery of Mlu Cell Cycle Box (MCB) | 57 |
| 6.3 | Integration of Data | 58 |
| 6.4 | Summary | 59 |
| 6.5 | Further Reading | 59 |
| 7 | Reverse Engineering of Regulatory Networks | 63 |
| 7.1 | The Time-Series Approach | 63 |
| 7.2 | The Steady-State Approach | 64 |
| 7.3 | Limitations of Network Modeling | 65 |
| 7.4 | Example 1: Steady-State Model | 65 |
| 7.5 | Example 2: Steady-State Model on Real Data | 66 |
| 7.6 | Example 3: Steady-State Model on Real Data | 68 |
| 7.7 | Example 4: Linear Time-Series Model | 68 |
| 7.8 | Further Reading | 71 |
| 8 | Molecular Classifiers | 75 |
| 8.1 | Classification Schemes | 76 |
| 8.1.1 | Nearest Neighbor | 76 |
| 8.1.2 | Neural Networks | 76 |
| 8.1.3 | Support Vector Machine | 76 |
| 8.2 | Example I: Classification of Cancer Subtypes | 77 |
| 8.3 | Example II: Classification of Cancer Subtypes | 78 |
| 8.4 | Summary | 79 |
| 8.5 | Further Reading | 79 |
| 9 | Selection of Genes for Spotting on Arrays | 81 |
| 9.1 | Gene Finding | 82 |
| 9.2 | Selection of Regions Within Genes | 82 |
| 9.3 | Selection of Primers for PCR | 83 |
| 9.4 | Selection of Unique Oligomer Probes | 83 |
| 9.4.1 | Example: Finding PCR Primers for Gene AF105374 | 83 |
| 9.5 | Experimental Design | 84 |
| 9.6 | Further Reading | 84 |
| 10 | Limitations of Expression Analysis | 87 |
| 10.1 | Relative Versus Absolute RNA Quantification | 88 |
| 10.2 | Further Reading | 88 |
| 11 | Genotyping Chips | 91 |
| 11.1 | Example: Neural Networks for GeneChip Prediction | 91 |
| 11.2 | Further Reading | 93 |
| 12 | Software Issues and Data Formats | 95 |
| 12.1 | Standardization Efforts | 96 |
| 12.2 | Standard File Format | 97 |
| 12.2.1 | Example: Small Scripts in Awk | 97 |
| 12.3 | Software for Clustering | 98 |
| 12.3.1 | Example: Clustering with ClustArray | 99 |
| 12.4 | Software for Statistical Analysis | 99 |
| 12.4.1 | Example: Statistical Analysis with R | 99 |
| 12.4.2 | The affyR Software Package | 103 |
| 12.4.3 | Commercial Statistics Packages | 103 |
| 12.5 | Summary | 103 |
| 12.6 | Further Reading | 104 |
| 13 | Commercial Software Packages | 105 |
| 14 | Bibliography | 109 |
| | Index | 123 |