Powerful engineering tools can help solve today's complex biological and biomedical research challenges - and this first-of-its-kind guide is paving the way. This trail-blazing work gives engineers a quantitative systems approach to bioinformatics research using computational tools drawn from techni

Systems Bioinformatics
An Engineering Case-Based Approach

Gil Alterovitz
Marco F. Ramoni
Editors

ARTECH HOUSE
BOSTON LONDON
artechhouse.com

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the U.S. Library of Congress.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.

ISBN 13: 978-1-59693-124-4

Cover design by Igor Valdman

© 2007 ARTECH HOUSE, INC.
685 Canton Street
Norwood, MA 02062

All rights reserved. Printed and bound in the United States of America. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Artech House cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

10 9 8 7 6 5 4 3 2 1

To our parents

Contents

Preface

PART I
Introduction: Molecular and Cellular Biology

CHAPTER 1
Molecular and Cellular Biology: An Engineering Perspective
1.1 Cellular Structures and Functions
1.2 Introduction to Information Handling in Cells
1.3 The Importance and Diversity of Proteins
1.4 DNA Replication: Copying the Code
1.5 Transcription: Sending a Messenger
1.6 Translation: Protein Synthesis
1.7 Control of Gene Expression
1.8 Genetic Engineering
1.9 Summary

CHAPTER 2
Proteomics: From Genome to Proteome
2.1 Defining the Proteome
2.1.1 From Genes to Proteins
2.1.2 What Is Proteomics?
2.1.3 Functional Proteomics
2.2 Building Gene Collections for Functional Proteomics Approaches
2.2.1 Selection of Target Genes for a Cloning Project
2.2.2 Clone Production
2.2.3 Sequencing and Analysis
2.2.4 Clone Maintenance and Distribution
2.3 Use of Clones in Functional Proteomics Approaches
2.3.1 High-Throughput Protein Production
2.3.2 Protein Arrays
2.3.3 Cell-Based Functional Proteomic Assays

PART II
Analysis: Signal Processing

CHAPTER 3
Introduction to Biological Signal Processing at the Cell Level
3.1 Introduction to Fundamental Signal Processing Concepts
3.1.1 Signals
3.1.2 Systems
3.1.3 Random Processes and Spectral Analysis
3.2 Signal Detection and Estimation
3.2.1 DNA Sequencing
3.2.2 Gene Identification
3.2.3 Protein Hotspots Identification
3.3 System Identification and Analysis
3.3.1 Gene Regulation Systems
3.3.2 Protein Signaling Systems
3.4 Conclusion

CHAPTER 4
Signal Processing Methods for Mass Spectrometry
4.1 Introduction
4.1.1 Data Acquisition Methods
4.1.2 History of Ionization Techniques
4.1.3 Sample Preparation
4.1.4 Ionization
4.1.5 Separation of Ions by Mass and Charge
4.1.6 Detection of Ions and Recorded Data
4.1.7 Data Preprocessing
4.1.8 Example Data
4.2 Signal Resampling
4.2.1 Algorithm Explanation and Discussion
4.2.2 Example Demonstrating Down Sampling
4.3 Correcting the Background
4.3.1 Algorithm Explanation and Discussion
4.3.2 Example Demonstrating Baseline Subtraction
4.4 Aligning Mass/Charge Values
4.4.1 Algorithm Explanation and Discussion
4.4.2 Example Demonstrating Aligning Mass/Charge Values
4.5 Normalizing Relative Intensity
4.5.1 Example Demonstrating Intensity Normalization
4.6 Smoothing Noise
4.6.1 Lowess Filter Smoothing
4.6.2 Savitzky and Golay Filter Smoothing
4.6.3 Example Demonstrating Noise Smoothing
4.7 Identifying Ion Peaks

PART III
Analysis: Control and Systems

CHAPTER 5
Control and Systems Fundamentals
5.1 Introduction
5.2 Review of Fundamental Concepts in Control and Systems Theory
5.2.1 Discrete-Time Dynamical Systems
5.3 Control Theory in Systems Biology
5.4 Reverse Engineering Cellular Networks
5.5 Gene Networks
5.5.1 Boolean Networks
5.5.2 Dynamic Bayesian Networks
5.6 Conclusion

CHAPTER 6
Modeling Cellular Networks
6.1 Introduction
6.2 Construction and Analysis of Kinetic Models
6.2.1 Parameter Estimation and Modeling Resources
6.2.2 A Modular Approach to Model Formulation
6.2.3 Basic Kinetics
6.2.4 Deterministic Models
6.2.5 Cellular Noise and Stochastic Methods
6.2.6 System Analysis Techniques
6.3 Case Studies
6.3.1 Expression of a Single Gene
6.3.2 A Phosphorylation-Dephosphorylation Cycle
6.3.3 A Synthetic Population Control Circuit
6.4 Conclusion

PART IV
Analysis: Probabilistic Data Networks and Communications

CHAPTER 7
Topological Analysis of Biomolecular Networks
7.1 Cellular Networks
7.1.1 Genetic Regulation Networks
7.1.2 Protein-Protein Interaction Networks
7.1.3 Metabolic Regulation Networks
7.1.4 The Scale-Free Property: A Network Characteristic
7.2 The Topology of Cellular Networks
7.2.1 Network Motifs in Genetic Regulation Networks
7.2.2 Topological Characterization of Protein Networks
7.2.3 Topology of Metabolic Networks
7.2.4 Adjacency Matrices
7.2.5 Hubs
7.2.6 Reachability
7.3 Gene Ontology and Functional Clustering of Essential Genes
7.4 Conclusion and Future Avenues

CHAPTER 8
Bayesian Networks for Genetic Analysis
8.1 Introduction
8.2 Elements of Population Genetics
8.3 Bayesian Networks
8.3.1 Representation
8.3.2 Learning
8.3.3 Reasoning
8.3.4 Validation and Inference
8.3.5 Risk Prediction
8.4 Two Applications
8.4.1 Stroke Risk in Sickle Cell Anemia Subjects
8.4.2 Network Representation of a Complex Trait
8.5 Conclusion

PART V
Design: Synthetic Biology

CHAPTER 9
Fundamentals of Design for Synthetic Biology
9.1 Overview
9.2 Circuits
9.2.1 Riboregulators
9.2.2 Feedback Loops
9.2.3 Toggle Switches
9.2.4 Logic Gates
9.2.5 Oscillators
9.3 Multicellular Systems
9.4 Challenges
9.4.1 Standardization
9.4.2 Stochasticity
9.4.3 Directed Evolution
9.4.4 Random and Targeted Mutagenesis and Recombination
9.4.5 System Interface
9.4.6 Kinetics
9.5 Conclusion