|
 |
Cambridge International Science Publishing
WRITING FAST PROGRAMS: A PRACTICAL GUIDE FOR SCIENTISTS
AND ENGINEERS
John
Riley DSB Scientific Consulting 616 Finch Ct. Lugoff,
SC 29078-8877, USA |
ORDER HERE
ISBN 1-904602-40-1
340 pages
Softback February
2006
£32/$50 plus
p&p
Order this book
CISP
Homepage Tel:+44
(0)1223 893295 Fax: +44 (0)1223 894539 E-mail
|
"Writing
Fast Programs" provides the basic elements of code optimization and provides
strategies for reducing bottlenecks in practical simulation and numerical
modeling code. The target audience is scientists and engineers and students in
these fields. One pre-publication reviewer called this a much-needed
intermediate text to bridge the gap between existing introductory and more
advance programming books aimed at scientists. "Writing Fast Programs" does not
teach basic programming; some programming proficiency is assumed, along with
familiarity with the basic programming terminology. Code examples are presented
in C, but BASIC (as a convenient pseudo-language) examples are provided for
those not familiar with C. In general, the strategies presented are not
language specific and should therefore benefit a wide programming audience.
More information available from the DSB Scientific website, please click
here. |
Contents
Preface Forward Acknowledgements List of
Figures List of Tables Part I:The Foundations 1. Introduction to
Code Optimization 1.1 Motivation for Writing High Performance Code 1.2
Scientist Programmer vs. Computer Programmer 1.3 Programming Languages
1.3.1 One Extreme: BASIC for Ease of Use 1.3.2 Another Extreme: C For
Speed 1.3.3 Program Development Efficiency 1.3.4 Scripting 1.3.5
Procedural vs. Object Oriented Languages 1.4 General Thoughts on
Optimization 1.4.1 Why Optimize 1.4.2 How Difficult is Optimization
1.4.3 When and What to Optimize 1.5 Overview of Samples Presented in This
Book 1.6 Additional Reading 2. PC Hardware 2.1 Basic System
Architecture 2.2 Numerical Representations 2.3 Addressing 2.4 Basic
CPU Architecture and Introduction to ASSEMBLY Language 2.5 Code Execution
and Timing 2.6 Pipelining, Speculative Execution and Branch Prediction
2.7 Optimization at the CPU Level 2.8 A Brief Historical Summary of
Desktop Computer CPUs 2.8.1 Intel Processors 2.8.2 AMD Processors
2.8.3 Cyrix Processors 2.8.4 Motorola/IBM Processors 2.9 Modern
Physical Memory Architectures 2.9.1 Basic Memory Architecture 2.9.2
SDRAM and DDR SDRAM Memory 2.9.3 RAMBUS Memory 3. Operating System
Considerations 3.1 The Operating System in Perspective 3.2 Operating
Systems and Performance 3.3 Operating System Architectures 3.4 Specific
PC Operating Systems 3.4.1 MS Windows 3.4.2 Linux 3.5 A Brief
Comparison of Windows and Linux Performance 3.5.1 Simple Numerical
Procedure 3.5.2 Thread/Process Creation 3.5.3 Context Switching 4.
Compiler Considerations 4.1 Interpreters vs. Compilers 4.2 Compiler
Optimizations 4.2.1 Aliasing 4.2.2 Array Bounds Checking 4.2.3
Numerical Overflow Checking 4.2.4 Unrounded Floating Point Operations
4.2.5 FDIV Bug Checking 4.2.6 Inline and Intrinsic Functions 4.2.7
Common Sub Expressions 4.3 Using Programs Compiled with Old Compilers
4.4 C++ and Similar Compilers 4.4.1 MS Visual C++ 4.4.2 MS Visual
C# 4.4.3 g++ 4.4.4 VectorC 4.4.5 KAI C++ Part II: Implementation
5. Data Management 5.1 Implicit Declaration and the Variant Problem
5.2 Eliminating Implicit Type Conversions 5.2.1 Type Matching to
Function Calls 5.2.2 Loop Counters 5.3 Immediates, Constants and
Variables 5.4 Type Specific Operators 5.5 Variable Scope 5.6 Data
Organization 5.6.1 Scalars, Arrays and Hashes 5.6.2 Pointers 5.6.3
Queues and Stacks 5.6.4 Linked Lists and Trees 5.6.5 Structures/User
Defined Types 5.6.6 Objects and Classes 5.7 Naming Variables,
Functions, Objects and Classes 6. Function and Procedure Calling:
Optimizing Program Flow 6.1 Mechanism of Function Calling and Inline Code
6.2 Programming in the Sub-Routine Style 6.3 Calling with Reduced Stack
Overhead 6.4 Using Library Functions 6.5 Hand Coding vs. Using Language
Functions: A Pseudo Random Number Generator 6.6 Recursive Functions: The
Factorial 6.7 Function Calling Conventions 7. Loops and Vectors 7.1
General Vector Concepts and the Potential for Complex Code Structures 7.2
Loop Unrolling in a Trade-Off with Generality Matrix Multiplication 7.3
Partial Loop Unrolling Simpsonâs Rule Integration 7.4 A
Practical Example: Time Dependent Heat Flow 8. Programming in the RISC
Style 8.1 The KISS Principle 8.2 Practical Example The Lennard Jones
Energy 8.3 Practical Example The Boltzmann Factor 8.4 Other
Mathematical Tricks 9. Look-Up Tables 9.1 Memory Access vs.
Computation 9.2 The LUT in Action: 1 Dimensional Examples 9.2.1 Lennard
Jones Energy 9.2.2 Other 1-D Examples 9.2.3 Introduced Error in the LUT
Technique 9.3 The LUT in Action: 2 Dimensional Examples 9.3.1 Distance
9.3.2 Heterogeneous Lennard Jones Energy 9.4 Virtual vs. Physical
Memory 10. Other Algorithm Optimization Techniques 10.1 The Use of
Symmetry 10.2 Elimination of Nested Loops 10.3 Reducing Decision Logic
Overhead: Bit Flag Encoding Part III: Parallel Processing 11.
Multi-Tasking Basics 11.1 Multi-Tasking Terms 11.2 Multi-Threaded
Programming 11.2.1 Simple Thread Creation and Termination 11.2.2
Communicating between Threads 11.2.3 When to Use Multiple Threads 11.3
Multi-Process Programming 11.3.1 Simple Process Creation and Termination
11.3.2 Communicating Between Processes 11.3.3 When to Use Multiple
Processes 12. Parallel Computation Basics 12.1 Parallel Architecture
Basics 12.1.1 Multiple Execution Units in the CPU 12.1.2 Symmetric
Multi-Processors 12.1.3 Single Instruction, Multiple Data (SIMD) 12.1.4
Multiple Instruction, Multiple Data (MIMD) 12.1.5 Algorithm
Considerations 12.1.6 Performance Considerations in Parallel Systems
12.2 SIMD 12.2.1 Integer Array Addition with MMX 12.2.2 Floating
Point SIMD and Testing for SIMD Capability 12.2.3 Vector Dot Product using
SIMD 12.2.4 4x4 Matrix Multiplication using SIMD 12.2.5
Simpsonâs Rule Integration using SIMD 12.2.6 A Comparison
of 3dNow! to SSE SIMD Performance 12.2.7 Compilers for SIMD and Additional
SIMD Extensions 12.3 MIMD 12.3.1 Networks of Workstations 12.3.2
Clusters 12.3.3 Distributed Computing 12.4 Implementing MIMD 12.4.1
Simple Parallel Computing Without Messages 12.4.2 Simple Master-Slave
Parallel Computing 12.4.3 Simple Message Passing Parallel Computing
12.5 Message Passing Tools 12.5.1 Parallel Virtual Machine (PVM)
12.5.2 Message Passing Interface (MPI) 12.5.3 Other Tools Appendix
A: A List of Modern Programming Tools Appendix B: How to Buy A High
Performance PC Appendix C: Contents of the CD-Rom Appendix D: Complete
Listing of MCGas Monte Carlo Program (Demo) Appendix E: BASIC Listings for
the Part I and Part II Demo Programs Bibliography Index |
|
| |
|
|
|
| |
|
|
|
| |
|
| |
|
| |
|
|
|