An Introduction to Computer Simulation Methods: Applications to Physical Systems

Harvey Gould, Jan Tobochnik, and Wolfgang Christian

August 27, 2016

Contents

Preface

1 Introduction
  1.1 Importance of computers in physics
  1.2 The importance of computer simulation
  1.3 Programming languages
  1.4 Object oriented techniques
  1.5 How to use this book

2 Tools for Doing Simulations
  2.1 Introduction
  2.2 Simulating Free Fall
  2.3 Getting Started with Object-Oriented Programming
  2.4 Inheritance
  2.5 The Open Source Physics Library
  2.6 Animation and Simulation
  2.7 Model-View-Controller

3 Simulating Particle Motion
  3.1 Modified Euler algorithms
  3.2 Interfaces
  3.3 Drawing
  3.4 Specifying The State of a System Using Arrays
  3.5 The ODE Interface
  3.6 The ODESolver Interface
  3.7 Effects of Drag Resistance
  3.8 Two-Dimensional Trajectories
  3.9 Decay Processes
  3.10 *Visualizing Three-Dimensional Motion
  3.11 Levels of Simulation

4 Oscillations
  4.1 Simple Harmonic Motion
  4.2 The Motion of a Pendulum
  4.3 Damped Harmonic Oscillator
  4.4 Response to External Forces
  4.5 Electrical Circuit Oscillations
  4.6 Accuracy and Stability
  4.7 Projects

5 Few-Body Problems: The Motion of the Planets
  5.1 Planetary Motion
  5.2 The Equations of Motion
  5.3 Circular and Elliptical Orbits
  5.4 Astronomical Units
  5.5 Log-log and Semilog Plots
  5.6 Simulation of the Orbit
  5.7 Impulsive Forces
  5.8 Velocity Space
  5.9 A Mini-Solar System
  5.10 Two-Body Scattering
  5.11 Three-body problems
  5.12 Projects

6 The Chaotic Motion of Dynamical Systems
  6.1 Introduction
  6.2 A Simple One-Dimensional Map
  6.3 Period Doubling
  6.4 Universal Properties and Self-Similarity
  6.5 Measuring Chaos
  6.6 *Controlling Chaos
  6.7 Higher-Dimensional Models
  6.8 Forced Damped Pendulum
  6.9 *Hamiltonian Chaos
  6.10 Perspective
  6.11 Projects

7 Random Processes
  7.1 Order to Disorder
  7.2 Random Walks
  7.3 Modified Random Walks
  7.4 The Poisson Distribution and Nuclear Decay
  7.5 Problems in Probability
  7.6 Method of Least Squares
  7.7 Applications to Polymers
  7.8 Diffusion-Controlled Chemical Reactions
  7.9 Random Number Sequences
  7.10 Variational Methods
  7.11 Projects
  Appendix 7A: Random Walks and the Diffusion Equation

8 The Dynamics of Many-Particle Systems
  8.1 Introduction
  8.2 The Intermolecular Potential
  8.3 Units
  8.4 The Numerical Algorithm
  8.5 Periodic Boundary Conditions
  8.6 A Molecular Dynamics Program
  8.7 Thermodynamic Quantities
  8.8 Radial Distribution Function
  8.9 Hard Disks
  8.10 Dynamical Properties
  8.11 Extensions
  8.12 Projects

9 Normal Modes and Waves
  9.1 Coupled Oscillators and Normal Modes
  9.2 Numerical Solutions
  9.3 Fourier Series
  9.4 Two-Dimensional Fourier Series
  9.5 Fourier Integrals
  9.6 Power Spectrum
  9.7 Wave Motion
  9.8 Interference
  9.9 Fraunhofer Diffraction
  9.10 Fresnel Diffraction
  Appendix 9A: Complex Fourier Series
  Appendix 9B: Fast Fourier Transform
  Appendix 9C: Plotting Scalar Fields

10 Electrodynamics
  10.1 Static Charges
  10.2 Electric Fields
  10.3 Electric Field Lines
  10.4 Electric Potential
  10.5 Numerical Solutions of Boundary Value Problems
  10.6 Random Walk Solution of Laplace's Equation
  10.7 *Fields Due to Moving Charges
  10.8 *Maxwell's Equations
  10.9 Projects
  Appendix A: Plotting Vector Fields

11 Numerical and Monte Carlo Methods
  11.1 Numerical Integration Methods in One Dimension
  11.2 Simple Monte Carlo Evaluation of Integrals
  11.3 Multidimensional Integrals
  11.4 Monte Carlo Error Analysis
  11.5 Nonuniform Probability Distributions
  11.6 Importance Sampling
  11.7 Metropolis Algorithm
  11.8 *Neutron Transport

12 Percolation
  12.1 Introduction
  12.2 The Percolation Threshold
  12.3 Finding Clusters
  12.4 Critical Exponents and Finite Size Scaling
  12.5 The Renormalization Group
  12.6 Projects

13 Fractals and Kinetic Growth Models
  13.1 The Fractal Dimension
  13.2 Regular Fractals
  13.3 Kinetic Growth Processes
  13.4 Fractals and Chaos
  13.5 Many Dimensions
  13.6 Projects

14 Complex Systems
  14.1 Cellular Automata
  14.2 Self-Organized Critical Phenomena
  14.3 The Hopfield Model and Neural Networks
  14.4 Growing Networks
  14.5 Genetic Algorithms
  14.6 Lattice Gas Models of Fluid Flow
  14.7 Overview and Projects

15 Monte Carlo Simulations of Thermal Systems
  15.1 Introduction
  15.2 The Microcanonical Ensemble
  15.3 The Demon Algorithm
  15.4 The Demon as a Thermometer
  15.5 The Ising Model
  15.6 The Metropolis Algorithm
  15.7 Simulation of the Ising Model
  15.8 The Ising Phase Transition
  15.9 Other Applications of the Ising Model
  15.10 Simulation of Classical Fluids
  15.11 Optimized Monte Carlo Data Analysis
  15.12 *Other Ensembles
  15.13 More Applications
  15.14 Projects

16 Quantum Systems
  16.1 Introduction
  16.2 Review of Quantum Theory
  16.3 Bound State Solutions
  16.4 Time Development of Eigenstate Superpositions
  16.5 The Time-Dependent Schrödinger Equation
  16.6 Fourier Transformations and Momentum Space
  16.7 Variational Methods
  16.8 Random Walk Solutions of the Schrödinger Equation
  16.9 Diffusion Quantum Monte Carlo
  16.10 Path Integral Quantum Monte Carlo
  16.11 Projects
  Appendix A: Visualizing Complex Functions

17 Visualization and Rigid Body Dynamics
  17.1 Two-Dimensional Transformations
  17.2 Three-Dimensional Transformations
  17.3 The Three-Dimensional Open Source Physics Library
  17.4 Dynamics of a Rigid Body
  17.5 Quaternion Arithmetic
  17.6 Quaternion equations of motion
  17.7 Rigid Body Model
  17.8 Motion of a spinning top
  17.9 Projects

18 Seeing in Special and General Relativity
  18.1 Special Relativity
  18.2 General Relativity
  18.3 Dynamics in Polar Coordinates
  18.4 Black Holes and Schwarzschild Coordinates
  18.5 Particle and Light Trajectories
  18.6 Seeing
  18.7 General Relativistic Dynamics
  18.8 *The Kerr Metric
  18.9 Projects

19 Epilogue: The Unity of Physics
  19.1 The Unity of Physics
  19.2 Spiral Galaxies
  19.3 Numbers, Pretty Pictures, and Insight
  19.4 Constrained Dynamics
  19.5 What are Computers Doing to Physics?

Preface

Computer simulations are now an integral part of contemporary basic and applied physics, and computation has become as important as theory and experiment. The ability to compute is now part of the essential repertoire of research scientists.

Since writing the first two editions of our text, more courses devoted to the study of physics using computers have been introduced into the physics curriculum, and many more traditional courses are incorporating numerical examples. We are gratified to see that our text has helped shape these innovations. The purpose of our book includes the following:

1. To provide a means for students to do physics.

2. To give students an opportunity to gain a deeper understanding of the physics they have learned in other courses.

3. To encourage students to "discover" physics in a way similar to how physicists learn in the context of research.

4. To introduce numerical methods and new areas of physics that can be studied with these methods.

5. To give examples of how physics can be applied in a much broader context than is discussed in the traditional physics undergraduate curriculum.

6. To teach object-oriented programming in the context of doing science.

Our overall goal is to encourage students to learn about science through experience and by asking questions. Our objective is always understanding, not the generation of numbers.

The major change in this edition is the use of the Java programming language instead of True Basic, which was used in the first two editions. We chose Java for some of the same reasons we originally chose True Basic. Java is available for all popular operating systems, is platform independent, contains built-in graphics capabilities, is freely available, and has all the features needed to write powerful computer simulations. There is an abundance of free open source tools available for Java programmers, including the Eclipse integrated development environment. Because Java is popular, it continues to evolve, and its speed is now comparable to other languages used in scientific programming. In addition, Java is object oriented, which has become the dominant paradigm in computer science and software engineering, and therefore learning Java is excellent preparation for students with interests in physics and computer science. Java programs can be easily adapted for delivery over the Web. Finally, as for True Basic, the nongraphical parts of our programs can easily be converted to other languages such as C/C++, whose syntax is similar to Java.

When we chose True Basic for our first edition, introductory computer science courses were teaching Pascal. When we continued with True Basic in the second edition, computer science departments were experimenting with teaching C/C++. Finally, we are able to choose a language that is commonly taught and used in many contexts. Thus, it is likely that some of the students reading our text will already know Java and can contribute much to a class that uses our text.
Java provides many powerful libraries for building a graphical user interface and incorporating audio, video, and other media. If we were to discuss these libraries, students would become absorbed in programming tasks that have little or nothing to do with physics. For this reason our text uses the Open Source Physics library, which makes it easy to write programs that are simpler and more graphically oriented than those that we wrote in True Basic. In addition, the Open Source Physics library is useful for other computational physics projects which are not discussed in this text, as well as general programming tasks. This library provides for easy graphical input of parameters, tabular output of data, plots, visualizations and animations, and the numerical solution of ordinary differential equations. It also provides several useful data structures.

The Open Source Physics library was developed by Wolfgang Christian, with the contributions and assistance of many others. The book Open Source Physics: A User's Guide with Examples by Wolfgang Christian is available separately and discusses the Open Source Physics library in much more detail. A CD that comes with the User's Guide contains the source code for the Open Source Physics library, the programs in this book, as well as ready-to-run versions of these programs. The source code and the library can also be downloaded freely from the Open Source Physics website.

The ease of doing visualizations is a new and important aspect of Java and the Open Source Physics library, giving Java an advantage over other languages such as C++ and Fortran, which do not have built-in graphics capabilities. For example, when debugging a program, it is frequently much quicker to detect when the program is not working by looking at a visual representation of the data rather than by scanning the data as lists of numbers. Also, it is easier to choose the appropriate values of the parameters by varying them and visualizing the results. Finally, more insight is likely to be gained by looking at a visualization than at a list of numbers. Because animations and the continuous plotting of data usually cause a program to run more slowly, we have designed our programs so that the graphical output can be turned off or implemented infrequently during a simulation.

Java provides support for interacting with a program during runtime. The Open Source Physics library makes this interaction even easier, so that we can write programs that use a mouse to input data, such as the location of a charge, or toggle the value of a cell in a lattice. We also do not need to input how long a simulation should run and can stop the program at any time to change parameters.

As with our previous editions, we assume no background in computer programming. Much of the text can be understood by students with only a semester each of physics and calculus. Chapter 2 introduces Java and the Open Source Physics library. In Chapter 3 we discuss the concept of interfaces and how to use some of the important interfaces in the Open Source Physics library. Later chapters introduce more Java and Open Source Physics constructs as needed, but essentially all of the chapters after Chapter 3 can be studied independently and in any order. We include many topics that are sometimes considered too advanced for undergraduates, such as random walks, chaos, fractals, percolation, simulations of many particle systems, and topics in the theory of complexity, but we introduce these topics so that very little background is required.
Other chapters discuss optics, electrodynamics, relativity, rigid body motion, and quantum mechanics, which require knowledge of the physics found in the corresponding standard undergraduate courses.

This text is written so that the physics drives the choice of algorithms and the programming syntax that we discuss. We believe that students can learn how to program more quickly with this approach because they have an immediate context, namely doing simulations, in which to hone their skills. In the beginning most of the programming tasks involve modifying the programs in the text. Students should then be given some assignments that require them to write their own programs by following the format of those in the text. The students may later develop their own style as they work on their projects.

Our text is most appropriately used in a project-oriented course that lets students with a wide variety of backgrounds and abilities work at their own pace. The courses that we have taught using this text have a laboratory component. From our experience we believe that active learning, where students are directly grappling with the material in this text, is the most efficient. In a laboratory context students who already know a programming language can help those who do not. Also, students can further contribute to a course by sharing their knowledge from various backgrounds in physics, chemistry, computer science, mathematics, biology, economics, and other subjects.

Although most of our text is at the undergraduate level, many of the topics are considered to be graduate level and thus would be of interest to graduate students. One of us regularly teaches a laboratory-based course on computer simulation with both undergraduate and graduate students. Because the course is project oriented, students can go at their own pace and work on different problems. In this context, graduate and undergraduate students can learn much from each other.

Some instructors who might consider using our text in a graduate-level context might think that our text is not sufficiently rigorous. For example, in the suggested problems we usually do not explicitly ask students to do an extensive data analysis. However, we do discuss how to estimate errors in Chapter 11. We encourage instructors to ask for a careful data analysis on at least one assignment, but we believe that it is more important for students to spend most of their time in an exploratory mode where the focus is on gaining physical insight and obtaining numerical results that are qualitatively correct.

There are four types of suggested student activities. The exercises, which are primarily found in the beginning of the text, are designed to help students learn specific programming techniques. The problems, which are scattered throughout each chapter, are open ended and require students to run, analyze, and modify programs given in the text, or write new, but similar programs. Students will soon learn that the format for most of the programs is very similar. Starred problems require either significantly more background or work and may require the writing of a program from scratch. However, the programs for these problems still follow a similar format. The projects at the end of most of the chapters are usually more time consuming and would be appropriate for term projects or independent student research. Many new problems and projects have been added to this edition, while others have been improved or eliminated.
Instructors and students should view the problem descriptions and questions as starting points for thinking about the system of interest. It is important that students read the problems even if they do not plan to do them. We encourage instructors to ask students to write laboratory reports for at least some of the problems. The Appendix to Chapter 1 provides guidance on what these reports should include. Part of the beauty and fun of doing computer simulations is that one is forced to think about the choice of algorithm, its implementation, the choice of parameters, what to measure, and the results. Do the results make sense? What happens if you change a parameter? What if you change the algorithm? Much physics can be learned in this way.

Although all of the programs discussed in the text can be downloaded freely, most are listed in the text to encourage students to read them carefully. Students might find some useful techniques that they can use elsewhere, and the discussion in the text frequently refers to the listings. A casual perusal of the text might suggest that the text is bereft of figures. One reason that we have not included more figures is that most of the programs in the text have an important visual component in color. Black and white figures pale in comparison. Much of the text is meant to be read while working on the programs. Thus, students can easily see the plots and animations produced by the programs while they are reading the text.

As new technologies become available and the backgrounds and expectations of students change, the question of what is worth knowing needs to be reconsidered. Today, calculators not only do arithmetic and numerical operations, but most can do algebra, calculus, and plotting. Students have lost the sense of number, and most can only do the simplest mathematical manipulations in their head. On the other hand, most students feel comfortable using computers and gathering information off the Web. Because there exist programs and applets that can perform many of the simulations in this text, why should students learn how to write their own programs? We have at least two answers. First, most innovative scientific research involves writing programs that do not fit into the domains of existing software. More importantly, we believe that students obtain a deeper understanding of the physics and the algorithms themselves by writing and modifying their own programs. Just as we need to ensure that students can carry out basic mathematical operations without a calculator so that they understand what these operations mean, we must do the same when it comes to computational physics.

The recommended readings at the end of each chapter have been selected for their pedagogical value rather than for completeness or for historical accuracy. We apologize to our colleagues whose work has been inadvertently omitted, and we would appreciate suggestions for new and additional references.

Because students come with a different skill set than most of their instructors, it is important that instructors realize that certain aspects of this text might be easier for their students than for them. Some instructors might be surprised that much of the code for organizing the simulations is "hidden" in the Open Source Physics library (although the source code is freely available). Some instructors will initially think that Chapter 2 contains too much material. However, from the student's perspective this material is not that difficult to learn.
They are used to downloading files, using various software environments, and learning how to make software do what they want. The difficult parts of the text, where instructor input is most needed, are understanding the physics and the algorithms. Converting algorithms to programs is also difficult for many students, and we spend much time in the text explaining the programs that implement various algorithms.

In some cases instructors will find it difficult to set up an environment to use Java and the Open Source Physics library. Because this task depends on the operating system, we have placed instructions on how to set up an environment for Java and Open Source Physics on the text's website. This website also contains links to updates of the evolving Open Source Physics library as well as other resources for this text, including the source code for the programs in the text.

We acknowledge generous support from the National Science Foundation, which has allowed us to work on many ideas that have found their way into this textbook. We also thank Kipton Barros, Mario Belloni, Doug Brown, Francisco Esquembre, and Joshua Gould for their advice, suggestions, and contributions to the Open Source Physics library and to the text. We thank Anne Cox for suggesting numerous improvements to the narrative and for hosting an Open Source Physics developer's workshop at Eckerd College.

We are especially grateful to Louis Colonna-Romano for drawing almost all of the figures. Lou writes programs in PostScript the way others write programs in Java or Fortran. We are especially thankful to students and faculty at Clark University, Davidson College, and Kalamazoo College who have generously commented on the Open Source Physics project as they class tested early versions of this manuscript. Carlos Ortiz helped prepare the index for this book.

Many individuals reviewed parts of the text, and we thank them for their assistance. They include Lowell M. Boone, Roger Cowley, Shamanthi Fernando, Alejandro L. Garcia, Alexander L. Godunov, Rubin Landau, Donald G. Luttermoser, Cristopher Moore, Anders Sandvik, Ross Spencer, Dietrich Stauffer, Jutta Luettmer-Strathmann, Daniel Suson, Matthias Troyer, Slavomir Tuleja, and Michael T. Vaughn.

We thank all our friends and colleagues for their encouragement and support. We are grateful to our wives, Patti Gould, Andrea Moll Tobochnik, and Barbara Christian, and to our children, Joshua, Emily, and Evan Gould, Steven and Howard Tobochnik, and Katherine, Charlie, and Konrad Christian, for their encouragement and understanding during the course of this work. It takes a village to raise a child and a community to write a textbook.

No book of this length can be free of typos and errors. We encourage readers to email us about errors that they find and suggestions for improvements. Our plan is to continuously revise our book so that the next edition will be more timely.

Harvey Gould
Clark University
Worcester, MA 01610-1477
hgould@clarku.edu

Jan Tobochnik
Kalamazoo College
Kalamazoo, MI 49006-3295
jant@kzoo.edu

Wolfgang Christian
Davidson College
Davidson, NC 28036-6926
wochristian@davidson.edu

Chapter 1
Introduction

The importance of computers in physics and the nature of computer simulation is discussed. The nature of object-oriented programming and various computer languages is also considered.
1.1 Importance of computers in physics

Computation is now an integral part of contemporary science and is having a profound effect on the way we do physics, on the nature of the important questions, and on the physical systems we choose to study. Developments in computer technology are leading to new ways of thinking about physical systems. Asking "How can I formulate this problem on a computer?" has led to the understanding that it is practical and natural to formulate physical laws as rules for a computer rather than only in terms of differential equations.

For the purposes of discussion, we will divide the use of computers in physics into the following categories: numerical analysis, symbolic manipulation, visualization, simulation, and the collection and analysis of data.

Numerical analysis refers to the solution of well-defined mathematical problems to produce numerical (in contrast to symbolic) solutions. For example, we know that the solution of many problems in physics can be reduced to the solution of a set of simultaneous linear equations. Consider the equations

$$2x + 3y = 18$$
$$x - y = 4.$$

It is easy to find the analytical solution x = 6, y = 2 using the method of substitution. Suppose we wish to solve a set of four simultaneous equations. We again can find an analytical solution, perhaps using a more sophisticated method. However, if the number of simultaneous equations becomes much larger, we would need to use a computer to find a solution. In this mode the computer is a tool of numerical analysis. Because it is often necessary to compute multidimensional integrals, manipulate large matrices, or solve nonlinear differential equations, this use of the computer is important in physics.
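To make this first category concrete, the following short Java program (a sketch of our own in the spirit of the text's listings, not one of them; the class name is arbitrary) solves the pair of equations above by elimination. The same procedure, organized systematically as Gaussian elimination, handles systems far too large to solve by hand.

public class LinearSolver {
  public static void main(String[] args) {
    // coefficients of 2x + 3y = 18 and x - y = 4
    double[][] a = {{2, 3}, {1, -1}};
    double[] b = {18, 4};
    double factor = a[1][0]/a[0][0];   // eliminate x from the second equation
    a[1][1] -= factor*a[0][1];
    b[1] -= factor*b[0];
    double y = b[1]/a[1][1];           // back substitution
    double x = (b[0] - a[0][1]*y)/a[0][0];
    System.out.println("x = " + x + ", y = " + y); // prints x = 6.0, y = 2.0
  }
}

For larger systems one would call a library routine rather than hand-code the elimination, but the logic is the same.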
One of the strengths of mathematics is its ability to use the power of abstraction, which allows us to solve many similar problems simultaneously by using symbols. Computers can be used to do much of the symbolic manipulation. As an example, suppose we want to know the solution of the quadratic equation $ax^2 + bx + c = 0$. A symbolic manipulation program can give the solution as $x = (-b \pm \sqrt{b^2 - 4ac})/2a$. In addition, such a program can give the usual numerical solutions for specific values of a, b, and c. Mathematical operations such as differentiation, integration, matrix inversion, and power series expansion can be performed using symbolic manipulation programs. The calculation of Feynman diagrams, which represent multidimensional integrals of importance in quantum electrodynamics, has been a major impetus to the development of computer algebra software that can manipulate and simplify symbolic expressions. Maxima, Maple, and Mathematica are examples of software packages that have symbolic manipulation capabilities as well as many tools for numerical analysis. Matlab and Octave are examples of software packages that are convenient for computations involving matrices and related tasks.

As the computer plays an increasing role in our understanding of physical phenomena, the visual representation of complex numerical results is becoming even more important. The human eye in conjunction with the visual processing capacity of the brain is a very sophisticated device. Our eyes can determine patterns and trends that might not be evident from tables of data and can observe changes with time that can lead to insight into the important mechanisms underlying a system's behavior. The use of graphics can also increase our understanding of the nature of analytical solutions. For example, what does a sine function mean to you? We suspect that your answer is not the series,

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$

but rather a periodic, constant amplitude curve (see Figure 1.1). What is most important is the mental image gained from a visualization of the form of the function.

[Figure 1.1: What is the meaning of the sine function?]

Traditional modes of presenting data include two- and three-dimensional plots, including contour and field line plots. Frequently, more than three variables are needed to understand the behavior of a system, and new methods of using color and texture are being developed to help researchers gain greater insights into their data.

An essential role of science is to develop models of nature. To know whether a model is consistent with observation, we have to understand the behavior of the model and its predictions. One way to do so is to implement the model on a computer. We call such an implementation a computer simulation, or simulation for short. For example, suppose a teacher gives $10 to each student in a class of 100. The teacher, who also begins with $10 in her pocket, chooses a student at random and flips a coin. If the coin is heads, the teacher gives $1 to the student; otherwise, the student gives $1 to the teacher. If either the teacher or the student would go into debt by this transaction, the transaction is not allowed. After many exchanges, what is the probability that a student has s dollars? What is the probability that the teacher has t dollars? Are these two probabilities the same? Although these particular questions can be answered by analytical methods, many problems of this nature cannot be solved in this way (see Problem 1.1).

One way to determine the answers to these questions is to do a classroom experiment. However, such an experiment would be difficult to arrange, and it would be tedious to do a sufficient number of transactions. A more practical way to proceed is to convert the rules of the model into a computer program, simulate many exchanges, and estimate the quantities of interest. Knowing the results might help us gain more insight into the nature of an analytical solution if one exists. We can also modify the rules and ask "what if?" questions. For example, would the probabilities change if the students could exchange money with one another? What would happen if the teacher were allowed to go into debt?

Simulations frequently use the computational tools of numerical analysis and visualization, and occasionally symbolic manipulation. The difference is one of emphasis. Simulations are usually done with a minimum of analysis. Because simulation emphasizes an exploratory mode of learning, we will stress this approach.
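To see how directly the rules of such a model become code, here is a minimal sketch of the teacher-student exchange described above (our own illustration; the class name, the number of exchanges, and the omitted histogram are arbitrary choices):

import java.util.Arrays;
import java.util.Random;

public class MoneyExchangeApp {
  public static void main(String[] args) {
    Random rnd = new Random();
    int[] student = new int[100];
    Arrays.fill(student, 10);                // each student starts with $10
    int teacher = 10;                        // and so does the teacher
    for (int n = 0; n < 1000000; n++) {
      int i = rnd.nextInt(student.length);   // choose a student at random
      if (rnd.nextBoolean()) {               // heads: teacher gives $1
        if (teacher > 0) { teacher--; student[i]++; }
      } else {                               // tails: student gives $1
        if (student[i] > 0) { student[i]--; teacher++; }
      }                                      // forbidden transactions are skipped
    }
    System.out.println("teacher ends with $" + teacher);
  }
}

Estimating the probabilities P(s) and P(t) is then a matter of accumulating histograms over many exchanges.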
Computers are also involved in all phases of a laboratory experiment, from the design of the apparatus to the collection and analysis of data. LabView is an example of a data acquisition program. Some of the roles of the computer in laboratory experiments, such as the varying of parameters and the analysis of data, are similar to those encountered in simulations. However, the tasks involved in real-time control and interactive data analysis are qualitatively different and involve the interfacing of computer hardware to various types of instrumentation. We will not discuss this use of the computer.

1.2 The importance of computer simulation

Why is computation becoming so important in physics? One reason is that most of our analytical tools, such as differential calculus, are best suited to the analysis of linear problems. For example, you probably have analyzed the motion of a particle attached to a spring by assuming a linear restoring force, $F = -kx$, and solving Newton's second law of motion. In this case a small change in the displacement of the particle leads to a small change in the force. However, many natural phenomena are nonlinear, and a small change in a variable might produce a large change in another. Because relatively few nonlinear problems can be solved by analytical methods, the computer gives us a new tool to explore nonlinear phenomena.

Another reason for the importance of computation is the growing interest in systems with many variables or with many degrees of freedom. The money exchange model described in Section 1.1 is a simple example of a system with many variables. A similar problem is given at the end of this chapter.

Computer simulations are sometimes referred to as computer experiments because they share much in common with laboratory experiments. Some of the analogies are shown in Table 1.1. The starting point of a computer simulation is the development of an idealized model of a physical system of interest. We then need to specify a procedure or algorithm for implementing the model on a computer and decide what quantities to measure. The results of a computer simulation can serve as a bridge between laboratory experiments and theoretical calculations. In some cases we can obtain essentially exact results by simulating an idealized model that has no laboratory counterpart. The results of the idealized model can serve as a stimulus to the development of the theory. On the other hand, we sometimes can do simulations of a more realistic model than can be done theoretically, and hence make a more direct comparison with laboratory experiments.

Laboratory Experiment    Computer Simulation
sample                   model
physical apparatus       computer program
calibration              testing of program
measurement              computation
data analysis            data analysis

Table 1.1: Analogies between a computer simulation and a laboratory experiment.

Computation has become a third way of doing physics and complements both theory and experiment. Computer simulations, like laboratory experiments, are not substitutes for thinking, but are tools that we can use to understand natural phenomena. The goal of all our investigations of fundamental phenomena is to seek explanations of natural phenomena that can be stated concisely.

1.3 Programming languages

There is no single best programming language any more than there is a best natural language. Fortran is the oldest of the more popular scientific programming languages and was developed by John Backus and his colleagues at IBM between 1954 and 1957. Fortran is commonly used in scientific applications and continues to evolve. Fortran 90/95/2000 has many modern features that are similar to C/C++.

The Basic programming language was developed in 1965 by John Kemeny and Thomas Kurtz at Dartmouth College as a language for introductory courses in computer science. In 1983 Kemeny and Kurtz extended the language to include platform independent graphics and advanced control structures necessary for structured programming. The programs in the first two editions of our textbook were written in this version of Basic, known as True Basic.

C was developed by Dennis Ritchie at Bell Laboratories around 1972 in parallel with the Unix operating system.
C++ is an extension of C designed by Bjarne Stroustrup at Bell Laboratories in the mid-1980s. C++ is considerably more complex than C and has object-oriented features as well as other extensions. In general, programs written in C/C++ have high performance, but can be difficult to debug. C and C++ are popular choices for developing operating systems and software applications because they provide direct access to memory and other system resources.

Python, like Basic, was designed to be easy to learn and use. Python enthusiasts like to say that C and C++ were written to make life easier for the computer, but Python was designed to be easier for the programmer. Guido van Rossum created Python in the late 1980s and early 1990s. It is an interpreted, object-oriented, general-purpose programming language that is also good for prototyping. Because Python is interpreted, its performance is significantly lower than that of optimized languages such as C or Fortran.

Java is an object-oriented language that was created by James Gosling and others at Sun Microsystems. Since Java was introduced in late 1995, it has rapidly evolved and is the language of choice in most introductory computer science courses. Java borrows much of its syntax from C++ but has a simpler structure. Although the language contains only fifty keywords, the Java platform adds a rich library that enables a Java program to connect to the internet, render images, and perform other high-level tasks.

Most modern languages incorporate object-oriented features. The idea of object-oriented programming is that functions and data are grouped together in an object, rather than treated separately. A program is a structured collection of objects that communicate with each other, causing the internal state within a given object to change. A fundamental goal of object-oriented design is to increase the understandability and reusability of program code by focusing on what an object does and how it is used, rather than on how an object is implemented.

Our choice of Java for this text is motivated in part by its platform independence, flexible standard graphics libraries, good performance, and its no-cost availability. The popularity of Java ensures that the language will continue to evolve, and that programming experience in Java is a valuable and marketable skill. The Java programmer can leverage a vast collection of third-party libraries, including those for numerical calculations and visualization. Java is also relatively simple to learn, especially the subset of Java that we will need to simulate physical systems.

Java can be thought of as a platform in itself, similar to the Macintosh and Windows, because it has an application programming interface (API) that enables cross-platform graphics and user interfaces. Java programs are compiled to a platform-neutral byte code so that they can run on any computer that has a Java Virtual Machine. Despite the high level of abstraction and platform independence, the performance of Java is becoming comparable with native languages. If a project requires more speed, the computationally demanding parts of the program can be converted to C/C++ or Fortran. Readers who wish to use another programming language should find the algorithmic components of the Java program listings in the text easy to convert into a language of their choice.

1.4 Object oriented techniques

If you already know how to program, try reading a program that you wrote several years or even several weeks ago.
Many of us would not be able to follow the logic of our own program and would have to rewrite it. And your program would probably be of little use to a friend who needs to solve a similar problem. If you are learning programming for the first time, it is important to learn good programming habits to minimize this problem. One way is to employ object-oriented techniques such as encapsulation, inheritance, and polymorphism.

Encapsulation refers to the way that an object's essential information is exposed through a well-documented interface, but unnecessary details of the code are hidden. For example, we can model a particle as an object. Whenever a particle moves, it calculates its acceleration from the total force on it. Someone who wishes to use the trajectory of the particle, for example to animate the particle's trajectory, needs to refer only to the interface and does not need to know how the trajectory is calculated.

Inheritance allows a programmer to add capabilities to existing code without having to rewrite it or even know the details of how the code works. For example, you will write programs that show the evolution of planetary systems, quantum mechanical wave functions, and molecular models. Many of these programs will use (extend) code in the Open Source Physics library known as an AbstractSimulation. This code has a timer that periodically executes code in your program and then refreshes the on-screen animation. Using the Open Source Physics library will let you focus your efforts on programming the physics, because it is not necessary to write the code to produce the timer or to refresh the screen. Similarly, we have designed a general purpose graphical user interface (GUI) by extending code written by Sun Microsystems known as a JFrame. Our GUI has the features of a standard user interface such as a menu bar, minimize button, and title, even though we did not write the code to implement these features.

Polymorphism helps us to write reusable code. For example, it is easy to imagine many types of objects that are able to evolve over time. In Chapter 15 we will simulate a system of particles using random numbers rather than forces to move the particles. By using polymorphism, we can write general purpose code to do animations with both types of systems.
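The following fragment illustrates encapsulation and inheritance in miniature (hypothetical classes of our own, not code from the Open Source Physics library):

// Encapsulation: the state is private and reachable only through methods.
class Particle {
  private double x, v;                        // hidden internal state

  public void step(double dt) { x += v*dt; }  // how x is advanced is an internal detail
  public double getX() { return x; }
  public double getV() { return v; }
  public void setV(double v) { this.v = v; }
}

// Inheritance: add damping without rewriting (or even reading) Particle.
class DampedParticle extends Particle {
  private double gamma = 0.1;                 // damping coefficient

  @Override
  public void step(double dt) {
    super.step(dt);                           // reuse the parent's update
    setV(getV()*(1 - gamma*dt));              // then reduce the velocity
  }
}

Polymorphism enters when code written in terms of Particle, such as the loop for (Particle p : particles) p.step(dt);, works unchanged when some elements of the array are DampedParticle objects.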
Science students have a rich context in which to learn programming. The past several decades of doing physics with computers have given us numerous examples that we can use to learn physics, programming, and data analysis. Unlike many programming manuals, the emphasis of this book is on learning by example. We will not discuss all aspects of Java, and this text is not a substitute for a text on Java. Think of how you learned your native language. First you learned by example, and then you learned more systematically.

Although using an object oriented language makes it easier to write well-structured programs, it does not guarantee that your programs will be well written or even correct. The single most important criterion of program quality is readability. If your program is easy to read and follow, it is probably a good program. There are many analogies between a good program and a well-written paper. Few papers and programs come out perfectly on their first draft, regardless of the techniques and rules we use to write them. Rewriting is an important part of programming.

1.5 How to use this book

Most chapters in this text begin with a brief background summary of the nature of a system and the important questions. We then introduce the computer algorithms, new syntax as needed, and discuss a sample program. The programs are meant to be read as text on an equal basis with the discussions and are interspersed throughout the text. It is strongly recommended that all the problems be read, because many concepts are introduced after you have had a chance to think about the result of a simulation.

It is a good idea to maintain a computer-based notebook to record your programs, results, graphical output, and analysis of the data. This practice will help you develop good habits for future research projects, prevent duplication, organize your thoughts, and save you time. After a while you will find that most of your new programs will use parts of your earlier programs. Ideally, you will use your files to write a laboratory report or a paper on your work. Guidelines for writing a laboratory report are given in Appendix 1A.

Many of the problems in the text are open ended and do not lend themselves to simple "back of the book" answers. So how will you know if your results are correct? How will you know when you have done enough? There are no simple answers to either question, but we can give some guidelines. First, you should compare the results of your program to known results whenever possible. The known results might come from an analytical solution that exists in certain limits or from published results. You should also look at your numbers and graphs, and determine if they make sense. Do the numbers have the right sign? Are they the right order of magnitude? Do the trends make sense as you change the parameters? What is the statistical error in the data? What is the systematic error? Some of the problems explicitly ask you to do these checks, but you should make it a habit to do as many as you can whenever possible.

How do you know when you are finished? The main guideline is whether you can tell a coherent story about your system of interest. If you have only a few numbers and do not know their significance, then you need to do more. Let your curiosity lead you to more explorations. Do not let the questions asked in the problems limit what you do. The questions are only starting points, and frequently you will be able to think of your own questions.

The following problem is an example of the kind of problems that will be posed in the following chapters. Note its similarity to the questions posed in Section 1.1. Although most of the simulations that we will do will be on the kind of physical systems that you will encounter in other physics courses, we will consider simulations in related areas, such as traffic flow, small world networks, and economics. Of course, unless you already know how to do simulations, you will have to study the following chapters so that you will be able to do problems like the following.

Problem 1.1. Distribution of money

The distribution of income in a society, $f(m)$, behaves as $f(m) \propto m^{-1-\alpha}$, where m is the income (money) and the exponent α is between 1 and 2. The quantity $f(m)$ can be taken to be the number of people who have an amount of money between m and m + ∆m. This power law behavior of the income distribution is often referred to as Pareto's law or the 80/20 rule (20% of the people have 80% of the income) and was proposed in the late 1800s by Vilfredo Pareto, an economist and sociologist.
In the following, we consider some simple models of a closed economy to determine the relation between the microdynamics and the resulting macroscopic distribution of money.

a. Suppose that N agents (people) can exchange money in pairs. For simplicity, we assume that all the agents are initially assigned the same amount of money $m_0$, and the agents are then allowed to interact. At each time step, a pair of agents i and j with money $m_i$ and $m_j$ is randomly chosen and a transaction takes place. Again for simplicity, let us assume that $m_i \to m_i'$ and $m_j \to m_j'$ by a random reassignment of their total amount of money, $m_i + m_j$, such that

$$m_i' = \epsilon(m_i + m_j) \tag{1.1a}$$
$$m_j' = (1 - \epsilon)(m_i + m_j), \tag{1.1b}$$

where $\epsilon$ is a random number between 0 and 1. Note that this reassignment ensures that the agents have no debt after the transaction; that is, they are always left with an amount $m \geq 0$. Simulate this model and determine the distribution of money among the agents after the system has relaxed to an equilibrium state. Choose N = 100 and $m_0 = 1000$.

b. Now let us ask what happens if the agents save a fraction λ of their money before the transaction. We write

$$m_i' = m_i + \delta m \tag{1.2a}$$
$$m_j' = m_j - \delta m, \tag{1.2b}$$

where

$$\delta m = (1 - \lambda)[\epsilon m_j - (1 - \epsilon)m_i]. \tag{1.2c}$$

Modify your program so that this savings model is implemented. Consider λ = 0.25, 0.50, 0.75, and 0.9. For some of the values of λ, as many as $10^7$ transactions will need to be considered. Does the form of $f(m)$ change for λ > 0?
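One possible shape of the transaction step is sketched below (a sketch of our own, not the text's solution; the class name, method name, and the choice λ = 0.5 in main are arbitrary). Setting λ = 0 in Eq. (1.2c) recovers the random reassignment of part (a).

import java.util.Arrays;
import java.util.Random;

public class MoneyDynamics {
  static Random rnd = new Random();

  // One transaction between two randomly chosen agents, Eqs. (1.1)-(1.2).
  static void transact(double[] m, double lambda) {
    int i = rnd.nextInt(m.length);
    int j = rnd.nextInt(m.length);
    if (i == j) return;                      // need two distinct agents
    double eps = rnd.nextDouble();           // random number between 0 and 1
    double dm = (1 - lambda)*(eps*m[j] - (1 - eps)*m[i]);  // Eq. (1.2c)
    m[i] += dm;                              // Eq. (1.2a)
    m[j] -= dm;                              // Eq. (1.2b)
  }

  public static void main(String[] args) {
    double[] m = new double[100];            // N = 100 agents
    Arrays.fill(m, 1000.0);                  // each starts with m0 = 1000
    for (int t = 0; t < 10000000; t++) {     // 10^7 transactions
      transact(m, 0.5);                      // savings fraction lambda = 0.5
    }
    // histogram m to estimate the equilibrium distribution f(m)
  }
}

Because each transaction conserves $m_i + m_j$, the total amount of money is fixed, which is a useful check as the program runs.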
Appendix 1A: Laboratory reports

Laboratory reports should be written clearly, with proper grammar and correct spelling. Write in a manner that can be understood by another person who has not done the research. In the following, we give a suggested format for your reports.

Introduction. Briefly summarize the nature of the physical system, the basic numerical method or algorithm, and the interesting or relevant questions.

Method. Describe the algorithm and how it is implemented in the program. In some cases this explanation can be given in the program itself. Give a typical listing of your program. Simple modifications of the program can be included in an appendix if necessary. The program should include your name and the date and be annotated in a way that is as self-explanatory as possible. Be sure to discuss any important features of your program.

Verification of program. Confirm that your program is not incorrect by considering special cases and by giving at least one comparison to a hand calculation or known result.

Data. Show the results of some typical runs in graphical or tabular form. Additional runs can be included in an appendix. All runs should be labeled, and all tables and figures must be referred to in the body of the text. Each figure and table should have a caption with complete information, for example, the value of the time step.

Analysis. In general, the analysis of your results will include a determination of qualitative and quantitative relationships between variables and an estimation of numerical accuracy.

Interpretation. Summarize your results and explain them in simple physical terms whenever possible. Specific questions that were raised in the assignment should be addressed here. Also give suggestions for future work or possible extensions. It is not necessary to answer every part of each question in the text.

Critique. Summarize the important physical concepts for which you gained a better understanding and discuss the numerical or computer techniques you learned. Make specific comments on the assignment and suggestions for improvements or alternatives.

Log. Keep a log of the time spent on each assignment and include it with your report.

References and suggestions for further reading

Programming

We list some of our favorite Java programming books here. There are also many useful online tutorials.

Joshua Bloch, Effective Java (Addison–Wesley, 2001). This excellent book is for advanced Java programmers and should be read after you have become familiar with Java.

Rogers Cadenhead and Laura Lemay, Teach Yourself Java in 21 Days, 4th ed. (Sams, 2004). An inexpensive self-study guide that uses a step-by-step tutorial approach to cover the basics.

Stephen J. Chapman, Java for Engineers and Scientists, 2nd ed. (Prentice Hall, 2004).

Wolfgang Christian, Open Source Physics: A User's Guide with Examples (Addison–Wesley, 2006). This guide is a useful supplement to our text.

Bruce Eckel, Thinking in Java, 3rd ed. (Prentice Hall, 2003). This text discusses the finer points of object-oriented programming and is recommended after you have become familiar with Java.

David Flanagan, Java in a Nutshell, 5th ed. (O'Reilly, 2005) and Java Examples in a Nutshell, 3rd ed. (O'Reilly, 2004). A fast-paced Java tutorial for those who already know another programming language.

Brian D. Hahn and Katherine M. Malan, Essential Java for Scientists and Engineers (Butterworth–Heinemann, 2002).
Cay S. Horstmann and Gary Cornell, Core Java 2: Fundamentals and Core Java 2: Advanced Features, both 7th ed. (Prentice Hall, 2005). A two-volume set that covers all aspects of Java programming.

Patrick Niemeyer and Jonathan Knudsen, Learning Java, 2nd ed. (O'Reilly, 2002). A comprehensive introduction to Java that starts with HelloWorld and ends with a discussion of XML. The book contains many examples showing how the core Java API is used. This book is one of our favorites for beginning Java programmers. However, it might be intimidating to someone who does not have some familiarity with computers.

Sherry Shavor, Jim D'Anjou, Pat McCarthy, John Kellerman, and Scott Fairbrother, The Java Developer's Guide to Eclipse (Addison–Wesley Professional, 2003). A good reference for the open source Eclipse development environment. Check for new versions because Eclipse is evolving rapidly.

General References on Physics and Computers

Richard E. Crandall, Projects in Scientific Computation (Springer–Verlag, 1994).

Paul L. DeVries, A First Course in Computational Physics (John Wiley & Sons, 1994).

Alejandro L. Garcia, Numerical Methods for Physics, 2nd ed. (Prentice Hall, 2000). Matlab, C++, and Fortran are used.

Neil Gershenfeld, The Nature of Mathematical Modeling (Cambridge University Press, 1998).

Nicholas J. Giordano and Hisao Nakanishi, Computational Physics, 2nd ed. (Prentice Hall, 2005).

Dieter W. Heermann, Computer Simulation Methods in Theoretical Physics, 2nd ed. (Springer–Verlag, 1990). A discussion of molecular dynamics and Monte Carlo methods directed toward advanced undergraduate and beginning graduate students.

David Landau and Kurt Binder, A Guide to Monte Carlo Simulations in Statistical Physics, 2nd ed. (Cambridge University Press, 2005). The authors emphasize the complementary nature of simulation to theory and experiment.

Rubin H. Landau, A First Course in Scientific Computing (Princeton University Press, 2005).

P. Kevin MacKeown, Stochastic Simulation in Physics (Springer, 1997).

Tao Pang, Computational Physics (Cambridge University Press, 1997).

Franz J. Vesely, Computational Physics, 2nd ed. (Plenum Press, 2002).

Michael M. Woolfson and Geoffrey J. Pert, Introduction to Computer Simulation (Oxford University Press, 1999).

Other References

Ruth Chabay and Bruce Sherwood, Matter & Interactions (John Wiley & Sons, 2002). This two-volume text uses computer models written in VPython to present topics not typically discussed in introductory physics courses.

H. Gould, "Computational physics and the undergraduate curriculum," Computer Physics Communications 127 (1), 6–10 (2000).

Brian Hayes, "g-OLOGY," Am. Scientist 92 (3), 212–216 (2004). Discusses the g-factor of the electron and the importance of algebraic and numerical calculations.

Marco Patriarca, Anirban Chakraborti, and Kimmo Kaski, "Gibbs versus non-Gibbs distributions in money dynamics," Physica A 340, 334–339 (2004). Problem 1.1 is based on this paper.

Douglass E. Post and Lawrence G. Votta, "Computational science demands a new paradigm," Physics Today 58 (1), 35–41 (2005). An interesting article on the future of computational science that raises many interesting questions.

Ross L. Spencer, "Teaching computational physics as a laboratory sequence," Am. J. Phys. 73, 151–153 (2005).

Chapter 2

Tools for Doing Simulations

We introduce some of the core syntax of Java in the context of simulating the motion of falling particles near the Earth's surface.
A simple algorithm for solving first-order differential equations numerically is also discussed.

2.1 Introduction

If you were to take a laboratory-based course in physics, you would soon be introduced to the oscilloscope. You would learn the function of many of the knobs, how to read the display, and how to connect various devices so that you could measure various quantities. If you did not know already, you would learn about voltage, current, impedance, and AC and DC signals. Your goal would be to learn how to use the oscilloscope. In contrast, you would learn only a little about the inner workings of the oscilloscope. The same approach can easily be adopted with an object-oriented language such as Java. If you are new to programming, you will learn how to make Java do what you want, but you will not learn everything about Java. In this chapter, we will present some of the essential syntax of Java and introduce the Open Source Physics library, which will facilitate writing programs with a graphical user interface and visual output such as plots and animations.

One of the ways that science progresses is by making models. If the model is sufficiently detailed, we can determine its behavior and then compare the behavior with experiment. This comparison might lead to verification of the model, changes in the model, and further simulations and experiments. In the context of computer simulation, we usually begin with a set of initial conditions, determine the dynamical behavior of the model numerically, and generate data in the form of tables of numbers, plots, and animations. We begin with a simple example to see how this process works.

Imagine a particle such as a ball near the surface of the Earth subject to a single force, the force of gravity. We assume that air friction is negligible and that the gravitational force is given by

F_g = −mg,  (2.1)

where m is the mass of the ball and g = 9.8 N/kg is the gravitational field (force per unit mass) near the Earth's surface. To make our example as simple as possible, we first assume that there is only vertical motion. We use Newton's second law to find the motion of the ball,

m d²y/dt² = F,  (2.2)

where y is the vertical coordinate defined so that up is positive, t is the time, F is the total force on the ball, and m is the inertial mass [which is the same as the gravitational mass in (2.1)]. If we set F = F_g, (2.1) and (2.2) lead to

d²y/dt² = −g.  (2.3)

Equation (2.3) is a statement of a model for the motion of the ball. In this case the model is in the form of a second-order differential equation.

You are probably familiar with the model summarized in (2.3) and know the analytic solution:

y(t) = y(0) + v(0)t − (1/2)gt²  (2.4a)
v(t) = v(0) − gt.  (2.4b)

Nevertheless, we will determine the motion of a freely falling particle numerically in order to introduce the tools that we will need in a familiar context.

We begin by expressing (2.3) as two first-order differential equations:

dy/dt = v  (2.5a)
dv/dt = −g,  (2.5b)

where v is the vertical velocity of the ball. We next approximate the derivatives by small (finite) differences:

[y(t + ∆t) − y(t)]/∆t = v(t)  (2.6a)
[v(t + ∆t) − v(t)]/∆t = −g.  (2.6b)

Note that in the limit ∆t → 0, (2.6) reduces to (2.5). We can rewrite (2.6) as

y(t + ∆t) = y(t) + v(t)∆t  (2.7a)
v(t + ∆t) = v(t) − g∆t.  (2.7b)

The finite difference approximation we used to obtain (2.7) is an example of the Euler algorithm. Equation (2.7) is an example of a finite difference equation, and ∆t is the time step.
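Before turning to the program, it is worth doing one Euler step by hand. The numbers below are ours, using the initial conditions that appear in Listing 2.1 below: y(0) = 10, v(0) = 0, and ∆t = 0.01. Applying (2.7) once gives

y(0.01) = 10 + (0)(0.01) = 10
v(0.01) = 0 − (9.8)(0.01) = −0.098.

The second step then uses the new velocity, so that y(0.02) = 10 + (−0.098)(0.01) = 9.99902. Note that y does not change at all during the first step, because the Euler algorithm uses the velocity at the beginning of each interval; this observation is the point of Exercise 2.1 below and motivates the improved algorithms considered later.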
Now we are ready to follow y(t) and v(t) in time. We begin with initial values for y and v and then iterate (2.7). If ∆t is sufficiently small, we will obtain a numerical answer that is close to the solution of the original differential equations in (2.5). In this case we know the answer, and we can test our numerical results directly.

Exercise 2.1. A simple example

Consider the first-order differential equation

dy/dx = f(x),  (2.8)

where f(x) is a function of x. The approximate solution as given by the Euler algorithm is

y_{n+1} = y_n + f(x_n)∆x.  (2.9)

Note that the rate of change of y has been approximated by its value at the beginning of the interval, f(x_n).

(a) Suppose that f(x) = 2x and y(x = 0) = 0. The analytic solution is y(x) = x², which we can confirm by taking the derivative of y(x). Convert (2.8) into a finite difference equation using the Euler algorithm. For simplicity, choose ∆x = 0.1. It would be a good idea to first use a calculator or pencil and paper to determine y_n for the first several steps.

(b) Sketch the difference between the exact solution and the approximate solution given by the Euler algorithm. What condition would the rate of change f(x) have to satisfy for the Euler algorithm to give the exact answer?

Problem 2.2. Invent your own numerical algorithm

As we have mentioned, the Euler algorithm evaluates the rate of change of y by its value at the beginning of the interval, f(x_n). The choice of where to approximate the rate of change of y during the interval from x to x + ∆x is arbitrary, although we will learn that some choices are better than others. All that is required is that the finite difference equation reduce to the original differential equation in the limit ∆x → 0. Think of several other algorithms that are consistent with this condition.

2.2 Simulating Free Fall

The source code for the class FirstFallingBallApp shown in Listing 2.1 is defined in a file named FirstFallingBallApp.java. The code consists of a sequence of statements that create variables and define methods. Each statement ends with a semicolon. Each source code file is compiled into byte code that can then be executed. The compiler places the byte code in a file with the same name as the Java source code file with the extension class. For example, the compiler converts FirstFallingBallApp.java into byte code and produces the FirstFallingBallApp.class file. One of the features of Java is that this byte code can be used by any computer that can run Java programs.

A Java application is a class that contains a main method. The following application is an implementation of the Euler algorithm given in (2.7). The program also compares the numerical and analytic results. We will next describe the syntax used in each line of the program.

Listing 2.1: First version of a simulation of a falling particle.

 1  // example of a single line comment statement (ignored by compiler)
 2  package org.opensourcephysics.sip.ch02; // location of file
 3  // beginning of class definition
 4  public class FirstFallingBallApp {
 5    // beginning of method definition
 6    public static void main(String[] args) {
 7      // braces { } used to group statements
 8      // indent statements within a block so that
 9      // they can be easily identified
10      // following statements form the body of the main method
11      // example of declaration and assignment statement
12      double y0 = 10;
13      double v0 = 0;    // initial velocity
14      double t = 0;     // time
15      double dt = 0.01; // time step
16      double y = y0;
17      double v = v0;
18      double g = 9.8;   // gravitational field
19      // beginning of loop; n++ is equivalent to n = n + 1
20      for (int n = 0; n < 100; n++) {
21        // repeat following three statements 100 times
22        y = y + v*dt; // indent statements in loop for clarity
23        v = v - g*dt; // use Euler algorithm
24        t = t + dt;
25      } // end of for loop
26      System.out.println("Results");
27      System.out.println("final time = " + t);
28      // display numerical result
29      System.out.println("y = " + y + " v = " + v);
30      // display analytic result
31      double yAnalytic = y0 + v0*t - 0.5*g*t*t;
32      double vAnalytic = v0 - g*t;
33      System.out.println("analytic y = " + yAnalytic + " v = " + vAnalytic);
34    } // end of method definition
35  } // end of class definition

The first line in Listing 2.1 is an example of a single line comment statement. Comment statements are ignored by the computer but can be very important for the reader. Multiple line comments begin with /* and end with */. Javadoc comments begin with /**, but they have been removed from the code listings in the book to save space. Download the source code from comPADRE to view the complete code with documentation.

The second line in Listing 2.1 declares a package name, which corresponds to the location (directory) of the source and byte code files. According to the package declaration, the file FirstFallingBallApp.java is in the directory org/opensourcephysics/sip/ch02. The package statement must be the first noncomment statement in the source file. For organizational convenience, it is a good idea to put related files in the same package. When executing a Java program, the Java Virtual Machine (the run-time environment) will search a specific set of directories (called the classpath) for the relevant class files. The documentation for your local development environment will describe how to specify the classpath.

The third line in Listing 2.1 declares the class name, FirstFallingBallApp. The Java convention is to begin a class name with an uppercase letter. If a name consists of more than one word, the words are joined together, and each succeeding word begins with an uppercase letter (another Java convention). The keyword public means that this class can be used by any other Java class. Braces are used to delimit a block of code. The left brace {, after the name of the class, begins the body of the class definition, and the corresponding right brace }, on line 35 at the end of the listing, ends the class definition.

The fourth line in Listing 2.1 begins the definition of the main method. A method describes a sequence of actions that use the associated data and can be called (invoked) within the class or by other classes. The main method has a special status in Java. To run a class as a stand-alone program (an application), the class must define the main method. (In contrast, a Java applet runs inside a browser and does not require a main method; instead, it has methods such as init and start.)
The main method is the application's starting point. The argument of the main method will always be the same, and understanding its syntax is not necessary here. Because the code for this book contains hundreds of classes, we will adopt our own convention that classes that define main methods have names that end with App. We sometimes refer to an application that we are about to run as the target class. Familiarize yourself with your Java development environment by doing Exercise 2.3.

Exercise 2.3. Our first application

(a) Enter the listing of FirstFallingBallApp into a source file named FirstFallingBallApp.java. (Java programs can be written using any text editor that supports standard ASCII characters.) Be sure to pay attention to capitalization because Java is case sensitive. In what directory should you place the source file?

(b) Compile and run FirstFallingBallApp. Do the results look reasonable to you? In what directory did the compiler place the byte code?

Digital computers represent numbers in base 2, that is, as sequences of ones and zeros. Each one or zero is called a bit. For example, the number 13 is equivalent to 1101 or (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰). It would be difficult to write a program if we had to write numbers in base 2. Computer languages allow us to reference memory locations using identifiers or variable names. A valid variable name is a series of characters consisting of letters, digits, underscores, and dollar signs ($) that does not begin with a digit nor contain any spaces. Because Java distinguishes between uppercase and lowercase characters, T and t are different variable names. The Java convention is that variable names begin with a lowercase letter, except in special cases, and each succeeding word in a variable name begins with an uppercase letter.

In a purely object-oriented language, all variables would be objects introduced by their class definitions. However, certain variable types are so common that they have a special status and are especially easy to create and access. These types are called primitive data types and represent integer, floating point, boolean, and character variables. An example that illustrates that classes are effectively new programmer-defined types is given in Appendix 2A. An integer variable, a floating point variable, a boolean variable, and a character variable are created and initialized by the following statements:

int n = 10;
double y0 = 10.0;
boolean inert = true;
char c = 'A'; // used for single characters

There are four integer types, byte, short, int, and long, and two floating point types, float and double; the types differ in the range of numbers that they can store. We will almost always use type int because it does not require as much memory as type long, and we will always use type double, the floating point type with greater precision, to minimize roundoff error and to avoid having to provide multiple versions of various algorithms. A variable must be declared before it can be used, and it can be initialized at the same time that its type is declared, as is done in Listing 2.1.

Integer arithmetic is exact, in contrast to floating point arithmetic, which is limited by the maximum number of decimal places that can be stored. Important uses of integers are as counters in loops and as indices of arrays. An example of the latter is on page 38, where we discuss the motion of many balls.
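The difference between exact integer arithmetic and finite precision floating point arithmetic is easy to demonstrate at the keyboard. The following short program is our own illustration (the class name RoundoffApp is ours, not from the text); the values in the comments are what Java actually prints.

public class RoundoffApp { // our own demonstration class
  public static void main(String[] args) {
    int sumInt = 1 + 2 + 3; // integer arithmetic is exact
    System.out.println(sumInt == 6); // displays true
    double sumDouble = 0.1 + 0.2; // floating point arithmetic has roundoff error
    System.out.println(sumDouble); // displays 0.30000000000000004
    System.out.println(sumDouble == 0.3); // displays false
    // compare floating point numbers using a tolerance instead of ==
    System.out.println(Math.abs(sumDouble - 0.3) < 1e-12); // displays true
  }
}

Because of such roundoff, it is usually a mistake to test two floating point numbers for exact equality; test instead whether their difference is smaller than a suitable tolerance.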
A subtle and common error is to use integers in division when a floating point number is needed. For example, suppose we flip a coin 100 times and find 53 heads. What is the percentage of heads? In the following we show an unintended side effect of integer division and several ways of obtaining a floating point number from an integer.

int heads = 53;
int tosses = 100;
double percentage = heads/tosses;   // percentage will equal 0
percentage = (double) heads/tosses; // percentage will equal 0.53
percentage = (1.0*heads)/tosses;    // percentage will equal 0.53

These statements indicate that if at least one number is a double, the result of the division will be a double. The expression (double) heads is called a cast and converts heads to a double. Because a number with a decimal point is treated as a double, we can also do this conversion by first multiplying heads by 1.0, as is done in the last statement.

Note that we have used the assignment operator, which is the equal (=) sign. This operator assigns the value to the memory location that is associated with a variable, such as y0 and t. The following statements illustrate an important difference between the equal sign in mathematics and the assignment operator in most programming languages.

int x = 10;
x = x + 1;

The equal sign replaces a value in memory and is not a statement of equality. The left and right sides of an assignment operator are usually not equal.

A statement is analogous to a complete sentence, and an expression is similar to a phrase. The simplest expressions are identifiers or variables. More interesting expressions can be created by combining variables using operators, such as the following example of the plus (+) operator:

x + 3.0

Lines 12 through 18 of Listing 2.1 declare and initialize variables. If an instance variable is declared but not initialized, for example,

double dt;

then the default value of the variable is 0 for numbers and false for boolean variables. (Local variables declared inside a method have no default value and must be initialized before they are used.) It is a good idea to initialize all variables explicitly and not rely on their default values.

A very useful control structure is the for loop in line 20 of Listing 2.1. Loops are blocks of statements that are executed repeatedly until some condition is satisfied. They typically require the initialization of a counter variable, a test to determine if the counter variable has reached its terminal value, and a rule for changing the counter variable. These three parts of the for loop are contained within parentheses and are separated by semicolons. It is common in Java to iterate from 0 to 99, as is done in Listing 2.1, rather than from 1 to 100. Note the use of the ++ operator in the loop construct rather than the equivalent statement n = n + 1. It is important to indent all the statements within a block so that they can be easily identified. Java ignores these spaces, but they are important visual cues to the structure of the program.

After the program finishes the loop, the result is displayed using the System.out.println method. We will explain the meaning of this syntax later. The parameter passed to this method, which appears between the parentheses, is a String. A String is a sequence of characters and can be created by enclosing text in quotation marks, as shown in the first println statement in Listing 2.1. We displayed our numerical results by using the + operator.
When the + operator is applied to a String and a number, the number is converted to the appropriate String, and the two strings are concatenated (joined). This use is shown in the next three println statements in lines 27, 29, and 33 of Listing 2.1. Note the different outputs produced by the following two statements:

System.out.println(("x = " + 2) + 3);  // displays x = 23
System.out.println("x = " + (2 + 3)); // displays x = 5

The parentheses in the second line force the compiler to treat the enclosed + operator as the addition operator, but both + operators in the first line are treated as concatenation operators.

Exercise 2.4. Exploring FirstFallingBallApp

(a) Run FirstFallingBallApp for various values of the time step ∆t. Do the numerical results become closer to the analytic results as ∆t is made smaller?

(b) Use an acceptable value for ∆t and run the program for various values of the number of iterations. What criteria do you have for acceptable? At approximately what time does the ball hit the ground at y = 0?

(c) What happens if you replace the System.out.println method by the System.out.print method?

(d) What happens if you try to access the value of the counter variable n outside the for loop? The scope of n extends from its declaration to the end of the loop block; n is said to have block scope. If a loop variable is not needed outside the loop, it should be declared in the initialization expression so that its scope is limited.

You might have found that doing Exercise 2.4 was a bit tedious and frustrating. To do Exercise 2.4(a) it would be desirable to change the number of iterations at the same time as the value of ∆t so that we could compare the results for y and v at the same final time. And it is difficult to do Exercise 2.4(b) because we don't know in advance how many iterations are needed to reach the ground. For starters we can improve FirstFallingBallApp by using a while statement instead of the for loop:

while (y > 0) {
  // statements go here
}

In this example the boolean test for the while statement is done at the beginning of each pass through the loop. It is also possible to do the test at the end:

do {
  // statements go here
} while (y > 0);

Exercise 2.5. Using while statements

Modify FirstFallingBallApp so that a while statement is used and the program ends when the ball hits the ground at y = 0. Then repeat Exercise 2.4(b).

Exercise 2.6. Summing a series

(a) Write a program to sum the following series for a given value of N:

S = Σ_{m=1}^{N} 1/m².  (2.10)

The following statements may be useful:

double sum = 0; // sum is equivalent to S in (2.10)
for (int m = 1; m <= N; m++) {
  sum = sum + 1.0/(m*m); // put this statement in the for loop
}

Note that in this case it is more convenient to start the loop from m = 1 instead of m = 0. Also note that we have not followed the Java convention, because we have used the variable name N instead of n so that the Java statements look more like the mathematical equations.

(b) First run your program with N = 10. Then run for larger values of N. Does the series converge as N → ∞? What value of N is needed to obtain S to within two decimal places?

(c) Modify your program to use a while loop so that the summation continues until the added term is less than some value ε. Run your program for ε = 10^-2, 10^-3, and 10^-6.
(d) Instead of using the = operator in the statement

sum = sum + 1.0/(m*m);

use the equivalent shortcut operator:

sum += 1.0/(m*m);

Check that you obtain the same results.

Java provides several shortcut assignment operators that allow you to combine an arithmetic and an assignment operation. Table 2.1 shows the operators that we will use most often.

Operator | Operand | Description | Sample Expression | Result
++, -- | number | increment, decrement | x++; | 8.0 stored in x
+, - | numbers | addition, subtraction | 3.5 + x | 11.5
! | boolean | logical complement | !(x == y) | true
= | any | assignment | y = 3; | 3.0 stored in y
*, /, % | numbers | multiplication, division, modulus | 7/2 | 3
== | any | test for equality | x == y | false
+= | numbers | x += 3; is equivalent to x = x + 3; | x += 3; | 14.5 stored in x
-= | numbers | x -= 2.3; is equivalent to x = x - 2.3; | x -= 2.3; | 12.2 stored in x
*= | numbers | x *= 4; is equivalent to x = 4*x; | x *= 4; | 48.8 stored in x
/= | numbers | x /= 2; is equivalent to x = x/2; | x /= 2; | 24.4 stored in x
%= | numbers | x %= 5; is equivalent to x = x % 5; | x %= 5; | 4.4 stored in x

Table 2.1: Common operators. The result for each row assumes that the statements from previous rows have been executed, with double x = 7, y = 3 declared initially. The mod or modulus operator % computes the remainder after division by an integer.

2.3 Getting Started with Object-Oriented Programming

The first step in making our program more object-oriented is to separate the implementation of the model from the implementation of other programming tasks such as producing output. In general, we will do so by creating two classes. The class that defines the model is shown in Listing 2.2. The FallingBall class first declares several (instance) variables and one constant that can be used by any method in the class. To aid reusability, we need to be very careful about the accessibility of these class variables to other classes. For example, if we were to write private double dt, then the value of dt would be available only to the methods in FallingBall. If we wrote public double dt, then dt would be available to any class in any package that tried to access it. For our purposes we will use the default package protection, which means that the instance variables can be accessed by classes in the same package.

Listing 2.2: FallingBall class.

package org.opensourcephysics.sip.ch02;

public class FallingBall {
  double y, v, t; // instance variables
  double dt;      // default package protection
  final static double g = 9.8;

  public FallingBall() { // constructor
    System.out.println("A new FallingBall object is created.");
  }

  public void step() {
    y = y + v*dt; // Euler algorithm for numerical solution
    v = v - g*dt;
    t = t + dt;
  }

  public double analyticPosition(double y0, double v0) {
    return y0 + v0*t - 0.5*g*t*t;
  }

  public double analyticVelocity(double v0) {
    return v0 - g*t;
  }
}

As we will see, a class is a blueprint for creating objects, not an object itself. Except for the constant g, all the variable declarations in Listing 2.2 are instance variables. Each time an object is created or instantiated from the class, a separate block of memory is set aside for the instance variables. Thus, two objects created from the same class will, in general, have different values of the instance variables. We can ensure that the value of a variable is the same for all objects created from the class by adding the word static to the declaration.
Such a variable is called a class variable and is appropriate for the constant g. In addition, you might not want the quantity referred to by an identifier to change. For example, g is a constant of nature. We can prevent a change by adding the keyword final to the declaration. Thus the statement

final static double g = 9.8;

means that a single copy of the constant g will be created and shared among all the objects instantiated from the class. Without the final qualifier, we could change the value of a class variable in every instantiated object by changing it in any one object. Static variables and methods are accessed from another class using the class name without first creating an instance (see page 25).

Another Java convention is that the names of constants should be in uppercase. But in physics g, the gravitational field, and G, the gravitational constant, have completely different meanings. So we will disregard this convention when doing so makes our programs more readable.

We have used certain words such as double, false, static, and final. These reserved words cannot be used as variable names and are examples of keywords.

In addition to the four instance variables y, v, t, and dt, and one class variable g, the FallingBall class has four methods. The first method is FallingBall, a special method known as the constructor. A constructor must have the same name as the class and does not have an explicit return type. We will see that constructors allocate memory and initialize instance variables when an object is created. The second method is step, a name that we will frequently use to advance a system's coordinates by one time step. The qualifier void means that this method does not return a value. The next two methods, analyticPosition and analyticVelocity, each return a double value and have arguments enclosed by parentheses, the parameter list. The list of parameters and their types must be given explicitly and be separated by commas. The parameters can be primitive data types or class types. When a method is invoked, the argument types must match those given in the definition or be convertible into the types given in the definition, but the arguments need not have the same names. (Convertible means that the given variable can be unambiguously converted into another data type. For example, an integer can always be converted into a double.) For example, we can write

double y0 = 10; // declaration and assignment
int v0 = 0;     // note that v0 is an integer
// v0 becomes a double before the method is called
double y = analyticPosition(y0, v0);
double v = analyticVelocity(v0);

but the following statements are incorrect:

// can't convert a String to a double automatically
double y = analyticPosition(y0, "0");
// method expects only one argument
double v = analyticVelocity(v0, 0);

If a method does not receive any parameters, the parentheses are still required, as in the method step().

The FallingBall class in Listing 2.2 cannot be used in isolation because it does not contain a main method. Thus, we create a target class, which we place in a separate file in the same package. This class will communicate with FallingBall and include the output statements. It is shown in Listing 2.3.

Listing 2.3: FallingBallApp class.
// package statement appears before the beginning of the class definition
package org.opensourcephysics.sip.ch02;

// beginning of class definition
public class FallingBallApp {
  // beginning of method definition
  public static void main(String[] args) {
    // declaration and instantiation
    FallingBall ball = new FallingBall();
    // example of declaration and assignment statement
    double y0 = 10;
    double v0 = 0;
    // note use of dot operator to access instance variables
    ball.t = 0;
    ball.dt = 0.01;
    ball.y = y0;
    ball.v = v0;
    while (ball.y > 0) {
      ball.step();
    }
    System.out.println("Results");
    System.out.println("final time = " + ball.t);
    // displays numerical results
    System.out.println("y = " + ball.y + " v = " + ball.v);
    // displays analytic results
    System.out.println("analytic y = " + ball.analyticPosition(y0, v0));
    System.out.println("analytic v = " + ball.analyticVelocity(v0));
    System.out.println("acceleration = " + FallingBall.g);
  } // end of method definition
} // end of class definition

Note how FallingBall is declared and instantiated by creating an object called ball, and how the instance variables and the methods are accessed. The statement

FallingBall ball = new FallingBall(); // declaration and instantiation

is equivalent to two statements:

FallingBall ball;         // declaration
ball = new FallingBall(); // instantiation

The declaration statement tells the compiler that the variable ball is of type FallingBall. It is analogous to the statement int x for an integer variable. The new operator allocates memory for this object, initializes all the instance variables, and invokes the constructor. We can create two identical balls using the following statements:

FallingBall ball1 = new FallingBall();
FallingBall ball2 = new FallingBall();

The variables and methods of an object are accessed by using the dot operator. For example, the variable t of object ball is accessed by the expression ball.t, and the method step is called as ball.step(). Because the methods analyticPosition and analyticVelocity return values of type double, they can appear in any expression in which a double-valued constant or variable can appear. In the present context the values returned by these two methods will be displayed by the println statements. Note that the static variable g in class FallingBallApp is accessed through the class name.

Exercise 2.7. Use of two classes

(a) Enter the listing of FallingBall into a file named FallingBall.java and FallingBallApp into a file named FallingBallApp.java, and put them in the same directory. Run your program and make sure your results are the same as those found in Exercise 2.5.

(b) Modify FallingBallApp by adding a second instance variable ball2 of the same type as ball. Add the necessary code to initialize ball2, iterate ball2, and display the results for both objects. Write your program so that the only difference between the two balls is the value of ∆t. How much smaller does ∆t have to be to reduce the error in the numerical results by a factor of two for the same final time? What about a factor of four? How does the error depend on ∆t?

(c) Add the statement FallingBall.g = 2.0 to your program from part (b) and use the same value of dt for ball and ball2. What happens when you try to compile the program?
(d) Delete the final qualifier for g in FallingBall, and recompile and run your program. Is there any difference between the results for the two balls? Is there a difference between the results compared to what you found for g = 9.8?

(e) Remove the qualifier static. Now g must be accessed using the object name, ball or ball2, instead of FallingBall. Recompile and run your program again. How do the results for the two balls compare now?

(f) Explain in your own words the meaning of the qualifiers static and final.

It is possible for a class to have more than one constructor. For example, we could have a second constructor defined by

public FallingBall(double dt) {
  // "this.dt" refers to the instance variable that has the
  // same name as the argument
  this.dt = dt;
}

Note the possible confusion between the variable name dt in the argument of the FallingBall constructor and the variable defined near the beginning of the FallingBall class. A variable that is passed to a method as an argument (parameter) or that is defined (created) within a method is known as a local variable. A variable that is defined outside of a method is known as an instance variable. Instance variables are more powerful than local variables because they can be referenced (used) anywhere within an object, and because their values are not lost when the execution of the method is finished. When a variable name conflict occurs, it is necessary to use the keyword this to access the instance variable. Otherwise, the program would access the local variable with the same name (here the argument).

Exercise 2.8. Multiple constructors

(a) Add a second constructor with the argument double dt to FallingBall, but make no other changes. Run your program. Nothing changed because you didn't use the new constructor.

(b) Now modify FallingBallApp to use the new constructor:

// declaration and instantiation
FallingBall ball = new FallingBall(0.01);

What statement in FallingBallApp can now be removed? Run your program and make sure it works. How can you tell that the new constructor was used?

(c) Show that the number of parameters and their types in the argument list determine which constructor is used in FallingBall. For example, show that the statements

double tau = 0.01;
// declaration and instantiation
FallingBall ball = new FallingBall(tau);

are equivalent to the syntax used in part (b).

It is easy to create additional models for other kinds of motion. Cut and paste the code in FallingBall into a new file named SHO.java, and change the code to solve the following two first-order differential equations for a ball attached to a spring:

dx/dt = v  (2.11a)
dv/dt = −(k/m)x,  (2.11b)

where x is the displacement from equilibrium and k is the spring constant. Note that the new class, shown in Listing 2.4, has a structure similar to that of the class shown in Listing 2.2.

Listing 2.4: SHO class.

package org.opensourcephysics.sip.ch02;

public class SHO {
  double x, v, t;
  double dt;
  double k = 1.0;               // spring constant
  double omega0 = Math.sqrt(k); // assume unit mass

  public SHO() { // constructor
    System.out.println("A new harmonic oscillator object is created.");
  }

  public void step() { // modified Euler algorithm
    v = v - k*x*dt;
    x = x + v*dt; // note that the updated v is used
    t = t + dt;
  }
  public double analyticPosition(double y0, double v0) {
    return y0*Math.cos(omega0*t) + v0/omega0*Math.sin(omega0*t);
  }

  public double analyticVelocity(double y0, double v0) {
    return -y0*omega0*Math.sin(omega0*t) + v0*Math.cos(omega0*t);
  }
}

Exercise 2.9. Simple harmonic oscillator

(a) Explain how the implementation of the Euler algorithm in the step method of class SHO differs from what we did previously.

(b) The general form of the analytic solution of (2.11) can be expressed as

y(t) = A cos ω₀t + B sin ω₀t,  (2.12)

where ω₀² = k/m. What is the form of v(t)? Show that (2.12) satisfies (2.11) with A = y(t = 0) and B = v(t = 0)/ω₀. These analytic solutions are used in class SHO.

(c) Write a target class called SHOApp that creates an SHO object and solves (2.11). Start the ball with displacements of x = 1, x = 2, and x = 4. Is the time it takes for the ball to reach x = 0 always the same?

The methods that we have written so far have been nonstatic methods (except for main). As we have seen, these methods cannot be used without first creating or instantiating an object. In contrast, static methods can be used directly without first creating an object. A class that is included in the core Java distribution and that we will use often is the Math class, which provides many common mathematical methods, including trigonometric, logarithmic, exponential, and rounding operations, and predefined constants. Some examples of the use of the Math class follow:

double theta = Math.PI/4;     // the constant pi is defined in the Math class
double u = Math.sin(theta);   // sine of theta
double v = Math.log(0.1);     // natural logarithm of 0.1
double w = Math.pow(10, 0.4); // 10 to the 0.4 power
double x = Math.atan(3.0);    // inverse tangent

Note the use of the dot notation in these statements and the Java convention that constants such as the value of π are written in uppercase letters, that is, Math.PI. Exercise 2.10 asks you to read the Math class documentation to learn about the methods in the Math class. To use these methods we need only to know what mathematical functions they compute; we do not need to know the details of how the methods are implemented.

Exercise 2.10. The Math class

The documentation for Java is a part of most development environments and can also be downloaded from the Java website. Look for the API documentation for the latest standard edition.

(a) Read the documentation of the Math class and describe the difference between the two versions of the arctangent method.

(b) Write a program to verify the output of several of the methods in the Math class.

2.4 Inheritance

The falling ball and the simple harmonic oscillator have important features in common. Both are models of physical systems that represent a physical object as if all its mass were concentrated at a single point. Writing two separate classes by cutting and pasting is straightforward and reasonable because the programs are small and easy to understand. But this approach fails when the code becomes more complex. For example, suppose that you wish to simulate a model of a liquid consisting of particles that interact with one another according to some specified force law. Because such simulations are now standard (see Chapter 8), efficient code for such simulations is available. In principle, it would be desirable to use an already written program, assuming that you understood the nature of such simulations. However, in practice, using someone else's program can require much effort if the code is not organized properly.
Fortunately, this situation is changing as more programmers learn object-oriented techniques and write their programs so that they can be used by others without needing to know the details of the implementation.

For example, suppose that you decided to modify an already existing program by changing to a different force law. You change the code and save it under a new name. Later you discover that you need a different numerical algorithm to advance the particles' positions and velocities. You again change the code and save the file under yet another name. At the same time, the original author discovers a bug in the initialization method and changes her code. Your code is now out of date because it does not contain the bug fix. Although strict documentation and programming standards can minimize these types of difficulties, a better approach is to use object-oriented features such as inheritance. Inheritance avoids duplication of code and makes it easier to debug a number of classes without needing to change each class separately.

We now write a new class that encapsulates the common features of the falling ball and the simple harmonic oscillator. We name this new class Particle. The falling ball and harmonic oscillator classes that we will define later implement their distinguishing features.

Listing 2.5: Particle class.

package org.opensourcephysics.sip.ch02;

abstract public class Particle {
  double y, v, t; // instance variables
  double dt;      // time step

  public Particle() { // constructor
    System.out.println("A new Particle is created.");
  }

  abstract protected void step();

  abstract protected double analyticPosition();

  abstract protected double analyticVelocity();
}

The abstract keyword allows us to define the Particle class without knowing how the step, analyticPosition, and analyticVelocity methods will be implemented. Abstract classes are useful in part because they serve as templates for other classes. An abstract class contains some, but not all, of what a user will need. By making the class abstract, we must express the abstract idea of "particle" explicitly and customize the abstract class to our needs.

By using inheritance we now extend the Particle class (the superclass) to another class (the subclass). The FallingParticle class shown in Listing 2.6 implements the three abstract methods. Note the use of the keyword extends. We also have used a constructor with the initial position and velocity as arguments.

Listing 2.6: FallingParticle class.

package org.opensourcephysics.sip.ch02;

public class FallingParticle extends Particle {
  final static double g = 9.8;   // constant
  private double y0 = 0, v0 = 0; // initial position and velocity

  public FallingParticle(double y, double v) { // constructor
    System.out.println("A new FallingParticle object is created.");
    this.y = y; // instance value set equal to passed value
    this.v = v; // instance value set equal to passed value
    y0 = y;     // no need to use "this" because there is only one y0
    v0 = v;
  }

  public void step() {
    y = y + v*dt; // Euler algorithm
    v = v - g*dt;
    t = t + dt;
  }

  public double analyticPosition() {
    return y0 + v0*t - (g*t*t)/2.0;
  }

  public double analyticVelocity() {
    return v0 - g*t;
  }
}

FallingParticle is a subclass of its superclass Particle.
Because the methods and data of the superclass are available to the subclass (except those that are explicitly labeled private), FallingParticle inherits the variables y, v, t, and dt.¹ We now write a target class to make use of our new abstraction. Note that we create a new FallingParticle but assign it to a variable of type Particle.

Listing 2.7: FallingParticleApp class.

package org.opensourcephysics.sip.ch02;

// beginning of class definition
public class FallingParticleApp {
  // beginning of method definition
  public static void main(String[] args) {
    // declaration and instantiation
    Particle ball = new FallingParticle(10, 0);
    ball.t = 0;
    ball.dt = 0.01;
    while (ball.y > 0) {
      ball.step();
    }
    System.out.println("Results");
    System.out.println("final time = " + ball.t);
    // numerical result
    System.out.println("y = " + ball.y + " v = " + ball.v);
    // analytic result
    System.out.println("y analytic = " + ball.analyticPosition());
  } // end of method definition
} // end of class definition

¹In this case Particle and FallingParticle must be in the same package. If FallingParticle were in a different package, it would be able to access these variables only if they were declared protected or public.

Problem 2.11. Inheritance

(a) Run the FallingParticleApp class. How can you tell that the constructor of the superclass was called?

(b) Rewrite the SHO class so that it is a subclass of Particle. Remove all unnecessary variables and implement the abstract methods.

(c) Write the target class SHOParticleApp to use the new SHOParticle class. Use the analyticPosition and analyticVelocity methods to compare the accuracy of the numerical and analytic answers in both the falling particle and harmonic oscillator models.

(d) Try to instantiate a Particle directly by calling the Particle constructor. Explain what happens when you compile this program.

If you examine the console output in Problem 2.11a, you should find that whenever an object from the subclass is instantiated, the constructor of the superclass is executed as well as the constructor of the subclass. You will also find that an abstract class cannot be instantiated directly; it must be extended first.

Exercise 2.12. Extending classes

(a) Extend the FallingParticle and SHOParticle classes and give them names such as FallingParticleEC and SHOParticleEC, respectively. These subclasses should redefine the step method so that it first calculates the new velocity and then calculates the new position using the new velocity, that is,

public void step() {
  v = v - g*dt; // falling ball
  y = y + v*dt;
  t = t + dt;
}

public void step() {
  v = v - k*x*dt; // harmonic oscillator
  x = x + v*dt;
  t = t + dt;
}

Methods can be redefined (overridden) in a subclass by writing a new method in the subclass definition with the same name and parameter list as in the superclass definition.

(b) Confirm that your new step method is executed instead of the one in the superclass.

(c) The algorithm implemented in the redefined step method is known as the Euler–Cromer algorithm. Compare the accuracy of this algorithm to that of the original Euler algorithm for both the falling particle and the harmonic oscillator. We will explore the Euler–Cromer algorithm in more detail in Problem 3.1.
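To make the mechanics of overriding concrete, here is a minimal sketch of what the FallingParticleEC subclass of Exercise 2.12(a) might look like. The class name comes from the exercise, but the body is ours, so treat it as a hint rather than as the book's implementation. Note that constructors are not inherited in Java, so the subclass must define a constructor that passes the initial values to its superclass with super.

package org.opensourcephysics.sip.ch02;

public class FallingParticleEC extends FallingParticle {

  public FallingParticleEC(double y, double v) {
    super(y, v); // constructors are not inherited; pass the initial values along
  }

  public void step() { // Euler-Cromer: update v first, then use the new v
    v = v - g*dt;
    y = y + v*dt;
    t = t + dt;
  }
}

With this class in place, changing new FallingParticle(10, 0) to new FallingParticleEC(10, 0) in Listing 2.7 is the only modification needed to run the new algorithm, because a FallingParticleEC is still a Particle.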
The falling particle and harmonic oscillator programs are simple, but they demonstrate important object-oriented concepts. However, we typically will not build our models using inheritance, because our focus is on the physics and not on producing a software library, and also because readers will not use our programs in the same order. We will find that our main use of inheritance will be to extend abstract classes in the Open Source Physics library to implement calculations and simulations by customizing a small number of methods.

So far our target classes have included only one method, main. We could have used more than one method, but for the short demonstration and test programs we have written so far, such a practice is unnecessary. When you send a short email to a friend, you are not likely to break up your message into paragraphs. But when you write a paper longer than about half a page, it makes sense to use more than one paragraph. The same sensitivity to the need for structure should be used in programming. Most of the programs in the following chapters will consist of two classes, each of which will have several instance variables and methods.

Figure 2.1: An Open Source Physics control that is used to input parameter values and display results.

2.5 The Open Source Physics Library

For each exercise in this chapter, you have had to change the program, compile it, and then run it. It would be much more convenient to input initial conditions and values for the parameters without having to recompile. However, a discussion of how to make input fields and buttons using Java would distract us from our goal of learning how to simulate physical systems. Moreover, the code we would use for input (and output) would be almost the same in every program. For this reason input and output should be in separate classes so that we can easily use them in all our programs. Our emphasis will be to describe how to use the Open Source Physics library as a tool for writing graphical interfaces, plotting graphs, and doing visualizations. If you are interested, you can read the source code of the many Open Source Physics classes and can modify or subclass them to meet your special needs.

We first introduce the Open Source Physics library in several simple contexts. Download the library from the comPADRE website and include it in your development environment. The following program illustrates how to make a simple plot.

Listing 2.8: An example of a simple plot.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.frames.PlotFrame;

public class PlotFrameApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("x", "sin(x)/x", "Plot example");
    for (int i = 1; i <= 100; i++) {
      double x = i*0.2;
      frame.append(0, x, Math.sin(x)/x);
    }
    frame.setVisible(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

The import statement tells the Java compiler where to find the Open Source Physics classes that are needed. A frame is often referred to as a window and can include a title and a menu bar, as well as objects such as buttons, graphics, and text information. The Open Source Physics frames package defines several frames that contain data visualization and analysis tools. We will use the PlotFrame class to plot x-y data.
The constructor for PlotFrame has three arguments, corresponding to the name of the horizontal axis, the name of the vertical axis, and the title of the plot. To add data to the plot, we use the append method. The first argument of append is an integer that labels a particular set of data points, the second argument is the horizontal (x) value of the data point, and the third argument is the vertical (y) value. The setVisible(true) method makes the frame appear on the screen or brings it to the front. The last statement makes the program exit when the frame is closed. What happens when this statement is not included?

The example from the Open Source Physics library in Listing 2.9 illustrates how to control a calculation with two buttons, determine the value of an input parameter, and display the result in the text message area.

Listing 2.9: An example of a Calculation.

package org.opensourcephysics.sip.ch02;
// gets all classes in the controls package
import org.opensourcephysics.controls.*;

public class CalculationApp extends AbstractCalculation {

  public void calculate() { // does a calculation
    control.println("Calculation button pressed.");
    // string must match the argument of setValue
    double x = control.getDouble("x value");
    control.println("x*x = " + (x*x));
    control.println("random = " + Math.random());
  }

  public void reset() {
    // describes a parameter and sets its value
    control.setValue("x value", 10.0);
  }

  // creates a calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new CalculationApp());
  }
}

AbstractCalculation is an abstract class, which, as we have seen, means that it cannot be instantiated directly and must be extended; in particular, you must write (implement) the calculate method. You can also write an optional reset method, which is called whenever the Reset button is clicked. Finally, we need to create a graphical user interface that will invoke methods when the Calculate and Reset buttons are clicked. This user interface is an object of type CalculationControl:

CalculationControl.createApp(new CalculationApp());

The method createApp is a static method that instantiates an object of type CalculationControl and returns this object. We could have written

CalculationControl control = CalculationControl.createApp(new CalculationApp());

which shows explicitly the returned object, which we have named control. However, because we do not use the object control explicitly in the main method, we do not need to declare an object name for it.

Exercise 2.13. CalculationApp

Compile and run CalculationApp. Describe what the graphical user interface looks like and how it works by clicking the buttons (see Figure 2.1).

The reset method is called automatically when a program is first created and whenever the Reset button is clicked. The purpose of this method is to clear old data and recreate the initial state with the default values of the parameters and instance variables. The default values of the parameters are displayed in the control window so that they can be changed by the user. An example of how to show values in a control follows:

public void reset() {
  // describes a parameter and sets its value
  control.setValue("x value", 10.0);
}
The string appearing in the setValue method must be identical to the one appearing in the getDouble method. If you write your own reset method, it will override the reset method that is already defined in the AbstractCalculation superclass. After the reset method stores the parameters in the control, the user can edit the parameters, and we can later read these parameters in the calculate method:

public void calculate() {
  // string must match argument of setValue
  double x = control.getDouble("x value");
  control.println("x*x = " + (x*x));
}

Exercise 2.14. Changing parameters

(a) Run CalculationApp to see how the control window can be used to change a program's parameters. What happens if the string in the getDouble method does not match the string in the setValue method?

(b) Incorporate the plot statements in Listing 2.8 into a class that extends the AbstractCalculation class and plots the function sin(kx) for various values of the input parameter k.

When you run the modified CalculationApp in Exercise 2.14, you should see a window with two buttons and an input parameter and its default value. Also, there should be a text area below the buttons where messages can appear. When the Calculate button is clicked, the calculate method is executed. The control.getDouble method reads in values from the control window. These values can be changed by the user. Then the calculation is performed and the result displayed in the message area using the control.println method, similar to the way we used System.out.println earlier. If the Reset button is clicked, the message area is cleared and the reset method is called.

We will now use a CalculationControl to change the input parameters for a falling particle. The modified FallingParticleApp is shown in Listing 2.10.

Listing 2.10: FallingParticleCalcApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;

// beginning of class definition
public class FallingParticleCalcApp extends AbstractCalculation {
  public void calculate() {
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // sets initial conditions
    Particle ball = new FallingParticle(y0, v0);
    // reads parameter and sets dt
    ball.dt = control.getDouble("dt");
    while(ball.y>0) {
      ball.step();
    }
    control.println("final time = "+ball.t);
    // displays numerical results
    control.println("y = "+ball.y+" v = "+ball.v);
    // displays analytic position
    control.println("analytic y = "+ball.analyticPosition());
    // displays analytic velocity
    control.println("analytic v = "+ball.analyticVelocity());
  }

  public void reset() {
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // creates a calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticleCalcApp());
  }
} // end of class definition

Exercise 2.15. Input of parameters and initial conditions

(a) Run FallingParticleCalcApp and make sure you understand how the control works. Try inputting different values of the parameters and the initial conditions.
(b) Vary ∆t and find the value of t when y = 0 to two decimal places.

Exercise 2.16. Displaying floating point numbers

Double precision numbers store approximately 16 significant digits, and every digit is included when the number is converted to a string. We can reduce the number of digits that are displayed using the DecimalFormat class in the java.text package. A formatter is created using a pattern, such as #0.00 or #0.00E0, and this format is applied to a number to produce a string.

import java.text.DecimalFormat;

DecimalFormat decimal2 = new DecimalFormat("#0.00");
double x = 1.0/3.0;
System.out.println("x = "+decimal2.format(x)); // displays x = 0.33

(a) Use the DecimalFormat class to modify the output from FallingParticleCalcApp so that it matches the output shown in Figure 2.1.

(b) Modify the output so that results are shown using scientific notation with three decimal places.

(c) The Open Source Physics ControlUtils class in the controls package contains a static method f3 that formats a floating point number using three decimal places. Use this method to format the output from FallingParticleCalcApp.

You probably have found that it is difficult to write a program so that it ends exactly when the falling ball is at y = 0. We could write the program so that ∆t keeps changing near y = 0 so that the last value computed is at y = 0. Another limitation of the programs that we have written so far is that they show the results only at the end of the calculation. We could put println statements inside the while loop, but it would be better to plot the results and have a table of the data. An example is shown in Listing 2.11.

Listing 2.11: FallingParticlePlotApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class FallingParticlePlotApp extends AbstractCalculation {
  PlotFrame plotFrame = new PlotFrame("t", "y", "Falling Ball");

  public void calculate() {
    // data not cleared at beginning of each calculation
    plotFrame.setAutoclear(false);
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // sets initial conditions
    Particle ball = new FallingParticle(y0, v0);
    // gets parameters
    ball.dt = control.getDouble("dt");
    double t = ball.t; // gets value of time from ball object
    while(ball.y>0) {
      ball.step();
      plotFrame.append(0, ball.t, ball.y);
      plotFrame.append(1, ball.t, ball.analyticPosition());
    }
  }

  public void reset() {
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // sets up calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticlePlotApp());
  }
}

The two data sets, indexed by 0 and 1, correspond to the numerical data and the analytic results, respectively. The default action in the Open Source Physics library is to clear the data and redraw the data frames when the Calculate button is clicked. This automatic clearing of data can be disabled using the setAutoclear method. We have disabled it here to allow the user to compare the results of multiple calculations. Data is automatically cleared when the Reset button is clicked.

Exercise 2.17. Data output

(a) Run FallingParticlePlotApp. Under the Views menu choose DataTable to see a table of data corresponding to the plot. You can copy this data and use it in another program for further analysis.

(b) Your plotted results probably look like one set of data because the numerical and analytic results are similar. Let dt = 0.1 and click the Calculate button. Does the discrepancy between the numerical and analytic results become larger with increasing time? Why?

(c) Run the program for two different values of dt. How do the plot and the table of data differ when the two runs are done without clicking Reset between them, and when Reset is clicked between calculations? Make sure you look at the entire table to see the difference. When is the data cleared? What happens if you eliminate the plotFrame.setAutoclear(false) statement? When is the data cleared now?

(d) Modify your program so that the velocity is shown in a separate window from the position (see the sketch following this exercise).
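One way to approach part (d) is to add a second PlotFrame for the velocity alongside the position plot. The following fragment is a minimal sketch, not the book's solution; the name velocityFrame and its placement inside FallingParticlePlotApp are our assumptions.

// a second frame declared as an instance variable (assumed name)
PlotFrame velocityFrame = new PlotFrame("t", "v", "Falling Ball: velocity");

// inside the while loop of calculate():
velocityFrame.append(0, ball.t, ball.v);                  // numerical velocity
velocityFrame.append(1, ball.t, ball.analyticVelocity()); // analytic velocity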
2.6 Animation and Simulation

The AbstractCalculation class provides a structure for doing a single computation for a fixed amount of time. However, frequently we do not know how long we want to run a program, and it would be desirable if the user could intervene at any time. In addition, we would like to be able to visualize the results of a simulation and do an animation. Doing so involves a programming construct called a thread. Threads enable a program to execute statements independently of each other, as if they were run on separate processors (which would be the case on a multiprocessor computer). We will use one thread to update the model and display the results. The other thread, the event thread, will monitor the keyboard and mouse so that we can stop the computation whenever we desire.

The AbstractSimulation class provides a structure for doing simulations by performing a series of computations (steps) that can be started and stopped by the user using a graphical user interface. You do not need to know anything about threads because their use is "hidden" in the AbstractSimulation class. However, it is good to know that the Open Source Physics library is written so that the graphical user interface does not let us change a program's input parameters while the simulation is running.

Most of the programs in the text will be written by extending the AbstractSimulation class and implementing the doStep method, as shown in Listing 2.12. Just as the AbstractCalculation class uses a graphical user interface of type CalculationControl, the AbstractSimulation class uses one of type SimulationControl. This graphical user interface has three buttons whose labels change depending on the user's actions. As was the case with CalculationControl, the buttons in SimulationControl invoke specific methods.

Listing 2.12: A simple example of the extension of the AbstractSimulation class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.AbstractSimulation;
import org.opensourcephysics.controls.SimulationControl;

public class SimulationApp extends AbstractSimulation {
  int counter = 0;

  public void doStep() { // does a simulation step
    control.println("Counter = "+(counter--));
  }

  public void initialize() {
    counter = control.getInt("counter");
  }

  public void reset() { // invoked when Reset button is pressed
    // allows counter to be changed after initialization
    control.setAdjustableValue("counter", 100);
  }

  public static void main(String[] args) {
    // creates a simulation structure using this class
    SimulationControl.createApp(new SimulationApp());
  }
}
getInt ( "counter" ) ; } public void reset ( ) { / / invoked when r e s e t button i s pressed / / allows dt to be changed a f t e r i n i t i a l i z a t o n control . setAdjustableValue ( "counter" , 100); } public s t a t i c void main ( String [ ] args ) { / / c r e a t e s a simulation s t r u c t u r e using t h i s c l a s s SimulationControl . createApp (new SimulationApp ( ) ) ; } } Exercise 2.18. AbstractSimulation class Run SimulationApp and see how it works by clicking the buttons. Explain the role of the various buttons. How many times per second is the doStep method invoked when the simulation is running? The buttons in the SimulationControl that were used in SimulationApp in Listing 2.12 invoke methods in the AbstractSimulation class. These methods start and stop threads and perform other housekeeping chores. When the user clicks the Initialize button, the simulation’s Initialize method is executed. When the Reset button is clicked, the reset method is executed. If you don’t write your own versions of these two methods, their default versions will CHAPTER 2. TOOLS FOR DOING SIMULATIONS 36 be used. After the Initialize button is clicked, it becomes the Start button. After the Start button is clicked, it is replaced by a Stop button, and the doStep method is invoked continually until the Stop button is clicked. The default is that the frames are redrawn every time doStep is executed. Clicking the Step button will cause the doStep method to be executed once. The New button changes the Start button to an Initialize button, which forces the user to initialize a new simulation before restarting. Later we will learn how to add other buttons that give the user even more control over the simulation. A typical simulation needs to (1) specify the initial state of the system in the initialize method, (2) tell the computer what to execute while the thread is running in the doStep method, and (3) specify what state the system should return to in the reset method. We could modify the falling particle model to use AbstractSimulation, but such a modification would not be very interesting because there is only one particle and all motion takes place in one dimension. Instead, we will define a new class that models a ball moving in two dimensions, and we will allow the ball to bounce off the ground and off of the walls. Listing 2.13: BouncingBall class. package org . opensourcephysics . sip . ch02 ; import org . opensourcephysics . display . Circle ; / / C i r c l e i s a c l a s s that can draw i t s e l f public class BouncingBall extends Circle { final s t a t i c double g = 9 . 8 ; final s t a t i c double WALL = 10; / / i n i t i a l p o s i t i o n and v e l o c i t y private double x , y , vx , vy ; public BouncingBall ( double x , double vx , double y , double vy ) { this . x = x ; / / s e t s instance value equal to passed value this . vx = vx ; / / s e t s instance value equal to passed value this . y = y ; this . vy = vy ; / / s e t s the p o s i t i o n using setXY in C i r c l e s u p e r c l a s s setXY ( x , y ) ; } public void step ( double dt ) { x = x+vx dt ; / / Euler algorithm f o r numerical s o l u t i o n y = y+vy dt ; vy = vy−g dt ; i f ( x>WALL) { vx = −Math . abs ( vx ) ; / / bounce o f f r i g h t wall } else i f ( x<−WALL) { vx = Math . abs ( vx ) ; / / bounce o f f l e f t wall } i f ( y<0) { vy = Math . abs ( vy ) ; / / bounce o f f f l o o r } setXY ( x , y ) ; } } To model the bounce of the ball off a wall, we have added statements such as CHAPTER 2. 
This statement ensures that the ball will move up if y < 0 and is a crude implementation of an elastic collision. (The Math.abs method returns the absolute value of its argument.)

Note our first use of the if statement. The general form of an if statement is as follows:

if(boolean_expression) {
  // code executed if boolean expression is true
} else {
  // code executed if boolean expression is false
}

We can test multiple conditions by chaining if statements:

if(boolean_expression) {
  // code goes here
} else if(boolean_expression) {
  // code goes here
} else {
  // code goes here
}

If the first boolean expression is true, then only the statements within the first brace will be executed. If the first boolean expression is false, then the second boolean expression in the else if expression will be tested, and so forth. If there is an else expression, then the statements after it will be executed if all the other boolean expressions are false. If there is only one statement to execute, the braces are optional.

The BouncingBall class is similar to the FallingBall class except that it extends Circle. We inherit from the Circle class because this class includes a simple method that allows the object to draw itself in an Open Source Physics frame called DisplayFrame, which we will use in BouncingBallApp. In the latter we instantiate BouncingBall and DisplayFrame objects so that the circle will be drawn at its x-y location when the frame is displayed or while a simulation is running. To make the animation more interesting, we will animate the motion of many noninteracting balls with random initial velocities. BouncingBallApp creates an arbitrary number of noninteracting bouncing balls by creating an array of BouncingBall objects.

Listing 2.14: BouncingBallApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class BouncingBallApp extends AbstractSimulation {
  // declares and instantiates a window to draw balls
  DisplayFrame frame = new DisplayFrame("x", "y", "Bouncing Balls");
  BouncingBall[] ball; // declares an array of BouncingBall objects
  double time, dt;

  public void initialize() {
    // sets boundaries of window in world coordinates
    frame.setPreferredMinMax(-10.0, 10.0, 0, 10);
    time = 0;
    frame.clearDrawables(); // removes old particles
    int n = control.getInt("number of balls");
    int v = control.getInt("speed");
    // instantiates array of n BouncingBall objects
    ball = new BouncingBall[n];
    for(int i = 0; i<n; i++) {
      // ...
    }
    // ...
  }
  // ...
}

The Complex class defines two constructors that are distinguished by their parameter lists. The constructor with two arguments allows us to initialize the values of the instance variables. Notice how the class encapsulates (hides) both the data and the methods that characterize a complex number. That is, we can use the Complex class without any knowledge of how its methods are implemented or how its data is stored. The general features of this class definition are as before. The variables real and imag are the instance variables of the class Complex.
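The Complex class listing itself is not included above. As a guide, here is a minimal sketch consistent with the description: two constructors, instance variables real and imag, add and multiply methods that return new Complex objects, and a toString method. It is not necessarily identical to the book's listing.

public class Complex {
  double real = 0, imag = 0; // instance variables

  public Complex() {} // default constructor

  public Complex(double real, double imag) {
    this.real = real;
    this.imag = imag;
  }

  public Complex add(Complex c) {
    Complex sum = new Complex(); // sum is a local variable
    sum.real = real+c.real;
    sum.imag = imag+c.imag;
    return sum;
  }

  public Complex multiply(Complex c) {
    Complex product = new Complex();
    product.real = real*c.real-imag*c.imag;
    product.imag = real*c.imag+imag*c.real;
    return product;
  }

  public String toString() {
    if(imag>=0) {
      return real+" + i"+Math.abs(imag);
    } else {
      return real+" - i"+Math.abs(imag);
    }
  }
}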
In contrast, the variable sum in the method add is a local variable because it can be accessed only within the method in which it is defined.

The most important new feature of the Complex class is that the add and multiply methods return new Complex objects. One reason we need to return a variable of type Complex is that a method returns (at most) a single value; we cannot return both sum.real and sum.imag. More importantly, we want the sum of two complex numbers to also be of type Complex so that we can add a third complex number to the result. Note also that we have defined add and multiply so that they do not change the values of the instance variables of the numbers to be added, but instead create a new complex number that stores the result.

Exercise 2.22. Complex numbers

Another way to represent complex numbers is by their magnitude and phase, |z|e^{iθ}. If z = a + ib, then

|z| = √(a² + b²)    (2.13a)

and

θ = arctan(b/a).    (2.13b)

(a) Write methods to get the magnitude and phase of a complex number, getMagnitude and getPhase, respectively. Add test code to invoke these methods. Be sure to check the phase in all four quadrants.

(b) Create a new class named ComplexPolar that stores a complex number as a magnitude and phase. Define methods for this class so that it behaves the same as the Complex class. Test this class using the code for ComplexApp.

This example of the Complex class illustrates the nature of objects, their limitations, and the tradeoffs that enter into design choices. Because accessing an object requires more computer time than accessing primitive variables, it is faster to represent a complex number by two doubles corresponding to its real and imaginary parts. Thus N complex data points could be represented by an array of 2N doubles, with the first N values corresponding to the real values. Considerations of computational speed are important only if complex data types are used extensively.

References and Suggestions for Further Reading

By using the Open Source Physics library, we have hidden most of the Java code needed to use threads and have only touched on the graphical capabilities of Java. See Open Source Physics: A User's Guide with Examples for a description of additional details on how threads and the other Open Source Physics tools are implemented and used. The source code for all the programs in the text and the Open Source Physics library can be downloaded.

There are many good books on Java graphics and Java threads. We list a few of our favorites in the following.

David M. Geary, Graphic Java: Vol. 2, Swing, 3rd ed. (Prentice Hall, 1999).

Jonathan Knudsen, Java 2D Graphics (O'Reilly, 1999).

Scott Oaks and Henry Wong, Java Threads, 3rd ed. (O'Reilly, 2004).

Chapter 3

Simulating Particle Motion

We discuss several numerical methods needed to simulate the motion of particles using Newton's laws and introduce interfaces, an important Java construct that makes it possible for unrelated objects to declare that they perform the same methods.

3.1 Modified Euler algorithms

To motivate the need for a general differential equation solver, we first discuss why the simple Euler algorithm is insufficient for many problems. The Euler algorithm assumes that the velocity and acceleration do not change significantly during the time step ∆t. Thus, to achieve an acceptable numerical solution, the time step ∆t must be chosen to be sufficiently small. However, if we make ∆t too small, we run into several problems.
As we do more and more iterations, the round-off error due to the finite precision of any floating point number accumulates, and eventually the numerical results become inaccurate. Also, the greater the number of iterations, the greater the computer time required for the program to finish. In addition to these problems, the Euler algorithm is unstable for many systems, which means that the errors accumulate exponentially, and thus the numerical solution becomes inaccurate very quickly. For these reasons, more accurate and stable numerical algorithms are necessary.

To illustrate why we need algorithms other than the simple Euler algorithm, we make a very simple change in the Euler algorithm and write

v(t + ∆t) = v(t) + a(t)∆t    (3.1a)
y(t + ∆t) = y(t) + v(t + ∆t)∆t    (3.1b)

where a is the acceleration. The only difference between this algorithm and the simple Euler algorithm,

v(t + ∆t) = v(t) + a(t)∆t    (3.2a)
y(t + ∆t) = y(t) + v(t)∆t    (3.2b)

is that the computed velocity at the end of the interval, v(t + ∆t), is used to compute the new position, y(t + ∆t), in (3.1b). As we found in Problem 2.12 and will see in more detail in Problem 3.1, this modified Euler algorithm is significantly better for oscillating systems. We refer to this algorithm as the Euler–Cromer algorithm.

Problem 3.1. Comparing Euler algorithms

(a) Write a class that extends Particle and models a simple harmonic oscillator for which F = -kx. For simplicity, choose units such that k = 1 and m = 1. Determine the numerical error in the position of the simple harmonic oscillator after the particle has evolved for several cycles. Is the original Euler algorithm stable for this system? What happens if you run for longer times?

(b) Repeat part (a) using the Euler–Cromer algorithm. Does this algorithm work better? If so, in what way?

(c) Modify your program so that it computes the total energy, E_sho = v²/2 + x²/2. How well is the total energy conserved for the two algorithms? Also consider the quantity Ẽ = E_sho + (∆t/2)xv. What is the behavior of this quantity for the Euler–Cromer algorithm?

Perhaps it has occurred to you that it would be better to compute the velocity at the middle of the interval rather than at the beginning or at the end. The Euler–Richardson algorithm is based on this idea. This algorithm is particularly useful for velocity-dependent forces, but it does as well as other simple algorithms for forces that do not depend on the velocity. The algorithm consists of using the Euler algorithm to find the intermediate position y_mid and velocity v_mid at time t_mid = t + ∆t/2. We then compute the force F(y_mid, v_mid, t_mid) and the acceleration a_mid at t = t_mid. The new position y_{n+1} and velocity v_{n+1} at time t_{n+1} are found using v_mid and a_mid and the Euler algorithm. We summarize the Euler–Richardson algorithm as

a_n = F(y_n, v_n, t_n)/m    (3.3a)
v_mid = v_n + (1/2)a_n ∆t    (3.3b)
y_mid = y_n + (1/2)v_n ∆t    (3.3c)
a_mid = F(y_mid, v_mid, t_n + (1/2)∆t)/m    (3.3d)

and

v_{n+1} = v_n + a_mid ∆t    (3.4a)
y_{n+1} = y_n + v_mid ∆t.    (Euler–Richardson algorithm) (3.4b)

Although we need to do twice as many computations per time step, the Euler–Richardson algorithm is much faster than the Euler algorithm because we can make the time step larger and still obtain better accuracy than with either the Euler or Euler–Cromer algorithms. A derivation of the Euler–Richardson algorithm is given in Appendix 3A.
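As a concrete illustration of (3.3) and (3.4), here is a minimal sketch of a step method for the falling particle, assuming the instance variables y, v, t, dt, and the constant g from FallingParticle are accessible to the subclass; it is a sketch toward Exercise 3.2, not the book's listing.

public class FallingParticleRichardson extends FallingParticle {
  public FallingParticleRichardson(double y, double v) {
    super(y, v);
  }

  public void step() { // one Euler–Richardson time step
    double a = -g;            // (3.3a): acceleration at the start of the step
    double vmid = v+0.5*a*dt; // (3.3b): velocity at the midpoint
    double amid = -g;         // (3.3d): the force here is independent of y, v, and t
    v = v+amid*dt;            // (3.4a)
    y = y+vmid*dt;            // (3.4b)
    t = t+dt;
  }
}

For free fall the acceleration is constant, so amid equals a; the structure matters when the force depends on the position, the velocity, or the time.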
Exercise 3.2. The Euler–Richardson algorithm

(a) Extend FallingParticle in Listing 2.6 to a new class that implements the Euler–Richardson algorithm. All you need to do is write a new step method.

(b) Use ∆t = 0.08, 0.04, 0.02, and 0.01 and determine the error in the computed position when the particle hits the ground. How do your results compare with the Euler algorithm? How does the error in the velocity depend on ∆t for each algorithm?

(c) Repeat part (b) for the simple harmonic oscillator and compute the error after several cycles.

As we gain more experience simulating various physical systems, we will learn that no single algorithm for solving Newton's equations of motion numerically is superior under all conditions. The Open Source Physics library includes classes that can be used to solve systems of coupled first-order differential equations using different algorithms. To understand how to use this library, we first discuss interfaces and then arrays.

3.2 Interfaces

We have seen how to combine data and methods into a class. A class definition encapsulates this information in one place, thereby simplifying the task of the programmer who needs to modify the class and of the user who needs to understand or use the class. Another tool for data abstraction is known as an interface. An interface specifies methods that an object performs but does not implement these methods. In other words, an interface describes the behavior or functionality of any class that implements it. Because an interface is not tied to a given class, any class can implement any particular interface as long as it defines all the methods specified by the interface. An important reason for interfaces is that a class can inherit from only one superclass, but it can implement more than one interface.

An example of an interface is the Function interface in the numerics package:

public interface Function {
  public double evaluate(double x);
}

The interface contains one method, evaluate, with one argument but no body. Notice that the definition uses the keyword interface rather than the keyword class. We can define a class that encapsulates a quadratic polynomial as follows:

public class QuadraticPolynomial implements Function {
  double a, b, c;

  public QuadraticPolynomial(double a, double b, double c) {
    this.a = a;
    this.b = b;
    this.c = c;
  }

  public double evaluate(double x) {
    return a*x*x + b*x + c;
  }
}

Quadratic polynomials can now be instantiated and used as needed.

Function f = new QuadraticPolynomial(1, 0, 2);
for(int x = 0; x<10; x++) {
  System.out.println("x = " + x + " f(x) = " + f.evaluate(x));
}

By using the Function interface, we can write methods that use this mathematical abstraction. For example, we can program a simple plot as follows:

public void plotFunction(Function f, double xmin, double xmax) {
  PlotFrame frame = new PlotFrame("x", "y", "Function");
  int n = 100; // number of points in plot
  double x = xmin, dx = (xmax-xmin)/(n-1);
  for(int i = 0; i<n; i++) {
    frame.append(0, x, f.evaluate(x));
    x += dx;
  }
  frame.setVisible(true); // displays frame on screen
}

We can also compute a numerical derivative based on the definition of the derivative found in calculus textbooks.

public double derivative(Function f, double x, double dx) {
  return (f.evaluate(x+dx)-f.evaluate(x))/dx;
}

This way of approximating a derivative is not optimal, but that is not the point here. (A better approximation is given in Problem 3.8.) The important point is that the interface enables us to define the abstract concept y = f(x) and to write code that uses this abstraction.
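For instance, the Gaussian of Exercise 3.3 below can be encapsulated in the same way. The following is a minimal sketch that assumes the parameters a and b are set in the constructor; it is one possible approach, not the book's solution.

import org.opensourcephysics.numerics.Function;

public class Gaussian implements Function {
  double a, b;

  public Gaussian(double a, double b) {
    this.a = a;
    this.b = b;
  }

  public double evaluate(double u) {
    return a*Math.exp(-b*u*u); // f(u) = a e^{-b u^2}
  }
}

A Gaussian object can then be passed to plotFunction or derivative just like a QuadraticPolynomial.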
Exercise 3.3. Function interface

(a) Define a class that encapsulates the function f(u) = a e^{-bu²}.

(b) Write a test program that plots f(u) with b = 1 and b = 4. Choose a = 1 for simplicity.

(c) Write a test program that plots the derivatives of the functions used in part (b) without using the analytic expression for the derivative.

Although interfaces are very useful for developing large-scale software projects, you will not need to define interfaces to do the problems in this book. However, you will use several interfaces, including the Function interface, that are defined in the Open Source Physics library. We describe two of the more important interfaces in the following sections.

3.3 Drawing

An interface that we will use often is the Drawable interface:

package org.opensourcephysics.display;
import java.awt.*;

public interface Drawable {
  public void draw(DrawingPanel panel, Graphics g);
}

Notice that this interface contains only one method, draw. Objects that implement this interface are rendered in a DrawingPanel after they have been added to a DisplayFrame. As we saw in Chapter 2, a DisplayFrame consists of components including a title bar, menu, and buttons for minimizing and closing the frame. The DisplayFrame contains a DrawingPanel on which graphical output is displayed. The Graphics class contains methods for drawing simple geometric objects such as lines, rectangles, and ovals on the panel. In Listing 3.1 we define a class that draws a rectangle using pixel-based coordinates.

Listing 3.1: PixelRectangle.

package org.opensourcephysics.sip.ch03;
import java.awt.*; // uses Abstract Window Toolkit
import org.opensourcephysics.display.*;

public class PixelRectangle implements Drawable {
  int left, top;     // position of rectangle in pixels
  int width, height; // size of rectangle in pixels

  PixelRectangle(int left, int top, int width, int height) {
    this.left = left; // location of left edge
    this.top = top;   // location of top edge
    this.width = width;
    this.height = height;
  }

  public void draw(DrawingPanel panel, Graphics g) {
    // this method implements the Drawable interface
    g.setColor(Color.RED);                // sets drawing color to red
    g.fillRect(left, top, width, height); // draws rectangle
  }
}

In the draw method we used fillRect, a primitive method in the Graphics class. This method draws a filled rectangle using pixel coordinates with the origin at the top left corner of the panel. To use PixelRectangle, we instantiate an object and add it to a DisplayFrame as shown in Listing 3.2.

Listing 3.2: Listing of DrawingApp.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class DrawingApp extends AbstractCalculation {
  DisplayFrame frame = new DisplayFrame("x", "y", "Graphics");

  public DrawingApp() {
    frame.setPreferredMinMax(0, 10, 0, 10);
  }

  public void calculate() {
    // gets rectangle location
    int left = control.getInt("xleft");
getInt ( "xleft" ) ; int top = control . getInt ( "ytop" ) ; / / g e t s r e c t a n g l e dimensions int width = control . getInt ( "width" ) ; CHAPTER 3. SIMULATING PARTICLE MOTION 50 int height = control . getInt ( "height" ) ; Drawable rectangle = new PixelRectangle ( left , top , width , height ) ; frame . addDrawable ( rectangle ) ; / / frame i s automatically rendered a f t e r Calculate button / / i s c l i c k e d } public void reset ( ) { / / removes drawables added by the user frame . clearDrawables ( ) ; / / s e t s d e f a u l t input values control . setValue ( "xleft" , 60); control . setValue ( "ytop" , 70); control . setValue ( "width" , 100); control . setValue ( "height" , 150); } / / c r e a t e s a c a l c u l a t i o n c o n t r o l s t r u c t u r e using t h i s c l a s s public s t a t i c void main ( String [ ] args ) { CalculationControl . createApp (new DrawingApp ( ) ) ; } } Note that multiple rectangles are drawn in the order that they are added to the drawing panel. Rectangles or portions of rectangles may be hidden because they are outside the drawing panel. Although it is possible to use pixel-based drawing methods to produce visualizations, creating even a simple graph in such an environment would require much tedious programming. The DrawingPanel object passed to the draw method simplifies this task by defining a system of world coordinates that enable us to specify the location and size of various objects in physical units rather than pixels. In the WorldRectangle class in Listing 3.3, methods from the DrawingPanel class are used to convert pixel coordinates to world coordinates. The range of the world coordinates in the horizontal and vertical directions is defined in the frame.setPreferredMinMax method in DrawingApp. (This method is not needed if pixel coordinates are used.) Listing 3.3: WorldRectangle illustrates the use of world coordinates. package org . opensourcephysics . sip . ch03 ; import java . awt . ; import org . opensourcephysics . display . ; public class WorldRectangle implements Drawable { double left , top ; / / p o s i t i o n of r e c t a n g l e in world c o o r d i n a t e s double width , height ; / / s i z e of r e c t a n g l e in world units public WorldRectangle ( double left , double top , double width , double height ) { this . l e f t = l e f t ; / / l o c a t i o n of l e f t edge this . top = top ; / / l o c a t i o n of top edge this . width = width ; this . height = height ; } public void draw ( DrawingPanel panel , Graphics g ) { / / This method implements the Drawable i n t e r f a c e CHAPTER 3. SIMULATING PARTICLE MOTION 51 g . setColor ( Color .RED) ; / / s e t drawing c o l o r to red / / converts from world to p i x e l c o o r d i n a t e s int l e f t P i x e l s = panel . xToPix ( l e f t ) ; int topPixels = panel . yToPix ( top ) ; int widthPixels = ( int ) ( panel . getXPixPerUnit ( ) width ) ; int heightPixels = ( int ) ( panel . getYPixPerUnit ( ) height ) ; / / draws r e c t a n g l e g . f i l l R e c t ( l e f t P i x e l s , topPixels , widthPixels , heightPixels ) ; } } Exercise 3.4. Simple graphics (a) Run DrawingApp and test how the different inputs change the size and location of the rectangle. Note that the pixel coordinates that are obtained from the control window are not the same as the world coordinates that are displayed. (b) Read the documentation at for the Graphics class, and modify the WorldRectangle class to draw lines, filled ovals, and strings of characters. 
Although simple geometric shapes such as circles and rectangles are often all that are needed to visualize many physical models, Java provides a drawing environment based on the Java 2D Application Programming Interface (API), which can render arbitrary geometric shapes, images, and text using composition and matrix-based transformations. We will use a subset of these features to define the DrawableShape and InteractiveShape classes in the display package of Open Source Physics, which we will introduce in Chapter 9. (See also the Open Source Physics User's Guide.)

So far we have created rectangles using two different classes. Each implementation of a Drawable rectangle defined a different draw method. Notice that in the display frame's definition of addDrawable in DrawingApp, the argument is specified to be the interface Drawable rather than a specific class. Any class that implements Drawable can be an argument of addDrawable. Without the interface construct, we would need to write an addDrawable method for each type of class.

3.4 Specifying The State of a System Using Arrays

Imagine writing the code for the numerical solution of the motion of three particles in three dimensions using the Euler–Richardson algorithm. The resulting code would be tedious to write. In addition, for each problem we would need to write and debug new code to implement the numerical algorithm. The complications become worse for better algorithms, most of which are algebraically more complex. Moreover, the numerical solution of simple first-order differential equations is a well-developed part of numerical analysis, and thus there is little reason to worry about the details of these algorithms now that we know how they work. In Section 3.5 we will introduce an interface for solving the differential equations associated with Newton's equations of motion. Before we do so, we discuss a few features of arrays that we will need.

As we discussed on page 38, ordered lists of data are most easily stored in arrays. For example, if we have an array variable named x, then we can access its first element as x[0], its second element as x[1], and so on. All elements must be of the same data type, but they can be just about anything: primitive data types such as doubles or integers, objects, or even other arrays. The following statements show how arrays of primitive data types are declared and instantiated:

double[] x; // x declared to be an array of doubles
double x[]; // same meaning as double[] x
x = new double[32]; // x array created with 32 elements
// y array declared and created in one statement
double[] y = new double[32];
int[] num = new int[100]; // array of 100 integers
double[] x, y;   // preferred notation
double x[], y[]; // same meaning as double[] x, y
// array of doubles specified by two indices
double[][] sigma = new double[3][3];
// reference to first row of sigma array
double[] row = sigma[0];

We will adopt the syntax double[] x instead of double x[].
The array index starts at zero, and the largest index is one less than the number of elements. Note that Java supports multiple array indices by creating arrays of arrays. Although sigma[0][0] refers to a single value of type double in the sigma object, we can refer to an entire row of values in the sigma object using the syntax sigma[i]. As shown in Chapter 2, arrays can contain objects such as bouncing balls.

// array of two BouncingBall objects
BouncingBall[] ball = new BouncingBall[2];
ball[0] = new BouncingBall(0, 10.0, 0, 5.0);  // creates first ball
ball[1] = new BouncingBall(0, -13.0, 0, 7.0); // creates second ball

The first statement allocates an array of BouncingBall objects, each of which is initialized to null. We need to create each object in the array using the new operator.

The numerical solution of an ordinary differential equation (frequently called an ODE) begins by expressing the equation as several first-order differential equations. If the highest derivative in the ODE is of order n (for example, dⁿx/dtⁿ), then it can be shown that the ODE can be written equivalently as n first-order differential equations. For example, Newton's equation of motion is a second-order differential equation and can be written as two first-order differential equations for the position and velocity in each spatial dimension. In one dimension we can write

dy/dt = v(t)    (3.5a)
dv/dt = a(t) = F(t)/m.    (3.5b)

If we have more than one particle, there are additional first-order differential equations for each particle. It is convenient to have a standard way of handling all these cases. Let us assume that each differential equation is of the form

dx_i/dt = r_i(x_0, x_1, x_2, ..., x_{n-1}, t)    (3.6)

where x_i is a dynamical variable such as a position or a velocity. The rate function r_i can depend on any of the dynamical variables, including the time t. We will store the values of the dynamical variables in the state array and the values of the corresponding rates in the rate array. The following examples show the conventions we use:

// one particle in one dimension:
state[0] // stores x
state[1] // stores v
state[2] // stores t (time)

// one particle in two dimensions:
state[0] // stores x
state[1] // stores vx
state[2] // stores y
state[3] // stores vy
state[4] // stores t

// two particles in one dimension:
state[0] // stores x1
state[1] // stores v1
state[2] // stores x2
state[3] // stores v2
state[4] // stores t

Although the Euler algorithm does not assume any special ordering of the state variables, we adopt the convention that a velocity rate follows every position rate in the state array so that we can efficiently code the more sophisticated numerical algorithms that we discuss in Appendix 3A and in later chapters. To solve problems for which the rate contains an explicit time dependence, such as a driven harmonic oscillator (see Section 4.4), we store the time variable in the last element of the state array. Thus, for one particle in one dimension, the time is stored in state[2]. In this way we can treat all dynamical variables on an equal footing.
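For example, for a projectile moving in two dimensions under gravity, the rates corresponding to the two-dimensional ordering above would be filled in as follows. This fragment is a sketch of the convention only; the arrays state and rate and the constant g are assumed to be defined elsewhere.

rate[0] = state[1]; // dx/dt = vx
rate[1] = 0;        // dvx/dt = 0 (no horizontal force)
rate[2] = state[3]; // dy/dt = vy
rate[3] = -g;       // dvy/dt = -g
rate[4] = 1;        // dt/dt = 1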
Because arrays can be arguments of methods, we need to understand how Java passes variables from the class that calls a method to the method being called. Consider the following method:

public void example(int r, int[] s) {
  r = 20;
  s[0] = 20;
}

What do you expect the output of the following statements to be?

int x = 10;
int[] y = {10}; // array of one element initialized to y[0] = 10
example(x, y);
System.out.println("x = " + x + " y[0] = " + y[0]);

The answer is that the output will be x = 10, y[0] = 20. Java parameters are "passed by value," which means that their values are copied. The method cannot modify the value of the x variable because the method received only a copy of its value. In contrast, when an object or an array is in a method's parameter list, Java passes a copy of the reference to the object or the array. The method can use the reference to read or modify the data in the array or object. For this reason the step method of the ODE solvers, discussed in Section 3.6, does not need to explicitly return an updated state array; it implicitly changes the contents of the state array.

Exercise 3.5. Pass by value

As another example of how Java handles primitive variables differently from arrays and objects, consider the statements

int x = 10;
int y = x;
x = 20;

What is y? Next consider

// declares an array of one element initialized to the value 10
int[] x = {10};
int[] y = x;
x[0] = 20;

What is y[0]?

We are now ready to discuss the classes and interfaces from the Open Source Physics library for solving ordinary differential equations.

3.5 The ODE Interface

To introduce the ODE interface, we again consider the equations of motion for a falling particle. We use a state array ordered as s = (y, v, t), so that the dynamical equations can be written as

ṡ₀ = s₁    (3.7a)
ṡ₁ = -g    (3.7b)
ṡ₂ = 1.    (3.7c)

The ODE interface enables us to encapsulate (3.7) in a class. The interface contains two methods, getState and getRate, as shown in Listing 3.4.

Listing 3.4: The ODE interface.

package org.opensourcephysics.numerics;

public interface ODE {
  public double[] getState();
  public void getRate(double[] state, double[] rate);
}

The getState method returns the state array (s₀, s₁, ..., sₙ). The getRate method evaluates the derivatives using the given state array and stores the result in the rate array (ṡ₀, ṡ₁, ..., ṡₙ). An example of a Java class that implements the ODE interface for a falling particle is shown in Listing 3.5.

Listing 3.5: Example of the implementation of the ODE interface for a falling particle.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.numerics.*;

public class FallingParticleODE implements ODE {
  final static double g = 9.8;
  double[] state = new double[3];

  public FallingParticleODE(double y, double v) {
    state[0] = y;
    state[1] = v;
    state[2] = 0; // initial time
  }

  // required to implement the ODE interface
  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1]; // rate of change of y is v
    rate[1] = -g;
    rate[2] = 1;        // rate of change of time is 1
  }
}

3.6 The ODESolver Interface

There are many possible numerical algorithms for advancing a system of first-order ODEs from an initial state to a final state. The Open Source Physics library defines ODE solvers such as Euler and EulerRichardson, as well as RK4, a fourth-order algorithm that is discussed in Appendix 3A.
You can write additional classes for other algorithms if they are needed. Each of these classes implements the ODESolver interface, which is defined in Listing 3.6.

Listing 3.6: The ODESolver interface. Note the four methods that must be defined.

package org.opensourcephysics.numerics;

public interface ODESolver {
  public void initialize(double stepSize);
  public double step();
  public void setStepSize(double stepSize);
  public double getStepSize();
}

A system of first-order differential equations is now solved by creating an object that implements a particular algorithm and repeatedly invoking the step method for that solver class. The argument of the solver class constructor must be a class that implements the ODE interface. As an example of the use of ODESolver, we again consider the dynamics of a falling particle.

Listing 3.7: A falling particle program that uses an ODESolver.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.numerics.*;

public class FallingParticleODEApp extends AbstractCalculation {
  public void calculate() {
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // creates ball with initial conditions
    FallingParticleODE ball = new FallingParticleODE(y0, v0);
    // note how a particular algorithm is chosen
    ODESolver solver = new Euler(ball);
    // sets time step dt in the solver
    solver.setStepSize(control.getDouble("dt"));
    while(ball.state[0]>0) {
      solver.step();
    }
    control.println("final time = "+ball.state[2]);
    control.println("y = "+ball.state[0]+" v = "+ball.state[1]);
  }

  public void reset() {
    // sets default input values
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // creates a calculation control structure for this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticleODEApp());
  }
}

The ODE classes are located in the numerics package, and thus we need to import this package, as is done in the third statement of FallingParticleODEApp. We declare and instantiate the variables ball and solver in the calculate method. Note that ball, an instance of FallingParticleODE, is the argument of the Euler constructor. The object ball can be an argument because FallingParticleODE implements the ODE interface.

It would be a good idea to look at the source code of the Euler class in the numerics package. The Euler class gets the state of the system using getState and then sends this state to getRate, which stores the rates in the rate array. The state array is then modified using the rate array according to the Euler algorithm. You don't need to know the details, but you can read the step method of the various classes that implement ODESolver if you are interested in how the different algorithms are programmed.

Because FallingParticleODE appears to be more complicated than FallingParticle, you might ask what we have gained. One answer is that it is now much easier to use a different numerical algorithm.
The only modification we need to make is to change the statement

ODESolver solver = new Euler(ball);

to, for example,

ODESolver solver = new EulerRichardson(ball);

We have separated the physics (in this case a freely falling particle) from the implementation of the numerical method.

Exercise 3.6. ODE solvers

Run FallingParticleODEApp and compare your results with our previous implementation of the Euler algorithm in FallingParticleApp. How easy is it to use a different algorithm?

3.7 Effects of Drag Resistance

We have introduced most of the programming concepts that we will use in the remainder of this text. If you are new to programming, you will likely feel a bit confused at this point by all the new concepts and syntax. However, it is not necessary to understand all the details to continue and begin to write your own programs. A prototypical simulation program is given in Listings 3.8 and 3.9. These classes simulate a projectile on the surface of the Earth with no air friction, including a plot of position versus time and an animation of the projectile moving through the air. In the following, we discuss more realistic models that can be simulated by modifying the projectile classes.

Listing 3.8: A simple projectile simulation that is useful as a template for other simulations.

package org.opensourcephysics.sip.ch03;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Projectile implements Drawable, ODE {
  static final double g = 9.8;
  double[] state = new double[5]; // {x, vx, y, vy, t}
  int pixRadius = 6; // pixel radius for drawing of projectile
  EulerRichardson odeSolver = new EulerRichardson(this);

  public void setStepSize(double dt) {
    odeSolver.setStepSize(dt);
  }

  public void step() {
    odeSolver.step(); // does one time step using selected algorithm
  }

  public void setState(double x, double vx, double y, double vy) {
    state[0] = x;
    state[1] = vx;
    state[2] = y;
    state[3] = vy;
    state[4] = 0;
  }

  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1]; // rate of change of x
    rate[1] = 0;        // rate of change of vx
    rate[2] = state[3]; // rate of change of y
    rate[3] = -g;       // rate of change of vy
    rate[4] = 1;        // dt/dt = 1
  }

  public void draw(DrawingPanel drawingPanel, Graphics g) {
    int xpix = drawingPanel.xToPix(state[0]);
    int ypix = drawingPanel.yToPix(state[2]);
    g.setColor(Color.red);
    g.fillOval(xpix-pixRadius, ypix-pixRadius, 2*pixRadius, 2*pixRadius);
    g.setColor(Color.green);
    int xmin = drawingPanel.xToPix(-100);
    int xmax = drawingPanel.xToPix(100);
    int y0 = drawingPanel.yToPix(0);
    // draws a line to represent the ground
    g.drawLine(xmin, y0, xmax, y0);
  }
}

Listing 3.9: A target class for the projectile motion simulation.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
public class ProjectileApp extends AbstractSimulation {
  PlotFrame plotFrame = new PlotFrame("Time", "x,y", "Position versus time");
  Projectile projectile = new Projectile();
  PlotFrame animationFrame = new PlotFrame("x", "y", "Trajectory");

  public ProjectileApp() {
    animationFrame.addDrawable(projectile);
    plotFrame.setXYColumnNames(0, "t", "x");
    plotFrame.setXYColumnNames(1, "t", "y");
  }

  public void initialize() {
    double dt = control.getDouble("dt");
    double x = control.getDouble("initial x");
    double vx = control.getDouble("initial vx");
    double y = control.getDouble("initial y");
    double vy = control.getDouble("initial vy");
    projectile.setState(x, vx, y, vy);
    projectile.setStepSize(dt);
    // estimate of size needed for display
    double size = (vx*vx+vy*vy)/10;
    animationFrame.setPreferredMinMax(-1, size, -1, size);
  }

  public void doStep() {
    // x versus time data added
    plotFrame.append(0, projectile.state[4], projectile.state[0]);
    // y versus time data added
    plotFrame.append(1, projectile.state[4], projectile.state[2]);
    // trajectory data added
    animationFrame.append(0, projectile.state[0], projectile.state[2]);
    projectile.step(); // advances the state by one time step
  }

  public void reset() {
    control.setValue("initial x", 0);
    control.setValue("initial vx", 10);
    control.setValue("initial y", 0);
    control.setValue("initial vy", 10);
    control.setValue("dt", 0.01);
    enableStepsPerDisplay(true);
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new ProjectileApp());
  }
}

The analytic solution for free fall near the Earth's surface, (2.4), is well known, and thus finding a numerical solution is useful only as an introduction to numerical methods. It is not difficult to think of more realistic models of motion near the Earth's surface for which the equations of motion do not have simple analytic solutions. For example, if we take into account the variation of the Earth's gravitational field with distance from the center of the Earth, then the force on a particle is not constant. According to Newton's law of gravitation, the force due to the Earth on a particle of mass m is given by

F = GMm/(R + y)² = GMm/[R²(1 + y/R)²] = mg[1 - 2y/R + ···]    (3.8)

where y is measured from the Earth's surface, R is the radius of the Earth, M is the mass of the Earth, G is the gravitational constant, and g = GM/R².

Problem 3.7. Position-dependent force

Extend FallingParticleODE to simulate the fall of a particle with the position-dependent force law (3.8). Assume that a particle is dropped from a height h with zero initial velocity and compute its impact velocity (speed) when it hits the ground at y = 0. Determine the value of h for which the impact velocity differs by one percent from its value with a constant acceleration g = 9.8 m/s². Take R = 6.37 × 10⁶ m. Make sure that the one percent difference is due to the physics of the force law and not to the accuracy of your algorithm. (A sketch of the modified rate equations follows this problem.)
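A minimal sketch of the getRate modification for Problem 3.7, assuming the state ordering {y, v, t} of FallingParticleODE and an instance variable R for the Earth's radius (our name); it is one possible approach, not the book's solution.

public void getRate(double[] state, double[] rate) {
  double factor = 1+state[0]/R;   // 1 + y/R
  rate[0] = state[1];             // dy/dt = v
  rate[1] = -g/(factor*factor);   // full force law, not the expansion in (3.8)
  rate[2] = 1;                    // dt/dt = 1
}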
For particles near the Earth's surface, a more important modification is to include the drag force due to air resistance. The direction of the drag force F_d(v) is opposite to the velocity of the particle (see Figure 3.1). For a falling body, F_d(v) is upward, as shown in Figure 3.1(b). Hence, the total force F on the falling body can be expressed as

F = -mg + F_d.    (3.9)

Figure 3.1: (a) Coordinate system with y measured positive upward from the ground. (b) The force diagram for upward motion. (c) The force diagram for downward motion.

The velocity dependence of F_d(v) is known theoretically in the limit of very low speeds for small objects. In general, it is necessary to determine the velocity dependence of F_d(v) empirically over a limited range of velocities. One way to obtain the form of F_d(v) is to measure y as a function of t and then compute v(t) by calculating the numerical derivative of y(t). Similarly, we can use v(t) to compute a(t) numerically. From this information it is possible in principle to find the acceleration as a function of v and to extract F_d(v) from (3.9). However, this procedure introduces errors (see Problem 3.8b) because the accuracy of the derivatives will be less than the accuracy of the measured position.

An alternative is to reverse the procedure; that is, assume an explicit form for the v dependence of F_d(v), and use it to solve for y(t). If the calculated values of y(t) are consistent with the experimental values of y(t), then the assumed v dependence of F_d(v) is justified empirically. The two common assumed forms of the velocity dependence of F_d(v) are

F_{1,d}(v) = C₁v    (3.10a)

and

F_{2,d}(v) = C₂v²    (3.10b)

where the parameters C₁ and C₂ depend on the properties of the medium and the shape of the object. In general, (3.10a) and (3.10b) are useful phenomenological expressions that yield approximate results for F_d(v) over a limited range of v.

Because F_d(v) increases as v increases, there is a limiting or terminal velocity (speed) at which the net force on a falling object is zero. This terminal speed can be found from (3.9) and (3.10) by setting F_d = mg and is given by

v_{1,t} = mg/C₁    (linear drag)    (3.11a)
v_{2,t} = (mg/C₂)^{1/2}    (quadratic drag)    (3.11b)

for the linear and quadratic cases, respectively. It is often convenient to express velocities in terms of the terminal velocity. We can use (3.10) and (3.11) to write F_d in the linear and quadratic cases as

F_{1,d} = C₁v_{1,t}(v/v_{1,t}) = mg(v/v_{1,t})    (3.12a)
F_{2,d} = C₂v_{2,t}²(v/v_{2,t})² = mg(v/v_{2,t})².    (3.12b)

Hence, we can write the net force (per unit mass) on a falling object in the convenient forms

F₁(v)/m = -g(1 - v/v_{1,t})    (3.13a)
F₂(v)/m = -g(1 - v²/v_{2,t}²).    (3.13b)

t (s)    Position (m)    t (s)    Position (m)    t (s)    Position (m)
0.2055   0.4188          0.4280   0.3609          0.6498   0.2497
0.2302   0.4164          0.4526   0.3505          0.6744   0.2337
0.2550   0.4128          0.4773   0.3400          0.6990   0.2175
0.2797   0.4082          0.5020   0.3297          0.7236   0.2008
0.3045   0.4026          0.5266   0.3181          0.7482   0.1846
0.3292   0.3958          0.5513   0.3051          0.7728   0.1696
0.3539   0.3878          0.5759   0.2913          0.7974   0.1566
0.3786   0.3802          0.6005   0.2788          0.8220   0.1393
0.4033   0.3708          0.6252   0.2667          0.8466   0.1263

Table 3.1: Results for the vertical fall of a coffee filter. Note that the initial time is not zero. The time difference between measurements is ≈ 0.0247 s. This data is also available in the falling.txt file in the ch03 package.

To determine whether the effects of air resistance are important during the fall of ordinary objects, consider the fall of a pebble of mass m = 10⁻² kg. To a good approximation, the drag force is proportional to v². For a spherical pebble of radius 0.01 m, C₂ is found empirically to be approximately 10⁻² kg/m. From (3.11b) we find the terminal velocity to be about 30 m/s. Because this speed would be achieved by a freely falling body in a vertical fall of approximately 50 m in a time of about 3 s, we expect that the effects of air resistance would be appreciable for comparable times and distances.
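To simulate such a fall, the rate equation for the velocity can be modified to include quadratic drag. The following is a minimal sketch using the form (3.13b) with the state ordering {y, v, t}; the terminal speed variable vt is our name, and (3.13b) as written applies to downward motion, for which the drag force is upward.

public void getRate(double[] state, double[] rate) {
  rate[0] = state[1]; // dy/dt = v
  // quadratic drag, (3.13b); valid for falling motion
  rate[1] = -g*(1-state[1]*state[1]/(vt*vt));
  rate[2] = 1;        // dt/dt = 1
}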
For a spherical pebble of radius 0.01 m, C2 is found empirically to be approximately 10^−4 kg/m. From (3.11b) we find the terminal velocity to be about 30 m/s. Because this speed would be achieved by a freely falling body in a vertical fall of approximately 50 m in a time of about 3 s, we expect that the effects of air resistance would be appreciable for comparable times and distances.

Data often is stored in text files, and it is convenient to be able to read this data into a program for analysis. The ResourceLoader class in the Open Source Physics tools package makes reading these files easy. This class can read many different data types, including images and sound. An example of how to use the ResourceLoader class to read string data is given in DataLoaderApp.

Listing 3.10: Example of the use of the ResourceLoader class to read data into a program.
package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.tools.*;

public class DataLoaderApp {
  public static void main(String[] args) {
    // reads from directory where DataLoaderApp is located
    String fileName = "falling.txt";
    // gets the data file
    Resource res = ResourceLoader.getResource(fileName, DataLoaderApp.class);
    String data = res.getString();
    // split string on newline character
    String[] lines = data.split("\n");
    // extract t-y data from every line (minimal parsing sketch; each line
    // of falling.txt is assumed to hold a time-position pair)
    for(int i = 0, n = lines.length; i < n; i++) {
      String[] numbers = lines[i].trim().split("\\s+");
      double t = Double.parseDouble(numbers[0]);
      double y = Double.parseDouble(numbers[1]);
      System.out.println("t = " + t + "  y = " + y);
    }
  }
}

Figure 3.2: A falling coffee filter does not fall with constant acceleration due to the effects of air resistance. The motion sensor below the filter is connected to a computer which records position data and stores it in a text file.

One way to measure the steady state amplitude called for in Problem 4.8 is to record the maximum displacement each time step:

if(Math.abs(x) > Math.abs(amplitude)) {
  amplitude = Math.abs(x);
  control.println("new amplitude = " + amplitude);
}

(e) Measure the amplitude and phase shift to verify that the steady state behavior of x(t) is given by

x(t) = A(ω) cos(ωt + δ).   (4.19)

The quantity δ is the phase difference between the applied force and the steady state motion. Compute A(ω) and δ(ω) for ω0 = 3, γ = 0.5, and ω = 0, 1.0, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, and 3.4. Choose the initial condition x(t = 0) = 0, v(t = 0) = 0. Repeat the simulation for γ = 3.0, and plot A(ω) and δ(ω) versus ω for the two values of γ. Discuss the qualitative behavior of A(ω) and δ(ω) for the two values of γ. If A(ω) has a maximum, determine the angular frequency ωmax at which the maximum of A occurs. Is the value of ωmax close to the natural angular frequency ω0? Compare ωmax to ω0 and to the frequency of the damped linear oscillator in the absence of an external force.

(f) Compute x(t) and A(ω) for a damped linear oscillator with the amplitude of the external force A0 = 4. How do the steady state results for x(t) and A(ω) compare to the case A0 = 1? Does the transient behavior of x(t) satisfy the same relation as the steady state behavior?

(g) What is the shape of the phase space trajectory for the initial condition x(t = 0) = 1, v(t = 0) = 0? Do you find a different phase space trajectory for other initial conditions?

(h) Why is A(ω = 0) < A(ω) for small ω? Why does A(ω) → 0 for ω ≫ ω0?

(i) Does the mean kinetic energy resonate at the same frequency as does the amplitude? Compute the mean kinetic energy over one cycle once steady state conditions have been reached. Choose ω0 = 3 and γ = 0.5.
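For reference, a minimal sketch of the rate equation for the driven, damped oscillator studied in Problem 4.8 follows. It assumes the ODE interface of Chapter 3 with state {x, v, t} and the equation of motion d^2x/dt^2 = −ω0^2 x − γ dx/dt + A0 cos ωt; the class name and parameter values are illustrative.

import org.opensourcephysics.numerics.ODE;

public class DrivenOscillator implements ODE { // hypothetical class name
  double omega0 = 3, gamma = 0.5; // natural frequency and damping
  double A0 = 1, omega = 2;       // driving amplitude and frequency
  double[] state = {0, 0, 0};     // {x, v, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1];                            // dx/dt = v
    rate[1] = -omega0*omega0*state[0] - gamma*state[1]
              + A0*Math.cos(omega*state[2]);       // dv/dt
    rate[2] = 1;                                   // time rate
  }
}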
In Problem 4.8 we found that the response of the damped harmonic oscillator to an external driving force is linear. For example, if the magnitude of the external force is doubled, then the magnitude of the steady state motion is also doubled. This behavior is a consequence of the linear nature of the equation of motion. When a particle is subject to nonlinear forces, the response can be much more complicated (see Section 6.8).

For many problems, the sinusoidal driving force in (4.18) is not realistic. Another example of an external force can be found by observing someone pushing a child on a swing. Because the force is nonzero for only short intervals of time, this type of force is impulsive. In the following problem, we consider the response of a damped linear oscillator to an impulsive force.

Figure 4.3: A half-wave driving force corresponding to the positive part of a cosine function.

∗Problem 4.9. Response of a damped linear oscillator to nonsinusoidal external forces

(a) Assume a swing can be modeled by a damped linear oscillator. The effect of an impulse is to change the velocity. For simplicity, let the duration of the push equal the time step ∆t. Introduce an integer variable for the number of time steps and use the % operator to ensure that the impulse is nonzero only at the time step associated with the period of the external impulse. Determine the steady state amplitude A(ω) for ω = 1.0, 1.3, 1.4, 1.5, 1.6, 2.5, 3.0, and 3.5. The corresponding period of the impulse is given by T = 2π/ω. Choose ω0 = 3 and γ = 0.5. Are your results consistent with your experience of pushing a swing and with the comparable results of Problem 4.8?

(b) Consider the response to a half-wave external force consisting of the positive part of a cosine function (see Figure 4.3). Compute A(ω) for ω0 = 3 and γ = 0.5. At what values of ω does A(ω) have relative maxima? Is the half-wave cosine driving force equivalent to a sum of cosine functions of different frequencies? For example, does A(ω) have more than one resonance?

(c) Compute the steady state response x(t) to the external force

(1/m)F(t) = 1/π + (1/2) cos t + [2/(3π)] cos 2t − [2/(15π)] cos 4t.   (4.20)

How does a plot of F(t) versus t compare to the half-wave cosine function? Use your results to conjecture a principle of superposition for the solutions to linear equations.

In many of the problems in this chapter, we have asked you to draw a phase space plot for a single oscillator. This plot provides a convenient representation of both the position and velocity. When we study chaotic phenomena, such plots will become almost indispensable (see Chapter 6). Here we will consider an important feature of phase space trajectories for conservative systems. If there are no external forces, the undamped simple harmonic oscillator and undamped pendulum are examples of conservative systems; that is, systems for which the total energy is a constant. In Problems 4.10 and 4.11, we will study two general properties of conservative systems: the nonintersecting nature of their trajectories in phase space and the preservation of area in phase space. These concepts will become more important when we study the properties of conservative systems with more than one degree of freedom.

Figure 4.4: What happens to a given area in phase space for conservative systems?

Problem 4.10. Trajectory of a simple harmonic oscillator in phase space

(a) We explore the phase space behavior of a single harmonic oscillator by simulating N initial conditions simultaneously.
Write a program to simulate N identical simple harmonic oscillators, each of which is represented by a small circle centered at its position and velocity in phase space as shown in Figure 4.4. One way to do so is to adapt the BouncingBallApp class introduced in Section 2.6. Choose N = 16 and consider random initial positions and velocities. Do the phase space trajectories for different initial conditions ever cross? Explain your answer in terms of the uniqueness of trajectories in a deterministic system.

(b) Choose a set of initial conditions that form a rectangle (see Figure 4.4). Does the shape of this area change with time? What happens to the total area in comparison to the original area?

Problem 4.11. Trajectory of a pendulum in phase space

(a) Modify your program from Problem 4.10 so that the phase space trajectories (ω versus θ) of N = 16 pendula with different initial conditions can be compared. Plot several phase space trajectories for different values of the total energy. Are the phase space trajectories closed? Does the shape of the trajectory depend on the total energy?

(b) Choose a set of initial conditions that form a rectangle in phase space and plot the state of each pendulum as a circle. Does the shape of this area change with time? What happens to the total area?

4.5 Electrical Circuit Oscillations

In this section we discuss several electrical analogues of the mechanical systems that we have considered. Although the equations of motion are similar in form, it is convenient to consider electrical circuits separately, because the nature of the questions of interest is somewhat different.

The starting point for electrical circuit theory is Kirchhoff's loop rule, which states that the sum of the voltage drops around a closed path of an electrical circuit is zero. This law is a consequence of conservation of energy, because a voltage drop represents the amount of energy that is lost or gained when a unit charge passes through a circuit element. The relations for the voltage drops across each circuit element are summarized in Table 4.1.

Element     Voltage Drop     Symbol           Units
resistor    VR = IR          resistance R     ohms (Ω)
capacitor   VC = Q/C         capacitance C    farads (F)
inductor    VL = L dI/dt     inductance L     henries (H)

Table 4.1: The voltage drops across the basic electrical circuit elements. Q is the charge (coulombs) on one plate of the capacitor, and I is the current (amperes).

Figure 4.5: A simple series RLC circuit with a voltage source Vs.

Imagine an electrical circuit with an alternating voltage source Vs(t) attached in series to a resistor, inductor, and capacitor (see Figure 4.5). The corresponding loop equation is

VL + VR + VC = Vs(t).   (4.21)

The voltage source term Vs in (4.21) is the emf and is measured in units of volts. If we substitute the relationships shown in Table 4.1, we find

L d^2Q/dt^2 + R dQ/dt + Q/C = Vs(t)   (4.22)

where we have used the definition of current, I = dQ/dt. We see that (4.22) for the series RLC circuit is identical in form to the damped harmonic oscillator (4.17). The analogies between ideal electrical circuits and mechanical systems are summarized in Table 4.2.

Although we are already familiar with (4.22), we first consider the dynamical behavior of an RC circuit described by

R I(t) = R dQ/dt = Vs(t) − Q/C.   (4.23)

Two RC circuits corresponding to (4.23) are shown in Figure 4.6.
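Because (4.23) is a first-order equation, its numerical solution requires only a single rate. The following is a minimal sketch of how such a rate might be written; the class name, state layout {Q, t}, and parameter values are illustrative assumptions, not the book's RCApp program.

import org.opensourcephysics.numerics.ODE;

public class RCCircuit implements ODE { // hypothetical class name
  double R = 1000, C = 1.0e-6;          // illustrative circuit values
  double omega = 2*Math.PI*100;         // source angular frequency
  double[] state = {0, 0};              // {Q, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    double Vs = Math.cos(omega*state[1]); // source voltage Vs(t) = cos(omega t)
    rate[0] = (Vs - state[0]/C)/R;        // (4.23): dQ/dt = (Vs - Q/C)/R
    rate[1] = 1;                          // time rate
  }
}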
Although the loop equation (4.23) is identical regardless of the order of placement of the capacitor and resistor in Figure 4.6, the output voltage measured by the oscilloscope in Figure 4.6 is different. We will see in Problem 4.12 that these circuits act as filters that pass voltage components of certain frequencies while rejecting others.

An advantage of a computer simulation of an electrical circuit is that the measurement of a voltage drop across a circuit element does not affect the properties of the circuit. In fact, digital computers are often used to optimize the design of circuits for special applications.

Electric Circuit            Mechanical System
charge Q                    displacement x
current I = dQ/dt           velocity v = dx/dt
voltage drop                force
inductance L                mass m
inverse capacitance 1/C     spring constant k
resistance R                damping γ

Table 4.2: Analogies between electrical parameters and mechanical parameters.

Figure 4.6: Examples of RC circuits used as low and high pass filters. Which circuit is which?

The RCApp program is not shown here because it is similar to PendulumApp, but this program is available in the Chapter 4 package. The RCApp program simulates an RC circuit with an alternating current (AC) voltage source of the form Vs(t) = cos ωt and plots the time dependence of the charge on the capacitor. You are asked to modify this program in Problem 4.12.

Problem 4.12. Simple filter circuits

(a) Modify the RCApp program to simulate the voltages in an RC filter. Your program should plot the voltage across the resistor VR and the voltage across the source Vs, in addition to the voltage across the capacitor VC. Run this program with R = 1000 Ω and C = 1.0 µF (10^−6 farads). Find the steady state amplitude of the voltage drops across the resistor and across the capacitor as a function of the angular frequency ω of the source voltage Vs = cos ωt. Consider the frequencies f = 10, 50, 100, 160, 200, 500, 1000, 5000, and 10000 Hz. (Remember that ω = 2πf.) Choose ∆t to be no more than 0.0001 s for f = 10 Hz. What is a reasonable value of ∆t for f = 10000 Hz?

(b) The output voltage depends on where the digital oscilloscope is connected. What is the output voltage of the oscilloscope in Figure 4.6a? Plot the ratio of the amplitude of the output voltage to the amplitude of the input voltage as a function of ω. Use a logarithmic scale for ω. What range of frequencies is passed? Does this circuit act as a high pass or a low pass filter? Answer the same questions for the oscilloscope in Figure 4.6b. Use your results to explain the operation of high and low pass filters. Compute the value of the cutoff frequency for which the amplitude of the output voltage drops to 1/√2 (half-power) of the input value. How is the cutoff frequency related to RC?

Figure 4.7: Square wave voltage with period T and unit amplitude.

(c) Plot the voltage drops across the capacitor and resistor as a function of time. The phase difference φ between each voltage drop and the source voltage can be found by finding the time tm between the corresponding maxima of the voltages. Because φ is usually expressed in radians, we have the relation φ/2π = tm/T, where T is the period of the oscillation. What is the phase difference φC between the capacitor and the voltage source and the phase difference φR between the resistor and the voltage source? Do these phase differences depend on ω?
Does the current lead or lag the voltage; that is, do the maxima of VR(t) come before or after the maxima of Vs(t)? What is the phase difference between the capacitor and the resistor? Does the latter difference depend on ω?

(d) Modify your program to find the steady state response of an LR circuit with a source voltage Vs(t) = cos ωt. Let R = 100 Ω and L = 2 × 10^−3 H. Because L/R = 2 × 10^−5 s, it is convenient to measure the time and frequency in units of T0 = L/R. We write t* = t/T0, ω* = ωT0, and rewrite the equation for an LR circuit as

I(t*) + dI(t*)/dt* = (1/R) cos ω*t*.   (4.24)

Because it will be clear from the context, we now simply write t and ω rather than t* and ω*. What is a reasonable value of the step size ∆t? Compute the steady state amplitude of the voltage drops across the inductor and the resistor for the input frequencies f = 10, 20, 30, 35, 50, 100, and 200 Hz. Use these results to explain how an LR circuit can be used as a low pass or a high pass filter. Plot the voltage drops across the inductor and resistor as a function of time and determine the phase differences φR and φL between the resistor and the voltage source and the inductor and the voltage source. Do these phase differences depend on ω? Does the current lead or lag the voltage? What is the phase difference between the inductor and the resistor? Does the latter difference depend on ω?

Problem 4.13. Square wave response of an RC circuit
Modify your program so that the voltage source is a periodic square wave as shown in Figure 4.7. Use a 1.0 µF capacitor and a 3000 Ω resistor. Plot the computed voltage drop across the capacitor as a function of time. Make sure the period of the square wave is long enough so that the capacitor is fully charged during one half-cycle. What is the approximate time dependence of VC(t) while the capacitor is charging (discharging)?

We now consider the steady state behavior of the series RLC circuit shown in Figure 4.5 and represented by (4.22). The response of an electrical circuit is the current rather than the charge on the capacitor. Because we have simulated the analogous mechanical system, we already know much about the behavior of driven RLC circuits. Nonetheless, we will find several interesting features of AC electrical circuits in the following two problems.

Problem 4.14. Response of an RLC circuit

(a) Consider an RLC series circuit with R = 100 Ω, C = 3.0 µF, and L = 2 mH. Modify the simple harmonic oscillator program or the RC filter program to simulate an RLC circuit and compute the voltage drops across the three circuit elements. Assume an AC voltage source of the form V(t) = V0 cos ωt. Plot the current I as a function of time and determine the maximum steady state current Imax for different values of ω. Obtain the resonance curve by plotting Imax(ω) as a function of ω and compute the value of ω at which the resonance curve is a maximum. This value of ω is the resonant frequency.

(b) The sharpness of the resonance curve of an AC circuit is related to the quality factor or Q value. (Q should not be confused with the charge on the capacitor.) The sharper the resonance, the larger the value of Q. Circuits with high Q (and hence a sharp resonance) are useful for tuning circuits in a radio so that only one station is heard at a time. We define Q = ω0/∆ω, where the width ∆ω is the frequency interval between points on the resonance curve Imax(ω) that are √2/2 of Imax at its maximum. Compute Q for the values of R, L, and C given in part (a).
Change the value of R by 10% and compute the corresponding percentage change in Q. What is the corresponding change in Q if L or C is changed by 10%?

(c) Compute the time dependence of the voltage drops across each circuit element for approximately fifteen frequencies ranging from 1/10 to 10 times the resonant frequency. Plot the time dependence of the voltage drops.

(d) The ratio of the amplitude of the sinusoidal source voltage to the amplitude of the current is called the impedance Z of the circuit; that is, Z = Vmax/Imax. This definition of Z is a generalization of the resistance that is defined by the relation V = IR for direct current circuits. Use the plots of part (c) to determine Imax and Vmax for different frequencies and verify that the impedance is given by

Z(ω) = [R^2 + (ωL − 1/ωC)^2]^(1/2).   (4.25)

For what value of ω is Z a minimum? Note that the relation V = IZ holds only for the maximum values of I and V and not for I and V at any time.

(e) Compute the phase difference φR between the voltage drop across the resistor and the voltage source. Consider ω ≪ ω0, ω = ω0, and ω ≫ ω0. Does the current lead or lag the voltage in each case; that is, does the current reach a maximum before or after the voltage? Also compute the phase differences φL and φC and describe their dependence on ω. Do the relative phase differences between VC, VR, and VL depend on ω?

(f) Compute the amplitude of the voltage drops across the inductor and the capacitor at the resonant frequency. How do these voltage drops compare to the voltage drop across the resistor and to the source voltage? Also compare the relative phases of VC and VL at resonance. Explain how an RLC circuit can be used to amplify the input voltage.

4.6 Accuracy and Stability

Now that we have learned how to use numerical methods to find numerical solutions to simple first-order differential equations, we need to develop some practical guidelines to help us estimate the accuracy of the various methods. Because we have replaced a differential equation by a difference equation, our numerical solution is not identically equal to the true solution of the original differential equation, except for special cases. The discrepancy between the two solutions has two causes.

One cause is that computers do not store numbers with infinite precision, but rather to a maximum number of digits that is hardware and software dependent. As we have seen, Java allows the programmer to distinguish between floating point numbers, that is, numbers with decimal points, and integer numbers. Arithmetic with numbers represented by integers is exact, but we cannot solve a differential equation using integer arithmetic. Arithmetic operations involving floating point numbers, such as addition and multiplication, introduce roundoff error. For example, if a computer only stored floating point numbers to two significant figures, the product 2.1 × 3.2 would be stored as 6.7 rather than 6.72. The significance of roundoff errors is that they accumulate as the number of mathematical operations increases. Ideally, we should choose algorithms that do not significantly magnify the roundoff error; for example, we should avoid subtracting numbers that are nearly the same in magnitude.
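A short demonstration of roundoff accumulation follows (an illustrative sketch, not from the book). Because 0.1 has no exact binary representation, repeatedly adding it drifts slightly away from the exact answer.

public class RoundoffApp { // hypothetical class name
  public static void main(String[] args) {
    double sum = 0;
    for(int i = 0; i < 1000000; i++) {
      sum += 0.1; // 0.1 is not exactly representable in binary
    }
    // the accumulated sum is close to, but not exactly, 100000
    System.out.println("sum = " + sum);
  }
}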
The other source of the discrepancy between the true answer and the computed answer is the error associated with the choice of algorithm. This error is called the truncation error. A truncation error would exist even on an idealized computer that stored floating point numbers with infinite precision and hence had no roundoff error. Because the truncation error depends on the choice of algorithm and can be controlled by the programmer, you should be motivated to learn more about numerical analysis and the estimation of truncation errors. However, there is no general prescription for the best algorithm for obtaining numerical solutions of differential equations. We will find in later chapters that the various algorithms have advantages and disadvantages, and the appropriate selection depends on the nature of the solution, which you might not know in advance, and on your objectives. How accurate must the answer be? Over how large an interval do you need the solution? What kind of computer(s) are you using? How much computer time and personal time do you have?

In practice, we usually can determine the accuracy of a numerical solution by reducing the value of ∆t until the numerical solution is unchanged at the desired level of accuracy. Of course, we have to be careful not to make ∆t too small, because too many steps would be required and the computation time and roundoff error would increase.

In addition to accuracy, another important consideration is the stability of an algorithm. As discussed in Appendix 3A, it might happen that the numerical results are very good for short times but diverge from the true solution for longer times. This divergence might occur if small errors in the algorithm are multiplied many times, causing the error to grow geometrically. Such an algorithm is said to be unstable for the particular problem. We consider the accuracy and the stability of the Euler algorithm in Problems 4.15 and 4.16.

Problem 4.15. Accuracy of the Euler algorithm

(a) Use the Euler algorithm to compute the numerical solution of dy/dx = 2x with y = 0 at x = 0 and ∆x = 0.1, 0.05, 0.025, 0.01, and 0.005. Make a table showing the difference between the exact solution and the numerical solution. Is the difference between these solutions a decreasing function of ∆x? That is, if ∆x is decreased by a factor of two, how does the difference change? Plot the difference as a function of ∆x. If your points fall approximately on a straight line, then the difference is proportional to ∆x (for ∆x ≪ 1). The numerical method is called nth order if the difference between the analytic solution and the numerical solution is proportional to (∆x)^n for a fixed value of x. What is the order of the Euler algorithm? (A minimal Euler iteration is sketched after this problem.)

(b) One way to determine the accuracy of a numerical solution is to repeat the calculation with a smaller step size and compare the results. If the two calculations agree to p decimal places, we can reasonably assume that the results are correct to p decimal places. What value of ∆x is necessary for 0.1% accuracy at x = 2? What value of ∆x is necessary for 0.1% accuracy at x = 4?
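The sketch below shows the bare Euler iteration for part (a) with one choice of ∆x; the class name is illustrative. Comparing the printed value with the exact y(2) = 4 exhibits the error that part (a) asks you to tabulate.

public class EulerErrorApp { // hypothetical class name
  public static void main(String[] args) {
    double dx = 0.05;                 // step size; vary as in part (a)
    double x = 0, y = 0;              // initial condition y(0) = 0
    int n = (int) Math.round(2.0/dx); // number of steps to reach x = 2
    for(int i = 0; i < n; i++) {
      y += 2*x*dx;                    // Euler step for dy/dx = 2x
      x += dx;
    }
    System.out.println("y(2) = " + y + ", error = " + (4.0 - y));
  }
}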
Problem 4.16. Stability of the Euler algorithm

(a) Consider the differential equation (4.23) with Q = 0 at t = 0. This equation represents the charging of a capacitor in an RC circuit with a constant applied voltage V. Choose R = 2000 Ω, C = 10^−6 farads, and V = 10 volts. Do you expect Q(t) to increase with t? Does Q(t) increase indefinitely, or does it reach a steady-state value? Use a program to solve (4.23) numerically using the Euler algorithm. What value of ∆t is necessary to obtain three decimal accuracy at t = 0.005?

(b) What is the nature of your numerical solution to (4.23) at t = 0.05 for ∆t = 0.005, 0.0025, and 0.001? Does a small change in ∆t lead to a large change in the computed value of Q? Is the Euler algorithm stable for reasonable values of ∆t?

4.7 Projects

Project 4.17. Chemical oscillations

The kinetics of chemical reactions can be modeled by a system of coupled first-order differential equations. As an example, consider the following reaction:

A + 2B → 3B + C   (4.26)

where A, B, and C represent the concentrations of three different types of molecules. The corresponding rate equations for this reaction are

dA/dt = −kAB^2   (4.27a)
dB/dt = kAB^2   (4.27b)
dC/dt = kAB^2.   (4.27c)

The rate at which the reaction proceeds is determined by the reaction constant k. The terms on the right-hand side of (4.27) are positive if the concentration of the molecule increases in (4.26), as it does for B and C, and negative if the concentration decreases, as it does for A. Note that the term 2B in the reaction (4.26) appears as B^2 in the rate equation (4.27). In (4.27) we have assumed that the reactants are well stirred, so that there are no spatial inhomogeneities. In Section 7.8 we will discuss the effects of spatial inhomogeneities due to molecular diffusion.

Most chemical reactions proceed to equilibrium, where the mean concentrations of all molecules are constant. However, if the concentrations of some molecules are replenished, it is possible to observe oscillations and chaotic behavior (see Chapter 6). To obtain oscillations, it is essential to have a series of chemical reactions such that the products of some reactions are the reactants of others. In the following, we consider a simple set of reactions that can lead to oscillations under certain conditions (see Lefever and Nicolis):

A → X   (4.28a)
B + X → Y + D   (4.28b)
2X + Y → 3X   (4.28c)
X → C.   (4.28d)

If we assume that the reverse reactions are negligible and A and B are held constant by an external source, the corresponding rate equations are

dX/dt = A − (B + 1)X + X^2 Y   (4.29a)
dY/dt = BX − X^2 Y.   (4.29b)

For simplicity, we have chosen the rate constants to be unity.

(a) The steady state solution of (4.29) can be found by setting dX/dt and dY/dt equal to zero. Show that the steady state values for (X,Y) are (A, B/A).

(b) Write a program to solve numerically the rate equations given by (4.29). Your program should input the initial values of X and Y and the fixed concentrations A and B, and plot X versus Y as the reactions evolve. (A sketch of the rate equations appears after this project.)

(c) Systematically vary the initial values of X and Y for given values of A and B. Are their steady state behaviors independent of the initial conditions?

(d) Let the initial value of (X,Y) equal (A + 0.001, B/A) for several different values of A and B; that is, choose initial values close to the steady state values. Classify which initial values result in steady state behavior (stable) and which ones show periodic behavior (unstable). Find the relation between A and B that separates the two types of behavior.
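A minimal sketch of the rate equations (4.29) in the ODE form used throughout this chapter; the class name and the initial values are illustrative assumptions.

import org.opensourcephysics.numerics.ODE;

public class ChemicalOscillator implements ODE { // hypothetical class name
  double A = 1, B = 3;        // fixed concentrations
  double[] state = {1, 1, 0}; // {X, Y, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    double X = state[0], Y = state[1];
    rate[0] = A - (B + 1)*X + X*X*Y; // (4.29a): dX/dt
    rate[1] = B*X - X*X*Y;           // (4.29b): dY/dt
    rate[2] = 1;                     // time rate
  }
}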
Project 4.18. Nerve impulses

In 1952 Hodgkin and Huxley developed a model of nerve impulses to understand the nerve membrane potential of a giant squid nerve cell. The equations they developed are known as the Hodgkin-Huxley equations. The idea is that a membrane can be treated as a capacitor where CV = q, and thus the time rate of change of the membrane potential V is proportional to the current dq/dt flowing through the membrane. This current is due to the pumping of sodium and potassium ions through the membrane, a leakage current, and an external current stimulus. The model is capable of producing single nerve impulses, trains of nerve impulses, and other effects. The model is described by the following first-order differential equations:

C dV/dt = −gK n^4 (V − VK) − gNa m^3 h (V − VNa) − gL (V − VL) + Iext(t)   (4.30a)
dn/dt = αn(1 − n) − βn n   (4.30b)
dm/dt = αm(1 − m) − βm m   (4.30c)
dh/dt = αh(1 − h) − βh h   (4.30d)

where V is the membrane potential in millivolts (mV); n, m, and h are time dependent functions that describe the gates that pump ions into or out of the cell; C is the membrane capacitance per unit area; the gi are the conductances per unit area for potassium, sodium, and the leakage current; the Vi are the equilibrium potentials for each of the currents; and the αj and βj are nonlinear functions of V. We use the notation n, m, and h for the gate functions because this notation is universally used in the literature. These gate functions are empirical attempts to describe how the membrane controls the flow of ions into and out of the nerve cell. Hodgkin and Huxley found the following empirical forms for αj and βj:

αn = 0.01(V + 10)/[e^(1+V/10) − 1]   (4.31a)
βn = 0.125 e^(V/80)   (4.31b)
αm = 0.1(V + 25)/[e^(2.5+V/10) − 1]   (4.31c)
βm = 4 e^(V/18)   (4.31d)
αh = 0.07 e^(V/20)   (4.31e)
βh = 1/[e^(3+V/10) + 1].   (4.31f)

The values of the parameters are C = 1.0 µF/cm^2, gK = 36 mmho/cm^2, gNa = 120 mmho/cm^2, gL = 0.3 mmho/cm^2, VK = 12 mV, VNa = −115 mV, and VL = 10.6 mV. The unit mho represents ohm^−1, and the unit of time is milliseconds (ms). These parameters assume that the resting potential of the nerve cell is zero; however, we now know that the resting potential is about −70 mV.

We can use the ODE solver to solve (4.30) with the state vector {V, n, m, h, t}; the rates are given by the right-hand side of (4.30).
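A sketch of what such an implementation might look like follows; the class name and the initial gate values are illustrative assumptions, and the α and β expressions transcribe (4.31).

import org.opensourcephysics.numerics.ODE;

public class HodgkinHuxley implements ODE { // hypothetical class name
  final static double C = 1.0, gK = 36, gNa = 120, gL = 0.3;
  final static double VK = 12, VNa = -115, VL = 10.6;
  double Iext = 0;                         // external current stimulus
  double[] state = {0, 0.3, 0.05, 0.6, 0}; // {V, n, m, h, t}; illustrative start

  public double[] getState() { return state; }

  public void getRate(double[] s, double[] rate) {
    double V = s[0], n = s[1], m = s[2], h = s[3];
    rate[0] = (-gK*n*n*n*n*(V - VK) - gNa*m*m*m*h*(V - VNa)
               - gL*(V - VL) + Iext)/C;                    // (4.30a)
    rate[1] = an(V)*(1 - n) - 0.125*Math.exp(V/80)*n;      // (4.30b)
    rate[2] = am(V)*(1 - m) - 4*Math.exp(V/18)*m;          // (4.30c)
    rate[3] = 0.07*Math.exp(V/20)*(1 - h)
              - h/(Math.exp(3 + V/10) + 1);                // (4.30d)
    rate[4] = 1;                                           // time rate
  }

  static double an(double V) { return 0.01*(V + 10)/(Math.exp(1 + V/10) - 1); }
  static double am(double V) { return 0.1*(V + 25)/(Math.exp(2.5 + V/10) - 1); }
}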
The following questions ask you to explore the properties of the model.

(a) Write a program to plot n, m, and h as a function of V in the steady state (for which ṅ = ṁ = ḣ = 0). Describe how these gates are operating.

(b) Write a program to simulate the nerve cell membrane potential and plot V(t). You can use a simple Euler algorithm with a time step of 0.01 ms. Describe the behavior of the potential when the external current is 0.

(c) Consider a current that is zero except for a one millisecond interval. Try a current spike amplitude of 7 µA (that is, the external current equals 7 in our units). Describe the resulting nerve impulse V(t). Is there a threshold value for the current below which there is no large spike but only a broad peak?

(d) A constant current should produce a train of spikes. Try different amplitudes for the current and determine if there is a threshold current and how the spacing between spikes depends on the amplitude of the external current.

(e) Consider a situation where there is a steady external current I1 for 20 ms and then the current increases to I2 = I1 + ∆I. There are three types of behavior depending on I2 and ∆I. Describe the behavior for the following four situations: (1) I1 = 2.0 µA, ∆I = 1.5 µA; (2) I1 = 2.0 µA, ∆I = 5.0 µA; (3) I1 = 7.0 µA, ∆I = 1.0 µA; and (4) I1 = 7.0 µA, ∆I = 4.0 µA. Try other values of I1 and ∆I as well. In which cases do you obtain a steady spike train? Which cases produce a single spike? What other behavior do you find?

(f) Once a spike is triggered, it is frequently difficult to trigger another spike. Consider a current pulse at t = 20 ms of 7 µA that lasts for one millisecond. Then give a second current pulse of the same amplitude and duration at t = 25 ms. What happens? What happens if you add a third pulse at 30 ms?

References and Suggestions for Further Reading

F. S. Acton, Numerical Methods That Work (The Mathematical Association of America, 1999), Chapter 5.

G. L. Baker and J. P. Gollub, Chaotic Dynamics: An Introduction, 2nd ed. (Cambridge University Press, 1996). A good introduction to the notion of phase space.

Eugene I. Butikov, "Square-wave excitation of a linear oscillator," Am. J. Phys. 72, 469–476 (2004).

A. Douglas Davis, Classical Mechanics (Saunders College Publishing, 1986). The author gives simple numerical solutions of Newton's equations of motion. Much emphasis is given to the harmonic oscillator problem.

S. Eubank, W. Miner, T. Tajima, and J. Wiley, "Interactive computer simulation and analysis of Newtonian dynamics," Am. J. Phys. 57, 457–463 (1989).

Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics, Vol. 1 (Addison-Wesley, 1963). Chapters 21 and 23–25 are devoted to various aspects of harmonic motion.

A. P. French, Newtonian Mechanics (W. W. Norton & Company, 1971). An introductory level text with a good discussion of oscillatory motion.

M. Gitterman, "Classical harmonic oscillator with multiplicative noise," Physica A 352, 309–334 (2005). The analysis is analytical and at the graduate level. However, it would be straightforward to reproduce most of the results after you learn about random processes in Chapter 7.

A. L. Hodgkin and A. F. Huxley, "A quantitative description of ion currents and its applications to conduction and excitation in nerve membranes," J. Physiol. (Lond.) 117, 500–544 (1952).

Charles Kittel, Walter D. Knight, and Malvin A. Ruderman, Mechanics, 2nd ed., revised by A. Carl Helmholz and Burton J. Moyer (McGraw-Hill, 1973).

R. Lefever and G. Nicolis, "Chemical instabilities and sustained oscillations," J. Theor. Biol. 30, 267 (1971).

Jerry B. Marion and Stephen T. Thornton, Classical Dynamics, 5th ed. (Harcourt, 2004). Excellent discussion of linear and nonlinear oscillators.

M. F. McInerney, "Computer-aided experiments with the damped harmonic oscillator," Am. J. Phys. 53, 991–996 (1985).

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). Chapter 16 discusses the integration of ordinary differential equations.

Scott Hamilton, An Analog Electronics Companion (Cambridge University Press, 2003). A good discussion of the physics and mathematics of basic circuit design, including an extensive introduction to circuit simulation using the PSpice simulation program.

S. C. Zilio, "Measurement and analysis of large-angle pendulum motion," Am. J. Phys. 50, 450–452 (1982).

Chapter 5
Few-Body Problems: The Motion of the Planets

We apply Newton's laws of motion to planetary motion and other systems of a few particles and explore some of the counterintuitive consequences of Newton's laws.

5.1 Planetary Motion

Planetary motion is of special significance because it played an important role in the conceptual history of the mechanical view of the universe. Few theories have affected Western civilization as much as Newton's laws of motion and the law of gravitation, which together relate the motion of the heavens to the motion of terrestrial bodies.
Much of our knowledge of planetary motion is summarized by Kepler's three laws, which can be stated as:

1. Each planet moves in an elliptical orbit with the sun located at one of the foci of the ellipse.

2. The speed of a planet increases as its distance from the sun decreases, such that the line from the sun to the planet sweeps out equal areas in equal times.

3. The ratio T^2/a^3 is the same for all planets that orbit the sun, where T is the period of the planet and a is the semimajor axis of the ellipse.

Kepler obtained these laws by a careful analysis of the observational data collected over many years by Tycho Brahe. Kepler's first and third laws describe the shape of the orbit rather than the time dependence of the position and velocity of a planet. Because it is not possible to obtain this time dependence in terms of elementary functions, we will obtain the numerical solution of the equations of motion of planets and satellites in orbit. In addition, we will consider the effects of perturbing forces on the orbit and problems that challenge our intuitive understanding of Newton's laws of motion.

5.2 The Equations of Motion

The motion of the Sun and Earth is an example of a two-body problem. We can reduce this problem to a one-body problem in one of two ways. The easiest way is to use the fact that the mass of the Sun is much greater than the mass of the Earth. Hence we can assume that, to a good approximation, the Sun is stationary and is a convenient choice of the origin of our coordinate system. If you are familiar with the concept of a reduced mass, you know that the reduction to a one-body problem is more general. That is, the motion of two objects of mass m and M, whose total potential energy is a function only of their relative separation, can be reduced to an equivalent one-body problem for the motion of an object of reduced mass µ given by

µ = Mm/(m + M).   (5.1)

Because the mass of the Earth, m = 5.99 × 10^24 kg, is so much smaller than the mass of the Sun, M = 1.99 × 10^30 kg, we find that for most practical purposes the reduced mass of the Sun and the Earth is that of the Earth alone. In the following, we consider the problem of a single particle of mass m moving about a fixed center of force, which we take as the origin of the coordinate system.

Newton's universal law of gravitation states that a particle of mass M attracts another particle of mass m with a force given by

F = −(GMm/r^2) r̂ = −(GMm/r^3) r   (5.2)

where the vector r is directed from M to m (see Figure 5.1). The negative sign in (5.2) implies that the gravitational force is attractive; that is, it tends to decrease the separation r. The gravitational constant G is determined experimentally to be

G = 6.67 × 10^−11 m^3/(kg · s^2).   (5.3)

The force law (5.2) applies to the motion of the center of mass for objects of negligible spatial extent. Newton delayed publication of his law of gravitation for twenty years while he invented integral calculus and showed that (5.2) also applies to any uniform sphere or spherical shell of matter if the distance r is measured from the center of each mass.

The gravitational force has two general properties: its magnitude depends only on the separation of the particles, and its direction is along the line joining the particles. Such a force is called a central force.
The assumption of a central force implies that the orbit of the Earth is restricted to a plane (x-y) and that the angular momentum L is conserved and lies in the third (z) direction. We write Lz in the form

Lz = (r × mv)z = m(x vy − y vx)   (5.4)

where we have used the cross-product definition L = r × p and p = mv. An additional constraint on the motion is that the total energy E is conserved and is given by

E = (1/2)mv^2 − GMm/r.   (5.5)

Figure 5.1: An object of mass m moves under the influence of a central force F. Note that cos θ = x/r and sin θ = y/r, which provide useful relations for writing the equations of motion in component form suitable for numerical solutions.

If we fix the coordinate system at the mass M, the equation of motion of the particle of mass m is

m d^2r/dt^2 = −(GMm/r^3) r.   (5.6)

It is convenient to write the force in Cartesian coordinates (see Figure 5.1):

Fx = −(GMm/r^2) cos θ = −(GMm/r^3) x   (5.7a)
Fy = −(GMm/r^2) sin θ = −(GMm/r^3) y.   (5.7b)

Hence, the equations of motion in Cartesian coordinates are

d^2x/dt^2 = −(GM/r^3) x   (5.8a)
d^2y/dt^2 = −(GM/r^3) y   (5.8b)

where r^2 = x^2 + y^2. Equations (5.8a) and (5.8b) are examples of coupled differential equations because each equation contains both x and y.

5.3 Circular and Elliptical Orbits

Because many planetary orbits are nearly circular, it is useful to obtain the condition for a circular orbit. The magnitude of the acceleration a is related to the radius r of the circular orbit by

a = v^2/r   (5.9)

where v is the speed of the object. The acceleration is always directed toward the center and is due to the gravitational force. Hence, we have

mv^2/r = GMm/r^2   (5.10)

and

v = (GM/r)^(1/2).   (5.11)

The relation (5.11) between the radius and the speed is the general condition for a circular orbit. We can also find the dependence of the period T on the radius of a circular orbit using the relation

T = 2πr/v   (5.12)

in combination with (5.11) to obtain

T^2 = (4π^2/GM) r^3.   (5.13)

The relation (5.13) is a special case of Kepler's third law with the radius r corresponding to the semimajor axis of an ellipse.

A simple geometrical characterization of an elliptical orbit is shown in Figure 5.2.

Figure 5.2: The characterization of an ellipse in terms of the semimajor axis a and the eccentricity e. The semiminor axis b is the distance OB. The origin O in Cartesian coordinates is at the center of the ellipse.

The two foci of an ellipse, F1 and F2, have the property that for any point P, the distance F1P + F2P is a constant. In general, an ellipse has two perpendicular axes of unequal length. The longer axis is the major axis; half of this axis is the semimajor axis a. The shorter axis is the minor axis; the semiminor axis b is half of this distance. It is common to specify an elliptical orbit by a and by the eccentricity e, where e is the ratio of the distance between the foci to the length of the major axis. Because F1P + F2P = 2a, it is easy to show that

e = (1 − b^2/a^2)^(1/2)   (5.14)

with 0 < e < 1. (Choose the point P at x = 0, y = b.) A special case is b = a, for which the ellipse reduces to a circle and e = 0.
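Before choosing more convenient units, it is worth checking (5.11) and (5.12) numerically for the Earth's nearly circular orbit. The fragment below is an illustrative sketch using the masses quoted in Section 5.2 and the Earth-Sun distance quoted in Section 5.4; it reproduces the Earth's orbital speed and a period of about one year.

public class CircularOrbitCheck { // hypothetical class name
  public static void main(String[] args) {
    double G = 6.67e-11;         // m^3/(kg s^2), from (5.3)
    double M = 1.99e30;          // mass of the Sun (kg)
    double r = 1.496e11;         // Earth-Sun distance (m)
    double v = Math.sqrt(G*M/r); // (5.11): about 3.0e4 m/s
    double T = 2*Math.PI*r/v;    // (5.12): about 3.2e7 s, one year
    System.out.println("v = " + v + " m/s, T = " + T + " s");
  }
}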
5.4 Astronomical Units

It is convenient to choose a system of units in which the magnitude of the product GM is not too large and not too small. To describe the Earth's orbit, the convention is to choose the length of the Earth's semimajor axis as the unit of length. This unit of length is called the astronomical unit (AU) and is

1 AU = 1.496 × 10^11 m.   (5.15)

The unit of time is taken to be one year, or 3.15 × 10^7 s. In these units, the period of the Earth is T = 1 year and its semimajor axis is a = 1 AU. Hence, from (5.13),

GM = 4π^2 a^3/T^2 = 4π^2 AU^3/yr^2   (astronomical units).   (5.16)

As an example of the use of astronomical units, a program distance of 1.5 would correspond to 1.5 × (1.496 × 10^11) = 2.244 × 10^11 m.

5.5 Log-log and Semilog Plots

The values of T and a for our solar system are given in Table 5.1. We first analyze these values and determine if T and a satisfy a simple mathematical relationship.

Suppose we wish to determine whether two variables y and x satisfy a functional relationship, y = f(x). To simplify the analysis, we ignore possible errors in the measurements of y and x. The simplest relation between y and x is linear; that is, y = mx + b. The existence of such a relation can be seen by plotting y versus x and finding if the plot is linear. From Table 5.1 we see that T is not a linear function of a. For example, an increase in T from 0.24 to 1, a factor of approximately 4, corresponds to an increase in a by a factor of only about 2.5.

For many problems, it is reasonable to assume an exponential relation

y = C e^(rx)   (5.17)

or a power law relation

y = C x^n   (5.18)

where C, r, and n are unknown parameters. If we assume the exponential form (5.17), we can take the natural logarithm of both sides to find

ln y = ln C + rx.   (5.19)

Hence, if (5.17) is applicable, a plot of ln y versus x would yield a straight line with slope r and intercept ln C. The natural logarithm of both sides of the power law relation (5.18) yields

ln y = ln C + n ln x.   (5.20)

If (5.18) applies, a plot of ln y versus ln x yields the exponent n (the slope), which is the usual quantity of physical interest if a power law dependence holds.

Planet     T (Earth years)   a (AU)
Mercury    0.241             0.387
Venus      0.615             0.723
Earth      1.0               1.0
Mars       1.88              1.523
Jupiter    11.86             5.202
Saturn     29.5              9.539
Uranus     84.0              19.18
Neptune    165               30.06
Pluto      248               39.44

Table 5.1: The period T and semimajor axis a of the planets. The unit of length is the astronomical unit (AU). The unit of time is one (Earth) year.

We illustrate a simple analysis of the data in Table 5.1. Because we expect that the relation between T and a has the power law form T = Ca^n, we plot ln T versus ln a (see Figure 5.3). A visual inspection of the plot indicates that a linear relationship between ln T and ln a is reasonable and that the slope is approximately 1.50, in agreement with Kepler's third law. In Chapter 7 we will discuss the least squares method for fitting a straight line through a number of data points. With a little practice, you can do a visual analysis that is nearly as good.

The PlotFrame class contains the axes and titles needed to produce linear, log-log, and semilog plots. It also contains the methods needed to display data in a table format. This table can be displayed programmatically or by right-clicking (control-clicking) at runtime. Listing 5.1 shows a short program that produces a log-log plot of the orbital period of the planets versus the semimajor axis. The arrays a and period contain the semimajor axis of the planets and their periods, respectively.
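Before plotting, a quick two-point estimate of the exponent (an illustrative check using the Mercury and Pluto rows of Table 5.1; the class name is hypothetical) already gives a slope close to 1.5:

public class SlopeEstimate { // hypothetical helper
  public static void main(String[] args) {
    // two-point estimate of n in T = C a^n from the Mercury and Pluto data
    double n = Math.log(248/0.241)/Math.log(39.44/0.387);
    System.out.println("n = " + n); // approximately 1.5
  }
}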
Setting the log scale option causes the PlotFrame to transform the data as it is being plotted and causes the axis to change how labels are rendered. Note that the plot automatically adjusts itself to fit the data because the autoscale option is true by default. Also, the grid and the tick-labels change as the window is resized.

Listing 5.1: A simple program that produces a log-log plot to demonstrate Kepler's third law.
package org.opensourcephysics.sip.ch05;
import org.opensourcephysics.frames.PlotFrame;

public class SecondLawPlotApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("ln(a)", "ln(T)", "Kepler's third law");
    frame.setLogScale(true, true);
    frame.setConnected(false);
    double[] period = {0.241, 0.615, 1.0, 1.88, 11.86, 29.50, 84.0, 165, 248};
    double[] a = {0.387, 0.723, 1.0, 1.523, 5.202, 9.539, 19.18, 30.06, 39.44};
    frame.append(0, a, period);
    frame.setVisible(true);
    // defines titles of table columns
    frame.setXYColumnNames(0, "a (AU)", "T (years)");
    // shows data table; can also be done from frame menu
    frame.showDataTable(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

Figure 5.3: Plot of ln T versus ln a using the data in Table 5.1. Verify that the slope is 1.50.

x      y1(x)    y2(x)   y3(x)
0      0.00     0.00    2.00
0.5    0.75     1.59    5.44
1.0    3.00     2.00    14.78
1.5    6.75     2.29    40.17
2.0    12.00    2.52    109.20
2.5    18.75    2.71    296.83

Table 5.2: Determine the functional forms of y(x) for the three sets of data. There are no measurement errors, but there are roundoff errors.

Exercise 5.1. Simple functional forms

(a) Run SecondLawPlotApp and convince yourself that you understand the syntax.

(b) Modify SecondLawPlotApp so that the three sets of data shown in Table 5.2 are plotted. Generate linear, semilog, and log-log plots to determine the functional form of y(x) that best fits each data set.

5.6 Simulation of the Orbit

We now develop a program to simulate the Earth's orbit about the Sun. The PlanetApp class shown in Listing 5.2 organizes the startup process and creates the visualization. Because this class extends AbstractSimulation, it is sufficient to know that the superclass invokes the doStep method periodically when the thread is running or once each time the Step button is clicked. The preferred scale and the aspect ratio for the plot frame are set in the constructor. The statement frame.setSquareAspect(true) ensures that a unit of distance will equal the same number of pixels in both the horizontal and vertical directions. The statement planet.initialize(new double[]{x, vx, y, vy, 0}) in the initialize method creates an array on the fly as the argument to another method.

Listing 5.2: PlanetApp.
package org.opensourcephysics.sip.ch05;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class PlanetApp extends AbstractSimulation {
  PlotFrame frame = new PlotFrame("x (AU)", "y (AU)", "Planet Simulation");
  Planet planet = new Planet();

  public PlanetApp() {
    frame.addDrawable(planet);
    frame.setPreferredMinMax(-5, 5, -5, 5);
    frame.setSquareAspect(true);
  }

  public void doStep() {
    for(int i = 0; i < 5; i++) { // do 5 steps between screen draws
      planet.doStep();           // advances time
    }
    frame.setMessage("t = " + decimalFormat.format(planet.state[4]));
  }

  public void initialize() {
    planet.odeSolver.setStepSize(control.getDouble("dt"));
    double x = control.getDouble("x");
    double vx = control.getDouble("vx");
    double y = control.getDouble("y");
    double vy = control.getDouble("vy");
    // create an array on the fly as the argument to another method
    planet.initialize(new double[] {x, vx, y, vy, 0});
    frame.setMessage("t = 0");
  }

  public void reset() {
    control.setValue("x", 1);
    control.setValue("vx", 0);
    control.setValue("y", 0);
    control.setValue("vy", 6.28);
    control.setValue("dt", 0.01);
    initialize();
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new PlanetApp());
  }
}

The Planet class in Listing 5.3 defines the physics and instantiates the numerical method. The latter is the Euler algorithm, which will be replaced in Problem 5.2. Note how the argument to the initialize method is used. The System.arraycopy(array1, index1, array2, index2, length) method in the core Java API copies blocks of memory, such as arrays, and is optimized for particular operating systems. This method copies length elements of array1 starting at index1 into array2 starting at index2. In most applications index1 and index2 will be set equal to 0.

Listing 5.3: Class that models the rate equation for a planet acted on by an inverse square law force.
package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Planet implements Drawable, ODE {
  // GM in units of (AU)^3/(yr)^2
  final static double GM = 4*Math.PI*Math.PI;
  Circle circle = new Circle();
  Trail trail = new Trail();
  double[] state = new double[5];    // {x, vx, y, vy, t}
  Euler odeSolver = new Euler(this); // creates numerical method

  public void doStep() {
    odeSolver.step();                   // advances time
    trail.addPoint(state[0], state[2]); // x, y
  }

  void initialize(double[] initState) {
    System.arraycopy(initState, 0, state, 0, initState.length);
    // reinitializes the solver in case the solver accesses data
    // from previous steps
    odeSolver.initialize(odeSolver.getStepSize());
    trail.clear();
  }

  public void getRate(double[] state, double[] rate) {
    // state[]: x, vx, y, vy, t
    double r2 = (state[0]*state[0]) + (state[2]*state[2]); // r squared
    double r3 = r2*Math.sqrt(r2);                          // r cubed
    rate[0] = state[1];          // x rate
    rate[1] = (-GM*state[0])/r3; // vx rate
    rate[2] = state[3];          // y rate
    rate[3] = (-GM*state[2])/r3; // vy rate
    rate[4] = 1;                 // time rate
  }

  public double[] getState() {
    return state;
  }

  public void draw(DrawingPanel panel, Graphics g) {
    circle.setXY(state[0], state[2]);
    circle.draw(panel, g);
    trail.draw(panel, g);
  }
}
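Problem 5.2 will ask you to compare differential equation solvers. Because Planet talks to its solver only through the methods of the ODESolver interface, swapping algorithms is essentially a one-line change; the following is a sketch, assuming the RK4 class in the org.opensourcephysics.numerics package:

// in Planet, replace
//   Euler odeSolver = new Euler(this);
// with, for example,
ODESolver odeSolver = new RK4(this); // fourth-order Runge-Kutta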
The Planet class implements the Drawable interface and defines the draw method as described in Section 3.3. In this case we did not use graphics primitives such as fillOval to perform the drawing. Instead, the method calls the methods circle.draw and trail.draw to draw the planet and its trajectory, respectively. Invoking a method in another object that has the desired functionality is known as forwarding or delegating the method. One advantage of forwarding is that we can change the implementation of the drawing within the Planet class at any time and still be assured that the planet object is drawable. We could, for example, replace the circle by an image of the Earth. Note that we have created a composite object by combining the properties of the simpler circle and trail objects. These techniques of encapsulation and composition are common in object oriented programming.

Problem 5.2. Verification of Planet and PlanetApp for circular orbits

(a) Verify Planet and PlanetApp by considering the special case of a circular orbit. For example, choose (in astronomical units) x(t = 0) = 1, y(t = 0) = 0, and vx(t = 0) = 0. Use the relation (5.11) to find the value of vy(t = 0) that yields a circular orbit. How small a value of ∆t is needed so that a circular orbit is repeated over many periods? Your answer will depend on your choice of differential equation solver. Find the largest value of ∆t that yields an orbit that repeats for many revolutions using the Euler, Euler-Cromer, Verlet, and RK4 algorithms. Is it possible to choose a smaller value of ∆t, or are some algorithms, such as the Euler method, simply not stable for this dynamical system?

(b) Write a method to compute the total energy [see (5.5)] and compute it at regular intervals as the system evolves. (It is sufficient to calculate the energy per unit mass, E/m.) For a given value of ∆t, which algorithm conserves the total energy best? Is it possible to choose a value of ∆t that conserves the energy exactly? What is the significance of the negative sign for the total energy?

(c) Write a separate method to determine the numerical value of the period. (See Problem 3.9c for a discussion of a similar condition.) Choose different sets of values of x(t = 0) and vy(t = 0), consistent with the condition for a circular orbit. For each orbit determine the radius and the period and verify Kepler's third law.

Problem 5.3. Verification of Kepler's second and third laws

(a) Set y(t = 0) = 0 and vx(t = 0) = 0 and find by trial and error several values of x(t = 0) and vy(t = 0) that yield elliptical orbits of a convenient size. Choose a suitable algorithm and plot the speed of the planet as the orbit evolves. Where is the speed a maximum (minimum)?

(b) Use the same initial conditions as in part (a) and compute the total energy, angular momentum, semimajor and semiminor axes, eccentricity, and period for each orbit. Plot your data for the dependence of the period T on the semimajor axis a and verify Kepler's third law. Given the ratio of T^2/a^3 that you found, determine the numerical value of this ratio in SI units for our solar system.

(c) The force center is at (x, y) = (0, 0) and is one focus. Find the second focus by symmetry. Compute the sum of the distances from each point on the orbit to the two foci and verify that the orbit is an ellipse.

(d) According to Kepler's second law, the orbiting object sweeps out equal areas in equal times. If we use an algorithm with a fixed time step ∆t, it is sufficient to compute the area of the triangle swept in each time step.
This area equals one-half the base of the triangle times its height, or (1/2)∆t |r × v| = (1/2)∆t (x vy − y vx). Is this area a constant? This constant corresponds to what physical quantity?

(e)∗ Show that algorithms with a fixed value of ∆t break down if the "planet" is too close to the sun. What is the cause of the failure of the method? What advantage might there be to using a variable time step? What are the possible disadvantages? (See Project 5.19 for an example where a variable time step is very useful.)

Problem 5.4. Noninverse square forces

(a) Consider the dynamical effects of a small change in the attractive inverse-square force law; for example, let the magnitude of the force equal Cm/r^(2+δ), where δ ≪ 1. For simplicity, take the numerical value of the constant C to be 4π^2 as before. Consider the initial conditions x(t = 0) = 1, y(t = 0) = 0, vx(t = 0) = 0, and vy(t = 0) = 5. Choose δ = 0.05 and determine the nature of the orbit. Does the orbit of the planet retrace itself? Verify that your result is not due to your choice of ∆t. Does the planet spiral away from or toward the sun? The path of the planet can be described as an elliptical orbit that slowly rotates or precesses in the same sense as the motion of the planet. A convenient measure of the precession is the angle between successive orientations of the semimajor axis of the ellipse. This angle is the rate of precession per revolution. Estimate the magnitude of this angle for your choice of δ. What is the effect of decreasing the semimajor axis for fixed δ? What is the effect of changing δ for fixed semimajor axis?

(b) Einstein's theory of gravitation (the general theory of relativity) predicts a correction to the force on a planet that varies as 1/r^4 due to a weak gravitational field. The result is that the equation of motion for the trajectory of a particle can be written as

d^2r/dt^2 = −(GM/r^2)[1 + α (GM/c^2)^2 (1/r^2)] r̂   (5.21)

where the parameter α is dimensionless. Take GM = 4π^2 and assume α = 10^−3. Determine the nature of the orbit for this potential. (For our solar system, the constant α is a maximum for the planet Mercury, but is much smaller than 10^−3.)

(c) Suppose that the attractive gravitational force law depends on the inverse cube of the distance, Cm/r^3. What are the units of C? For simplicity, take the numerical value of C to be 4π^2. Consider the initial condition x(t = 0) = 1, y(t = 0) = 0, vx(t = 0) = 0 and determine analytically the value of vy(t = 0) required for a circular orbit. How small a value of ∆t is needed so that the simulation yields a circular orbit over several periods? How does this value of ∆t compare with the value needed for the inverse-square force law?

Figure 5.4: (a) An impulse applied in the tangential direction. (b) An impulse applied in the radial direction.

(d) Vary vy(t = 0) by approximately 2% from the circular orbit condition that you determined in part (c). What is the nature of the new orbit? What is the sign of the total energy? Is the orbit bound? Is it closed? Are all bound orbits closed?

Problem 5.5. Effect of drag resistance on a satellite orbit

Consider a satellite in orbit about the Earth. In this case it is convenient to measure distances in terms of the radius of the Earth, R = 6.37 × 10^6 m, and the time in terms of hours. Because the force on the satellite is proportional to Gm, where m = 5.99 × 10^24 kg is the mass of the Earth, we need to evaluate the product Gm in Earth units (EU).
Problem 5.5. Effect of drag resistance on a satellite orbit
Consider a satellite in orbit about the Earth. In this case it is convenient to measure distances in terms of the radius of the Earth, R = 6.37 × 10⁶ m, and time in terms of hours. Because the force on the satellite is proportional to Gm, where m = 5.99 × 10²⁴ kg is the mass of the Earth, we need to evaluate the product Gm in Earth units (EU). In these units the value of Gm is given by

\[ Gm = 6.67\times10^{-11}\,\frac{\text{m}^3}{\text{kg}\cdot\text{s}^2}\,\left(\frac{1\,\text{EU}}{6.37\times10^{6}\,\text{m}}\right)^{3}\left(3.6\times10^{3}\,\text{s/h}\right)^{2}\left(5.99\times10^{24}\,\text{kg}\right) = 20.0\,\text{EU}^3/\text{h}^2 \quad \text{(Earth units)}. \tag{5.22} \]

Modify the Planet class to incorporate the effects of drag resistance on the motion of an orbiting Earth satellite. Assume that the drag force is proportional to the square of the speed of the satellite. To be able to observe the effects of air resistance in a reasonable time, take the magnitude of the drag force to be approximately one-tenth of the magnitude of the gravitational force. Choose initial conditions such that a circular orbit would be obtained in the absence of drag resistance, and allow at least one revolution before "switching on" the drag resistance. Describe the qualitative change of the orbit due to drag resistance. How do the total energy and the speed of the satellite change with time?

5.7 Impulsive Forces

What happens to the orbit of an Earth satellite when it is hit by space debris? We now discuss the modifications we need to make in Planet and PlanetApp so that we can apply an impulsive force (a kick) with a mouse click. If we apply a vertical kick when the position of the satellite is as shown in Figure 5.4a, the impulse is tangential to the orbit. A radial kick can be applied when the satellite is as shown in Figure 5.4b.

User actions, such as mouse clicks or keyboard entries, are passed from the operating system to Java event listeners. Although this standard Java framework is straightforward, we have simplified it to respond to mouse actions within the Open Source Physics panels and frames (see the Open Source Physics User's Guide for an extensive discussion of interactive drawing panels). In order for an Open Source Physics program to respond to mouse actions, the program implements the InteractiveMouseHandler interface and then registers its ability to process mouse actions with the PlotFrame. This procedure is demonstrated in the following test program. You can copy the handleMouseAction code into your program and replace the print statements with useful methods. Other mouse actions, such as MOUSE_CLICKED, MOUSE_MOVED, and MOUSE_ENTERED, are defined in the InteractivePanel class.

Listing 5.4: InteractiveMouseHandler interface test program.

package org.opensourcephysics.sip.ch05;
import java.awt.event.*;
import javax.swing.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class MouseApp implements InteractiveMouseHandler {
  PlotFrame frame = new PlotFrame("x", "y", "Interactive Handler");

  public MouseApp() {
    frame.setInteractiveMouseHandler(this);
    frame.setVisible(true);
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
  }

  public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
    switch(panel.getMouseAction()) {
      case InteractivePanel.MOUSE_DRAGGED :
        panel.setMessage("Dragged");
        break;
      case InteractivePanel.MOUSE_PRESSED :
        panel.setMessage("Pressed");
        break;
      case InteractivePanel.MOUSE_RELEASED :
        panel.setMessage(null);
        break;
    }
  }

  public static void main(String[] args) {
    new MouseApp();
  }
}

The switch statement is used in Listing 5.4 instead of a chain of if statements. The panel's getMouseAction method returns an integer. If this integer matches one of the named constants following a case label, then the statements following that constant are executed until a break statement is encountered. If a case does not include a break, execution continues with the next case. The equivalent of the else clause of an if statement is the default label, followed by the statements that are executed if none of the explicit cases occurs.
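If we want the mouse press itself to deliver the kick, the handler can change the satellite's velocity directly instead of printing a message. The following sketch is ours, not the text's implementation: it assumes the application keeps a reference planet to a Planet-like object whose state array is ordered {x, vx, y, vy, t}.

public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
  if (panel.getMouseAction() == InteractivePanel.MOUSE_PRESSED) {
    planet.state[3] += 0.1;      // impulse per unit mass added to vy (illustrative size)
    panel.setMessage("kicked");  // confirm the kick in the panel's message box
  }
}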
We now challenge your intuitive understanding of Newton's laws of motion by considering several perturbations of the motion of an orbiting object. Modify your planet program to simulate the effects of the perturbations in Problem 5.6. In each case answer the questions before doing the simulation.

Problem 5.6. Tangential and radial perturbations

(a) Suppose that a small tangential "kick" or impulsive force is applied to a satellite in a circular orbit about the Earth (see Figure 5.4a). Choose Earth units so that the numerical value of the product Gm is given by (5.22). Apply the impulsive force by stopping the program after the satellite has made several revolutions and clicking the mouse to apply the force. Recall that the impulse changes the momentum in the desired direction. In what direction does the orbit change? Is the orbit stable; for example, does a small impulse lead to a small change in the orbit? Does the orbit retrace itself indefinitely if no further perturbations are applied? Describe the shape of the perturbed orbit.

(b) How does the change in the orbit depend on the strength of the kick and its duration?

(c) Determine if the angular momentum and the total energy are changed by the perturbation.

(d) Apply a radial kick to the satellite as in Figure 5.4b and answer the same questions as in parts (a)–(c).

(e) Determine the stability of the inverse-cube force law (see Problem 5.4) to radial and tangential perturbations.

Mouse actions are not the only way to affect the simulation. We can also add custom buttons to the control. These buttons are added when the program is instantiated in the main method.

public static void main(String[] args) {
  // OSPControl is a superclass of SimulationControl
  OSPControl control = SimulationControl.createApp(new PlanetApp());
  control.addButton("doRadialKick", "Kick!", "Perform a radial kick");
}

Note that SimulationControl (and CalculationControl) extend the OSPControl superclass, where the addButton method is defined, and therefore support this method. We assign the value returned by the static createApp method to a variable of type OSPControl to highlight the object-oriented structure of the Open Source Physics library. The first parameter of the addButton method specifies the method that will be invoked when the button is clicked, the second parameter specifies the text label that will appear on the button, and the third parameter specifies the tool tip that will appear when the mouse hovers over the button. Custom buttons can be used for just about anything, but the corresponding method must be defined.

Exercise 5.7. Custom buttons
Use a custom button in Problem 5.6 rather than a mouse click to apply an impulsive force to the planet.

Figure 5.5: The orbit of a particle in velocity space. The vector w points from the origin in velocity space to the center of the circular orbit. The vector u points from the center of the orbit to the point (vx, vy).

5.8 Velocity Space

In Problem 5.6 your intuition might have been incorrect.
For example, you might have thought that the orbit would elongate in the direction of the kick. In fact the orbit does elongate but in a direction perpendicular to the kick. Do not worry; you are in good company! Few students have a good qualitative understanding of Newton’s law of motion, even after taking an introductory course in physics. A qualitative way of stating Newton’s second law is Forces act on the trajectories of particles by changing velocity, not position. If we fail to take into account this property of Newton’s second law, we will encounter physical situations that appear counterintuitive. Because force acts to change velocity, it is reasonable to consider both velocity and position on an equal basis. In fact position and momentum are treated in such a manner in advanced formulations of classical mechanics and in quantum mechanics. In Problem 5.8 we explore some of the properties of orbits in velocity space in the context of the bound motion of a particle in an inverse-square force. Modify your program so that the path in velocity space of the Earth is plotted. That is, plot the point (vx,vy) the same way you plotted the point (x,y). The path in velocity space is a series of successive values of the object’s velocity vector. If the position space orbit is an ellipse, what is the shape of the orbit in velocity space? Problem 5.8. Properties of velocity space orbits (a) Modify your program to display the orbit in position space and in velocity space at the same time. Verify that the velocity space orbit is a circle, even if the orbit in position space CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 123 is an ellipse. Does the center of this circle coincide with the origin (vx,vy) = (0,0) in velocity space? Choose the same initial conditions that you considered in Problems 5.2 and 5.3. (b)∗ Let u denote the radius vector of a point on the velocity circle and w denote the vector from the origin in velocity space to the center of the velocity circle (see Figure 5.5). Then the velocity of the particle can be written as v = u + w. (5.23) Compute u and verify that its magnitude is given by u = GMm/L (5.24) where L is the magnitude of the angular momentum. Note that L is proportional to m so that it is not necessary to know the magnitude of m. (c)∗ Verify that at each moment in time, the planet’s position vector r is perpendicular to u. Explain why this relation holds. Problem 5.9. Effect of impulses in velocity space How does the velocity space orbit change when an impulsive kick is applied in the tangential or in the radial direction? How do the magnitude and direction of w change? From the observed change in the velocity orbit and the above considerations, explain the observed change of the orbit in position space. 5.9 A Mini-Solar System So far our study of planetary orbits has been restricted to two-body central forces. However, the solar system is not a two-body system, because the planets exert gravitational forces on one another. Although the interplanetary forces are small in magnitude in comparison to the gravitational force of the sun, they can produce measurable effects. For example, the existence of Neptune was conjectured on the basis of a discrepancy between the experimentally measured orbit of Uranus and the predicted orbit calculated from the known forces. The presence of other planets implies that the total force on a given planet is not a central force. 
Furthermore, because the orbits of the planets are not exactly in the same plane, an analysis of the solar system must be extended to three dimensions if accurate calculations are required. However, for simplicity, we will consider a model of a two-dimensional solar system with two planets in orbit about a fixed sun.

The equations of motion of two planets of mass m1 and mass m2 can be written in vector form as (see Figure 5.6)

\[ m_1 \frac{d^2\mathbf{r}_1}{dt^2} = -\frac{GM m_1}{r_1^{3}}\,\mathbf{r}_1 + \frac{G m_1 m_2}{r_{21}^{3}}\,\mathbf{r}_{21} \tag{5.25a} \]
\[ m_2 \frac{d^2\mathbf{r}_2}{dt^2} = -\frac{GM m_2}{r_2^{3}}\,\mathbf{r}_2 - \frac{G m_1 m_2}{r_{21}^{3}}\,\mathbf{r}_{21} , \tag{5.25b} \]

where r1 and r2 are directed from the sun to planets 1 and 2, respectively, and r21 = r2 − r1 is the vector from planet 1 to planet 2.

Figure 5.6: The coordinate system used in (5.25). Planets of mass m1 and m2 orbit a sun of mass M.

It is convenient to divide (5.25a) by m1 and (5.25b) by m2 and to write the equations of motion as

\[ \frac{d^2\mathbf{r}_1}{dt^2} = -\frac{GM}{r_1^{3}}\,\mathbf{r}_1 + \frac{G m_2}{r_{21}^{3}}\,\mathbf{r}_{21} \tag{5.26a} \]
\[ \frac{d^2\mathbf{r}_2}{dt^2} = -\frac{GM}{r_2^{3}}\,\mathbf{r}_2 - \frac{G m_1}{r_{21}^{3}}\,\mathbf{r}_{21} . \tag{5.26b} \]

A numerical solution of (5.26) can be obtained by a straightforward extension of the Planet class, as shown in Listing 5.5. To simplify the drawing of the particle trajectories, the Planet2 class defines an inner class, Mass, which extends Circle and contains a Trail. Whenever a planet moves, a point is added to the trail so that its location and path are shown on the plot. Inner classes are an organizational convenience that save us the trouble of having to create another file, which in this case would be named Mass.java. When we compile the Planet2 class, we will produce a bytecode file named Planet2$Mass.class in addition to the file Planet2.class. Inner classes are most effective as short helper classes that work in conjunction with the containing class, because they have access to all the data (including private variables) in the containing class.

Listing 5.5: A class that implements the rate equation for two interacting planets acted on by an inverse-square law force.

package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Planet2 implements Drawable, ODE {
  // GM in units of (AU)^3/(yr)^2
  final static double GM = 4*Math.PI*Math.PI;
  final static double GM1 = 0.04*GM;
  final static double GM2 = 0.001*GM;
  double[] state = new double[9];
  ODESolver odeSolver = new RK45MultiStep(this);
  Mass mass1 = new Mass(), mass2 = new Mass();

  public void doStep() {
    odeSolver.step();
    mass1.setXY(state[0], state[2]);
    mass2.setXY(state[4], state[6]);
  }

  public void draw(DrawingPanel panel, Graphics g) {
    mass1.draw(panel, g);
    mass2.draw(panel, g);
  }

  void initialize(double[] initState) {
    System.arraycopy(initState, 0, state, 0, initState.length);
    mass1.clear(); // clears data from the old trail
    mass2.clear();
    mass1.setXY(state[0], state[2]);
    mass2.setXY(state[4], state[6]);
  }

  public void getRate(double[] state, double[] rate) {
    // state[]: x1, vx1, y1, vy1, x2, vx2, y2, vy2, t
    double r1Squared = (state[0]*state[0]) + (state[2]*state[2]);
    double r1Cubed = r1Squared*Math.sqrt(r1Squared);
    double r2Squared = (state[4]*state[4]) + (state[6]*state[6]);
    double r2Cubed = r2Squared*Math.sqrt(r2Squared);
    double dx = state[4] - state[0]; // x12 separation
    double dy = state[6] - state[2]; // y12 separation
    double dr2 = (dx*dx) + (dy*dy);  // r12 squared
    double dr3 = Math.sqrt(dr2)*dr2; // r12 cubed
    rate[0] = state[1]; // x1 rate
    rate[2] = state[3]; // y1 rate
    rate[4] = state[5]; // x2 rate
    rate[6] = state[7]; // y2 rate
    rate[1] = ((-GM*state[0])/r1Cubed) + ((GM1*dx)/dr3); // vx1 rate
    rate[3] = ((-GM*state[2])/r1Cubed) + ((GM1*dy)/dr3); // vy1 rate
    rate[5] = ((-GM*state[4])/r2Cubed) - ((GM2*dx)/dr3); // vx2 rate
    rate[7] = ((-GM*state[6])/r2Cubed) - ((GM2*dy)/dr3); // vy2 rate
    rate[8] = 1; // time rate
  }

  public double[] getState() {
    return state;
  }

  class Mass extends Circle {
    Trail trail = new Trail();

    public void draw(DrawingPanel panel, Graphics g) {
      trail.draw(panel, g);
      super.draw(panel, g);
    }

    void clear() {
      trail.clear();
    }

    public void setXY(double x, double y) {
      super.setXY(x, y);
      trail.addPoint(x, y);
    }
  }
}

The target application, Planet2App, extends AbstractSimulation in the usual way. Because it is almost identical to Listing 5.2, it is not shown here. The complete program is available in the ch05 package.

Problem 5.10. Planetary perturbations
Use Planet2App with the initial conditions given in the program. For illustrative purposes, we have adopted the numerical values m1/M = 10⁻³ and m2/M = 4 × 10⁻², and hence GM1 = (m2/M)GM = 0.04GM and GM2 = (m1/M)GM = 0.001GM. What would be the shape of the orbits and the periods of the two planets if they did not mutually interact? What is the qualitative effect of their mutual interaction? Describe the shape of the two orbits. Why is one planet affected more by their mutual interaction than the other? Are the angular momentum and the total energy of planet one conserved? Are the total energy and total angular momentum of the two planets conserved? A related but more time consuming problem is given in Project 5.18.

Problem 5.11. Double stars
Another interesting dynamical system consists of one planet orbiting about two fixed stars of equal mass. In this case there are no closed orbits, but the orbits can be classified as either stable or unstable. Stable orbits may be open loops that encircle both stars, figure eights, or orbits that encircle only one star. Unstable orbits will eventually collide with one of the stars. Modify Planet2 to simulate the double-star system, with the first star located at (−1, 0) and the second star of equal mass located at (1, 0). Place the planet at (0.1, 1) and systematically vary the x and y components of the velocity to obtain different types of orbits. Then try other initial positions.
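For Problem 5.11 the sun's single attraction is replaced by the attraction of the two fixed stars. A minimal sketch of the modified rate equation follows; the state layout {x, vx, y, vy, t} and the value GM = 4π² for each star are our assumptions, consistent with the units used in this section.

public void getRate(double[] state, double[] rate) {
  // state[]: x, vx, y, vy, t for the orbiting planet
  double GM = 4*Math.PI*Math.PI;
  double dx1 = state[0] + 1, dy1 = state[2]; // displacement from the star at (-1, 0)
  double dx2 = state[0] - 1, dy2 = state[2]; // displacement from the star at (+1, 0)
  double r1Cubed = Math.pow(dx1*dx1 + dy1*dy1, 1.5);
  double r2Cubed = Math.pow(dx2*dx2 + dy2*dy2, 1.5);
  rate[0] = state[1];
  rate[2] = state[3];
  rate[1] = -GM*dx1/r1Cubed - GM*dx2/r2Cubed; // both stars attract the planet
  rate[3] = -GM*dy1/r1Cubed - GM*dy2/r2Cubed;
  rate[4] = 1; // time
}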
5.10 Two-Body Scattering

Much of our understanding of the structure of matter comes from scattering experiments. In this section we explore one of the more difficult concepts in the theory of scattering, the differential cross section. A typical scattering experiment involves a beam of many incident particles, all with the same kinetic energy. The coordinate system is shown in Figure 5.7. The incident particles come from the left with an initial velocity v in the +x direction. We take the center of the beam and the center of the target to be on the x-axis. The impact parameter b is the perpendicular distance from the initial trajectory to a parallel line through the center of the target (see Figure 5.7). We assume that the width of the beam is larger than the size of the target. The target contains many scattering centers, but for calculational purposes we may consider scattering off only one particle if the target is sufficiently thin. When an incident particle comes close to the target, it is deflected. In a typical experiment, the scattered particles are counted in a detector that is far from the target. The final velocity of the scattered particles is v′, and the angle between v and v′ is the scattering angle θ.

Figure 5.7: The coordinate system used to define the differential scattering cross section. Particles passing through the beam area 2πb db are scattered into the solid angle dΩ.

Let us assume that the scattering is elastic and that the target is much more massive than the beam particles, so that the target can be considered fixed. (The latter condition can be relaxed by using center of mass coordinates.) We also assume that no incident particle is scattered more than once. These considerations imply that the initial speed and final speed of the incident particles are equal. The functional dependence of θ on b depends on the force on the beam particles due to the target.

In a typical experiment, the number of particles in an angular region between θ and θ + dθ is detected for many values of θ. These detectors measure the number of particles scattered into the solid angle dΩ = sinθ dθ dφ centered about θ. The differential cross section σ(θ) is defined by the relation

\[ \frac{dN}{N} = n\,\sigma(\theta)\,d\Omega , \tag{5.27} \]

where dN is the number of particles scattered into the solid angle dΩ centered about θ and the azimuthal angle φ, N is the total number of particles in the beam, and n is the target density, defined as the number of targets per unit area. The interpretation of (5.27) is that the fraction of particles scattered into the solid angle dΩ is proportional to dΩ and to the density of the target. From (5.27) we see that σ(θ) can be interpreted as the effective area of a target particle for the scattering of an incident particle into the element of solid angle dΩ. Particles that are not scattered are ignored. Another way of thinking about σ(θ) is that it is the ratio of the area b db dφ to the solid angle dΩ = sinθ dθ dφ, where b db dφ is the infinitesimal cross-sectional area of the beam that scatters into the solid angle defined by θ to θ + dθ and φ to φ + dφ. The alternative notation for the differential cross section, dσ/dΩ, comes from this interpretation.

To do an analytic calculation of σ(θ), we write

\[ \sigma(\theta) = \frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| . \tag{5.28} \]

We see from (5.28) that the analytic calculation of σ(θ) requires b as a function of θ, or more precisely, how b changes to give scattering through an infinitesimally larger angle θ + dθ. As a check on (5.28), consider elastic scattering from a hard sphere of radius R, for which geometry gives b = R cos(θ/2); then (5.28) yields σ(θ) = R²/4, independent of θ.

In a scattering experiment, particles enter from the left (see Figure 5.7) with random values of the impact parameter b and azimuthal angle φ, and the number of particles scattered into the various detectors is measured. In our simulation, we know the value of b, and we can integrate CHAPTER 5.
FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 128 Newton’s equations of motion to find the angle at which the incident particle is scattered. Hence, in contrast to the analytic calculation, a simulation naturally yields θ as a function of b. Because the differential cross section is usually independent of φ, we need to consider beam particles only at φ = 0. We have to take into account the fact that in a real beam, there are more particles at some values of b than at others. That is, the number of particles in a real beam is proportional to 2πb∆b, the area of the ring between b and b+∆b, where we have integrated over the values of φ to obtain the factor of 2π. Here ∆b is the interval between the values of b used in the program. Because there is only one target in the beam, the target density is n = 1/(πR2). The scattering program requires the Scatter, ScatterAnalysis, and ScatterApp classes. The ScatterApp class in Listing 5.6 organizes the startup process and creates the visualizations. As usual, it extends AbstractSimulation by overriding the doStep method. However, in this case a single step is not a time step. A step calculates a trajectory and scattering angle for the given impact parameter. After a trajectory is calculated, the impact parameter is incremented and the panel is repainted. If necessary, you can eliminate this visualization to increase the computational speed. If the new impact parameter exceeds the beam radius bmax, the animation is stopped and the accumulated data is analyzed. Note that the calculateTrajectory method returns true if the calculation succeeded and that an error message is printed if the calculation fails. Including a failsafe mechanism to stop a computation is good programming practice. Listing 5.6: A program that calculates the scattering trajectories and computes the differential cross section. public class ScatterApp extends AbstractSimulation { PlotFrame frame = new PlotFrame ( "x" , "y" , "Trajectories" ) ; ScatterAnalysis analysis = new ScatterAnalysis ( ) ; Scatter t r a j e c t o r y = new Scatter ( ) ; double vx ; / / speed of the i n c i d e n t p a r t i c l e double b , db ; / / impact parameter and increment double bmax ; / / maximum impact parameter / Constructs ScatterApp . / public ScatterApp ( ) { frame . setPreferredMinMax ( −5 , 5 , −5, 5 ) ; frame . setSquareAspect ( true ) ; } public void doStep ( ) { i f ( t r a j e c t o r y . calculateTrajectory ( frame , b , vx ) ) { analysis . d e t e c t P a r t i c l e (b , t r a j e c t o r y . getAngle ( ) ) ; } else { control . println ( "Trajectory did not converge at b = "+b ) ; } frame . setMessage ( "b = "+decimalFormat . format (b ) ) ; b += db ; / / i n c r e a s e s the impact parameter frame . repaint ( ) ; i f (b>bmax) { control . calculationDone ( "Maximum impact parameter reached" ) ; analysis . plotCrossSection (b ) ; } } CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 129 public void i n i t i a l i z e ( ) { vx = control . getDouble ( "vx" ) ; bmax = control . getDouble ( "bmax" ) ; db = control . getDouble ( "db" ) ; b = db/2; / / s t a r t s b at average value of f i r s t i n t e r v a l 0−>db / / b w i l l increment to 3 db /2 , 5 db /2 , 7 db /2 , . . . frame . setMessage ( "b = 0" ) ; frame . clearDrawables ( ) ; / / removes old t r a j e c t o r i e s analysis . clear ( ) ; } public void reset ( ) { control . setValue ( "vx" , 3 ) ; control . setValue ( "bmax" , 0 . 2 5 ) ; control . setValue ( "db" , 0 . 
0 1 ) ; i n i t i a l i z e ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl . createApp (new ScatterApp ( ) ) ; } } The Scatter class shown in Listing 5.7 calculates the trajectories by expressing the equation of motion as a rate equation. The most important method is calculateTrajectory, which calculates a trajectory by stepping the differential equation solver and adding the resulting data to a trail to display the path. Because the beam source is far away, we stop the calculation when the distance of the scattered particle from the target exceeds the initial distance. Note the use of the ternary ?: operator. This very efficient and compact operator uses three expressions. The first expression evaluates to a boolean. If this expression is true, then the statement after the ? is executed. If this expression is false, then the statement after the : is executed. However, because some potentials may trap particles for long periods of time, we also stop the calculation after a predetermined number of time steps. Listing 5.7: A class that models particle scattering using a central force law. package org . opensourcephysics . sip . ch05 ; import java . awt . ; import org . opensourcephysics . display . ; import org . opensourcephysics . frames . ; import org . opensourcephysics . numerics . ; public class Scatter implements ODE { double [ ] s t a t e = new double [ 5 ] ; RK4 odeSolver = new RK4( this ) ; public Scatter ( ) { odeSolver . setStepSize ( 0 . 0 5 ) ; } boolean calculateTrajectory ( PlotFrame frame , double b , double vx ) { s t a t e [ 0 ] = −5.0; / / x s t a t e [ 1 ] = vx ; / / vx s t a t e [ 2 ] = b ; / / y CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 130 s t a t e [ 3 ] = 0; / / vy s t a t e [ 4 ] = 0; / / time Trail t r a i l = new Trail ( ) ; t r a i l . color = Color . red ; frame . addDrawable ( t r a i l ) ; double r2 = ( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ; double count = 0; while ( ( count <=1000)&&((2 r2 ) >(( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ) ) ) { t r a i l . addPoint ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; odeSolver . step ( ) ; count ++; } return count <1000; } private double force ( double r ) { / / Coulomb f o r c e law return ( r==0) ? 0 : (1/ r / r ) ; / / returns 0 i f r = 0 } public void getRate ( double [ ] state , double [ ] rate ) { double r = Math . sqrt ( ( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ) ; double f = force ( r ) ; rate [ 0] = s t a t e [ 1 ] ; rate [ 1] = ( f s t a t e [ 0 ] ) / r ; rate [ 2] = s t a t e [ 3 ] ; rate [ 3] = ( f s t a t e [ 2 ] ) / r ; rate [ 4] = 1; } public double [ ] getState ( ) { return s t a t e ; } double getAngle ( ) { return Math . atan2 ( s t a t e [ 3 ] , s t a t e [ 1 ] ) ; / / / Math . PI ; xx } } The ScatterAnalysis class performs the data analysis. This class creates an array of bins to sort and accumulate the trajectories according to the scattering angle. The values of the scattering angle between 0◦ and 180◦ are divided into bins of width dtheta. To compute the number of particles coming from a ring of radius b, we accumulate the value of b associated with each bin or “detector” and write bins[index] += b (see the detectParticle method), because the number of particles in a ring of radius b is proportional to b. 
The total number of scattered particles is computed in the same way: totalN += b. You might want to increase the number of bins and the range of angles for better resolution.

Listing 5.8: The ScatterAnalysis class accumulates the scattering data and plots the differential cross section.

public class ScatterAnalysis {
  int numberOfBins = 18;
  PlotFrame frame = new PlotFrame("angle", "sigma", "differential cross section");
  double[] bins = new double[numberOfBins];
  double dtheta = Math.PI/(numberOfBins);
  double totalN = 0; // total number of scattered particles

  void clear() {
    for (int i = 0; i < numberOfBins; i++) {
      bins[i] = 0;
    }
    totalN = 0;
  }
  // the detectParticle and plotCrossSection methods described above are omitted here

Because we do not count the beam particles that are not scattered, we set the beam radius equal to a. For forces that are not identically zero, we need to choose a minimum angle for θ such that particles whose scattering angle is less than this minimum are not counted as scattered (see Problem 5.14).

Problem 5.13. Scattering from a model hydrogen atom

(a) Consider a model of the hydrogen atom for which a positively charged nucleus of charge +e is surrounded by a uniformly distributed negative charge of equal magnitude. The spherically symmetric negative charge distribution is contained within a sphere of radius a. It is straightforward to show that the force between a positron of charge +e and this model hydrogen atom is given by

\[ f(r) = \begin{cases} 1/r^2 - r/a^3 & r \le a \\ 0 & r > a . \end{cases} \tag{5.30} \]

We have chosen units such that e²/(4πε₀) = 1, and the mass of the positron is unity. What is the ionization energy in these units? Modify the Scatter class to incorporate this force. Is the force on the positron from the model hydrogen atom purely repulsive? Choose a = 1 and set the beam radius bmax = 1. Use E = 0.125 and ∆t = 0.01. Compute the trajectories for b = 0.25, 0.5, and 0.75 and describe the qualitative nature of the trajectories.

(b) Determine the cross section for E = 0.125. Choose nine bins so that the angular width of a detector is 20°, and let db = 0.1, 0.01, and 0.002. How does the accuracy of your results depend on the number of bins? Determine the differential cross section for different energies and explain its qualitative energy dependence.

(c) What is the value of σT for E = 0.125? Does σT depend on E? The total cross section has units of area, but a point charge does not have an area. To what area does it refer? What would you expect the total cross section to be for scattering from a hard sphere?

(d) Change the sign of the force so that it corresponds to electron scattering. How do the trajectories change? Discuss the change in σ(θ).
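As a sketch of how (5.30) might be incorporated, one can replace the force method of the Scatter class (Listing 5.7); here a = 1 is hard-coded, matching part (a) of Problem 5.13.

// force on a positron from the model hydrogen atom of (5.30), with a = 1
private double force(double r) {
  if (r == 0) return 0;              // avoid division by zero at the origin
  return (r <= 1) ? 1/(r*r) - r : 0; // the neutral atom exerts no force for r > a
}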
Problem 5.14. Rutherford scattering

(a) One of the most famous scattering experiments was performed by Geiger and Marsden, who scattered a beam of alpha particles on a thin gold foil. Based on these experiments, Rutherford deduced that the positive charge of the atom is concentrated in a small region at the center of the atom rather than distributed uniformly over the entire atom. Use a 1/r² force in the class Scatter, compute the trajectories for b = 0.25, 0.5, and 0.75, and describe the trajectories. Choose E = 5 and ∆t = 0.01. The default value of x0, the initial x-coordinate of the beam, is x0 = −5. Is this value reasonable?

(b) For E = 5 determine the cross section with numberOfBins = 18. Choose the beam width bmax = 2. Then vary db (or numberOfBins) and compare the accuracy of your results to the analytic result, for which σ(θ) varies as [sin(θ/2)]⁻⁴. How do your computed results compare with this dependence on θ? If necessary, decrease db. Are your results better or worse at small angles, intermediate angles, or large angles near 180°? Explain.

(c) Because the Coulomb force is long range, there is scattering at all impact parameters. Increase the beam radius and determine if your results for σ(θ) change. What happens to the total cross section as you increase the beam width?

(d) Compute σ(θ) for different values of E and estimate the dependence of σ(θ) on E.

Problem 5.15. Scattering by other potentials

(a) A simple phenomenological form for the effective interaction between electrons in metals is the screened Coulomb (or Thomas–Fermi) potential given by

\[ V(r) = \frac{e^2}{4\pi\epsilon_0 r}\,e^{-r/a} . \tag{5.31} \]

The range of the interaction a depends on the density and temperature of the electrons. The form (5.31) is known as the Yukawa potential in the context of the interaction between nuclear particles and as the Debye potential in the context of classical plasmas. Choose units such that a = 1 and e²/(4πε₀) = 1. Recall that the force is given by f(r) = −dV/dr; in these units, f(r) = (1/r² + 1/r)e^(−r). Incorporate this force law into the class Scatter and compute the dependence of σ(θ) on the energy of the incident particle. Choose the beam width equal to 3. Compare your results for σ(θ) with your results from the Coulomb potential.

(b) Modify the force law in Scatter so that f(r) = 24(2/r¹³ − 1/r⁷). This form of f(r) is used to describe the interactions between simple molecules (see Chapter 8). Describe some typical trajectories and compute the differential cross section for several different energies. Let bmax = 2. What is the total cross section? How do your results change if you vary bmax? Choose a small angle as the minimum scattering angle. How sensitive is the total cross section to this minimum angle? Does the differential cross section vary for any other angles besides the smallest scattering angle?

5.11 Three-body problems

Poincaré showed that it is impossible to obtain an analytic solution for the unrestricted motion of three or more objects interacting under the influence of gravity. However, solutions are known for a few special cases, and it is instructive to study their properties. The ThreeBody class computes the trajectories of three particles of equal mass moving in a plane and interacting under the influence of gravity. Both the physics and the drawing are implemented in the ThreeBody class shown in Listing 5.9. Note that the getRate and computeForce methods compute trajectories for an arbitrary number of masses, and that computeForce uses the arraycopy method to quickly zero the force array. To simplify the drawing of the particle trajectories, the ThreeBody class uses an inner class that extends Circle and contains a Trail.

Listing 5.9: A class that models the dynamics of the three-body problem.

package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*
; public class ThreeBody implements Drawable , ODE { int n = 3; / / number of i n t e r a c t i n g bodies / / s t a t e= { x1 , vx1 , y1 , vy1 , x2 , vx2 , y2 , vy2 , x3 , vx3 , y3 , vy3 , t } double [ ] s t a t e = new double [4 n+1]; double [ ] force = new double [2 n] double [ ] zeros = new double [2 n ] ; ODESolver odeSolver = new RK45MultiStep ( this ) ; Mass mass1 = new Mass ( ) , mass2 = new Mass ( ) , mass3 = new Mass ( ) ; CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 134 public void draw ( DrawingPanel panel , Graphics g ) { mass1 . draw ( panel , g ) ; mass2 . draw ( panel , g ) ; mass3 . draw ( panel , g ) ; } public void doStep ( ) { odeSolver . step ( ) ; mass1 . setXY ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; mass2 . setXY ( s t a t e [ 4 ] , s t a t e [ 6 ] ) ; mass3 . setXY ( s t a t e [ 8 ] , s t a t e [ 1 0 ] ) ; } void i n i t i a l i z e ( double [ ] i n i t S t a t e ) { / / c o p i e s i n i t S t a t e to s t a t e System . arraycopy ( i ni t St a te , 0 , state , 0 , 13); mass1 . clear ( ) ; mass2 . clear ( ) ; mass3 . clear ( ) ; mass1 . setXY ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; mass2 . setXY ( s t a t e [ 4 ] , s t a t e [ 6 ] ) ; mass3 . setXY ( s t a t e [ 8 ] , s t a t e [ 1 0 ] ) ; } void computeForce ( double [ ] s t a t e ) { / / s e t s f o r c e array elements to 0 System . arraycopy ( zeros , 0 , force , 0 , force . length ) ; for ( int i = 0; i 0.25. Show that for the suggested values of r, the iterated values of x do not change after an initial transient; that is, the long time dynamical behavior is period 1. In Appendix 6A we show that for r < 3/4 and for x0 in the interval 0 < x0 < 1, the trajectories approach the stable attractor at x = 1 − 1/4r. The set of initial points that iterate to the attractor is called the basin of the attractor. For the logistic map, the interval 0 < x < 1 is the basin of attraction of the attractor x = 1 − 1/4r. (c) Explore the dynamical properties of (6.5) for r = 0.752, 0.76, 0.8, and 0.862. For r = 0.752 and 0.862, approximately 1000 iterations are necessary to obtain convergent results. Show that if r is greater than 0.75, x oscillates between two values after an initial transient behavior. That is, instead of a stable cycle of period 1 corresponding to one fixed point, the system has a stable cycle of period 2. The value of r at which the single fixed point x∗ splits or bifurcates into two values x1 ∗ and x2 ∗ is r = b1 = 3/4. The pair of x values, x1 ∗ and x2 ∗, form a stable attractor of period 2. (d) What are the stable attractors of (6.5) for r = 0.863 and 0.88? What is the corresponding period? What are the stable attractors and corresponding periods for r = 0.89, 0.891, and 0.8922? Another way to determine the behavior of (6.5) is to plot the values of x as a function of r (see Figure 6.2). The iterated values of x are plotted after the initial transient behavior is discarded. Such a plot is generated by BifurcateApp. For each value of r, the first ntransient values of x are computed but not plotted. Then the next nplot values of x are plotted with the first half with the first half in one color and the second half in another. This process is repeated for a new value of r until the desired range of r values is reached. The magnitude of nplot should be at least as large as the longest period that you wish to observe. BifurcateApp extends AbstractSimulation rather than AbstractCalculation because the calculations can be time consuming. 
For this reason you might want to stop them before they are finished and reset some of the parameters. CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 146 0.0 0.5 1.0 iteratedvaluesofx 0.7 0.8 0.9 1.0 r Figure 6.2: Bifurcation diagram of the logistic map. For each value of r, the iterated values of xn are plotted after the first 1000 iterations are discarded. Note the transition from periodic to chaotic behavior and the narrow windows of periodic behavior within the region of chaos. Listing 6.2: The BifurcateApp program generates a bifurcation plot of the logistic map package org . opensourcephysics . sip . ch06 ; import org . opensourcephysics . controls . ; import org . opensourcephysics . frames . ; public class BifurcateApp extends AbstractSimulation { double r ; / / c o n t r o l parameter double dr ; / / incremental change of r , suggest dr <= 0.01 int ntransient ; / / number of i t e r a t i o n s not p l o t t e d int nplot ; / / number of i t e r a t i o n s p l o t t e d PlotFrame plotFrame = new PlotFrame ( "r" , "x" , "Bifurcation diagram" ) ; public BifurcateApp ( ) { / / small s i z e g i v e s b e t t e r r e s o l u t i o n plotFrame . setMarkerSize (0 , 0 ) ; plotFrame . setMarkerSize (1 , 0 ) ; } public void i n i t i a l i z e ( ) { plotFrame . clearData ( ) ; r = control . getDouble ( "initial r" ) ; dr = control . getDouble ( "dr" ) ; ntransient = control . getInt ( "ntransient" ) ; CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 147 nplot = control . getInt ( "nplot" ) ; } public void doStep ( ) { i f ( r <1.0) { double x = 0 . 5 ; for ( int i = 0; i r∞. Problem 6.3. Chaotic behavior (a) For r > r∞, two initial conditions that are very close to one another can yield very different trajectories after a few iterations. As an example, choose r = 0.91 and consider x0 = 0.5 and 0.5001. How many iterations are necessary for the iterated values of x to differ by more than ten percent? What happens for r = 0.88 for the same choice of seeds? (b) The accuracy of floating point numbers retained on a digital computer is finite. To test the effect of the finite accuracy of your computer, choose r = 0.91 and x0 = 0.5 and compute the trajectory for 200 iterations. Then modify your program so that after each iteration, the operation x = x/10 is followed by x = 10*x. This combination of operations truncates the last digit that your computer retains. Compute the trajectory again and compare your results. Do you find the same discrepancy for r < r∞? (c) What are the dynamical properties for r = 0.958? Can you find other windows of periodic behavior in the interval r∞ < r < 1? 6.3 Period Doubling The results of the numerical experiments that we did in Section 6.2 probably have convinced you that the dynamical properties of a simple, nonlinear deterministic system can be quite complicated. To gain more insight into how the dynamical behavior depends on r, we introduce a simple graphical method for iterating (6.5). In Figure 6.3 we show a graph of f (x) versus x for r = 0.7. A diagonal line corresponding to y = x intersects the curve y = f (x) at the two fixed points x∗ = 0 and x∗ = 9/14 ≈ 0.642857 [see (6.6b)]. If x0 is not a fixed point, we can find the trajectory in the following way. Draw a vertical line from (x = x0,y = 0) to the intersection with the curve y = f (x) at (x0,y0 = f (x0)). Next draw a horizontal line from (x0,y0) to the intersection with the diagonal line at (y0,y0). 
On this diagonal line, y = x, and hence the value of x at this intersection is the first iteration x1 = y0. The second iteration x2 can be found in the same way. From the point (x1, y0), draw a vertical line to the intersection with the curve y = f(x). Keep y fixed at y = y1 = f(x1), and draw a horizontal line until it intersects the diagonal line; the value of x at this intersection is x2. Further iterations can be found by repeating this process.

This graphical method is illustrated in Figure 6.3 for r = 0.7 and x0 = 0.9. If we begin with any x0 (except x0 = 0 and x0 = 1), the iterations converge to the fixed point x∗ ≈ 0.643. It would be a good idea to repeat the procedure shown in Figure 6.3 by hand. For r = 0.7, this fixed point is stable (an attractor of period 1). In contrast, no matter how close x0 is to the fixed point at x = 0, the iterates diverge away from it, and this fixed point is unstable.

How can we explain the qualitative difference between the fixed point at x = 0 and the one at x∗ = 0.642857 for r = 0.7? The local slope of the curve y = f(x) determines the distance moved horizontally each time f is iterated. A slope whose magnitude is greater than unity (a curve steeper than 45°) moves the iterate farther from its initial value. Hence, the criterion for the stability of a fixed point is that the magnitude of the slope of f(x) at the fixed point be less than unity. That is, if |df(x)/dx| at x = x∗ is less than unity, then x∗ is stable; conversely, if |df(x)/dx| at x = x∗ is greater than unity, then x∗ is unstable.

Figure 6.3: Graphical representation of the iteration of the logistic map (6.5) with r = 0.7 and x0 = 0.9. Note that the graphical solution converges to the fixed point x∗ ≈ 0.643.

An inspection of f(x) in Figure 6.3 shows that x = 0 is unstable because the slope of f(x) at x = 0 is greater than unity. In contrast, the magnitude of the slope of f(x) at x = x∗ ≈ 0.643 is less than unity, and this fixed point is stable. In Appendix 6A we show that

x∗ = 0 is stable for 0 < r < 1/4, (6.6a)

and

x∗ = 1 − 1/(4r) is stable for 1/4 < r < 3/4. (6.6b)

These results are easy to check directly: for the logistic map, df/dx = 4r(1 − 2x), so the slope at x∗ = 1 − 1/(4r) is 2 − 4r, whose magnitude is less than unity precisely for 1/4 < r < 3/4. Thus for 0 < r < 3/4, the behavior after many iterations is known.

What happens if r is greater than 3/4? We found in Section 6.2 that if r is slightly greater than 3/4, the fixed point of f becomes unstable and bifurcates into a cycle of period 2. Now x returns to the same value after every second iteration, and the fixed points of f(f(x)) are the stable attractors of f(x). In the following, we write f^(2)(x) = f(f(x)) and f^(n)(x) for the nth iterate of f(x). (Do not confuse f^(n)(x) with the nth derivative of f(x).) For example, the second iterate f^(2)(x) is given by the fourth-order polynomial

\[ f^{(2)}(x) = 4r\bigl[4rx(1-x)\bigr]\bigl[1 - 4rx(1-x)\bigr] = 16r^{2}x\bigl[-4rx^{3} + 8rx^{2} - (1+4r)x + 1\bigr]. \tag{6.7} \]

What happens if we increase r still further? Eventually the magnitude of the slope at the fixed points of f^(2)(x) exceeds unity, and the fixed points of f^(2)(x) become unstable. Now

Figure 6.4: Example of the calculation of f(0.4,0.8,3) using the recursive function defined in GraphicalSolutionApp. The number in each box is the value of the variable iterate. The computer executes code from left to right, and each box represents a copy of the function in the computer's memory.
The input values x = 0.4 and r = 0.8, which are the same in each copy, are not shown. The arrows indicate when a copy is finished and its value is returned to one of the other copies. Notice that the first copy of the function f (3) is the last one to finish. The value of f(x,r,3) = 0.7842. the cycle of f is period 4, and the fixed points of the fourth iterate f (4)(x) = f (2) f (2)(x) = f f f (f (x) are stable. These fixed points also eventually become unstable, and we are led to the phenomena of period doubling that we observed in Problem 6.2. GraphicalSolutionApp implements the graphical analysis of the iterations of f (x). The nth-order iterates are defined in f(x,r,iterate), a recursive method. (The parameter iterate is 1, 2, and 4 for the functions f (x), f (2)(x), and f (4)(x), respectively.) Recursion is an idea that is simple once you understand it, but it can be difficult to grasp initially. Although the method calls itself, the rules for method calls remain the same. Imagine that a recursive method is called. The computer then starts to execute the code in the method, but comes to another call of the same method as itself. At this point the computer stops executing the code of the original method, and makes an exact copy of the method with possibly different input parameters, and starts executing the code in the copy. There are now two possibilities. One is that the computer comes to the end of the copy without another recursive call. In that case the computer deletes the copy of the method and continues executing the code in the original method. The other possibility is that a recursive call is made in the copy, and a third copy is made of the method, and the code in the third copy is now executed. This process continues until the code in all the copies is executed. Every recursive method must have a possibility of reaching the end of the method; otherwise, the program will eventually crash. To understand the method f(x,r,iterate), suppose we want to compute f(0.4,0.8,3). First we write f(0.4,0.8,3) as in Figure 6.4a. Follow the statements within the method until another call to f(0.4,0.8,iterate) occurs. In this case, the call is to f(0.4,0.8,iterate-1) which equals f(0.4,0.8,2). Write f(0.4,0.8,2) above f(0.4,0.8,3) (see Figure 6.4b). When you come to the end of the definition of the method, write down the value of f that is actually returned, and remove the method from the stack by crossing it out (see Figure 6.4d). This returned value for f equals y if iterate > 1, or it is the output of the method for iterate = 1. Continue deleting copies of f as they are finished, until there are no copies left on the paper. The final value of f is the value returned by the computer. Write a short program that defines CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 151 f(x,r,iterate) and prints the value of f(0.4,0.8,3). Is the answer the same as your hand calculation? Listing 6.3: GraphicalSolutionApp displays the graphical solution of the logistic map trajec- tory package org . opensourcephysics . sip . ch06 ; import org . opensourcephysics . controls . ; import org . opensourcephysics . frames . PlotFrame ; public class GraphicalSolutionApp extends AbstractSimulation { PlotFrame plotFrame = new PlotFrame ( "iterations" , "x" , "graphical solution" ) ; double r ; / / c o n t r o l parameter int i t e r a t e ; / / i t e r a t e of f ( x ) double x , y ; double x0 , y0 ; public GraphicalSolutionApp ( ) { plotFrame . setPreferredMinMax (0 , 1 , 0 , 1 ) ; plotFrame . 
setConnected ( true ) ; plotFrame . setXPointsLinked ( true ) ; / / second argument i n d i c a t e s no marker plotFrame . setMarkerShape (2 , 0 ) ; } public void reset ( ) { control . setValue ( "r" , 0 . 8 9 ) ; control . setValue ( "x" , 0 . 2 ) ; plotFrame . setMarkerShape (0 , 0 ) ; control . setAdjustableValue ( "iterate" , 1 ) ; } public void i n i t i a l i z e ( ) { r = control . getDouble ( "r" ) ; x = control . getDouble ( "x" ) ; i t e r a t e = control . getInt ( "iterate" ) ; x0 = x ; y0 = 0; clear ( ) ; } public void startRunning ( ) { i f ( i t e r a t e != control . getInt ( "iterate" ) ) { i t e r a t e = control . getInt ( "iterate" ) ; clear ( ) ; } r = control . getDouble ( "r" ) ; } public void doStep ( ) { y = f ( x , r , i t e r a t e ) ; plotFrame . append (1 , x0 , y0 ) ; plotFrame . append (1 , x0 , y ) ; CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 152 plotFrame . append (1 , y , y ) ; x = x0 = y0 = y ; control . setValue ( "x" , x ) ; } void drawFunction ( ) { int nplot = 200; / / # of points at which function computed double delta = 1.0/ nplot ; double x = 0; double y = 0; for ( int i = 0; i <=nplot ; i ++) { y = f ( x , r , i t e r a t e ) ; plotFrame . append (0 , x , y ) ; x += delta ; } } void drawLine ( ) { / / draws l i n e y = x for ( double x = 0; x<1;x += 0.001) { plotFrame . append (2 , x , x ) ; } } public double f ( double x , double r , int i t e r a t e ) { i f ( iterate >1) { double y = f ( x , r , iterate −1); return 4 r y (1 −y ) ; } else { return 4 r x (1 −x ) ; } } public void clear ( ) { plotFrame . clearData ( ) ; drawFunction ( ) ; drawLine ( ) ; plotFrame . repaint ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl control = SimulationControl . createApp ( new GraphicalSolutionApp ( ) ) ; control . addButton ( "clear" , "Clear" , "Clears the trajectory." ) ; } } Problem 6.4. Qualitative properties of the fixed points (a) Use GraphicalSolutionApp to show graphically that there is a single stable fixed point of f (x) for r < 3/4. It would be instructive to modify the program so that the value of the slope df /dx|x=xn is shown as you step each iteration. At what value of r does the absolute value of this slope exceed unity? Let b1 denote the value of r at which the fixed point of f (x) bifurcates and becomes unstable. Verify that b1 = 0.75. CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 153 (b) Describe the trajectory of f (x) for r = 0.785. Is the fixed point given by x = 1−1/4r stable or unstable? What is the nature of the trajectory if x0 = 1−1/4r? What is the period of f (x) for all other choices of x0? What are the values of the two-point attractor? (c) The function f (x) is symmetrical about x = 1/2 where f (x) is a maximum. What are the qualitative features of the second iterate f (2)(x) for r = 0.785? Is f (2)(x) symmetrical about x = 1/2? For what value of x does f (2)(x) have a minimum? Iterate xn+1 = f (2)(xn) for r = 0.785 and find its two fixed points x1 ∗ and x2 ∗. (Try x0 = 0.1 and x0 = 0.3.) Are the fixed points of f (2)(x) stable or unstable for this value of r? How do these values of x1 ∗ and x2 ∗ compare with the values of the two-point attractor of f (x)? Verify that the slopes of f (2)(x) at x1 ∗ and x2 ∗ are equal. (d) Verify the following properties of the fixed points of f (2)(x). As r is increased, the fixed points of f (2)(x) move apart, and the slope of f (2)(x) at its fixed points decreases. What is the value of r = s2 at which one of the two fixed points of f (2) equals 1/2? 
What is the value of the other fixed point? What is the slope of f^(2)(x) at x = 1/2? What is the slope at the other fixed point? As r is increased further, the slopes at the fixed points become negative. Finally, at r = b2 ≈ 0.8623, the slopes at the two fixed points of f^(2)(x) equal −1, and the two fixed points of f^(2) become unstable. (The exact value is b2 = (1 + √6)/4.)

(e) Show that for r slightly greater than b2, for example, r = 0.87, there are four stable fixed points of f^(4)(x). What is the value of r = s3 when one of the fixed points equals 1/2? What are the values of the three other fixed points at r = s3?

(f) Determine the value of r = b3 at which the four fixed points of f^(4) become unstable.

(g) Choose r = s3 and determine the number of iterations that are necessary for the trajectory to converge to period 4 behavior. How does this number of iterations change when neighboring values of r are considered? Choose several values of x0 so that your results do not depend on the initial conditions.

Problem 6.5. Periodic windows in the chaotic regime

(a) If you look closely at the bifurcation diagram in Figure 6.2, you will see that the range of chaotic behavior for r > r∞ is interrupted by intervals of periodic behavior. Magnify your bifurcation diagram so that you can look at the interval 0.957107 ≤ r ≤ 0.960375, where a periodic trajectory of period 3 occurs. (Period 3 behavior starts at r = (1 + √8)/4.) What happens to the trajectory for slightly larger r, for example, r = 0.9604?

(b) Plot f^(3)(x) versus x at r = 0.96, a value of r in the period 3 window. Draw the line y = x and determine the intersections with f^(3)(x). The stable fixed points satisfy the condition x∗ = f^(3)(x∗). Because f^(3)(x) is an eighth-order polynomial, there are eight solutions (including x = 0). Find the intersections of f^(3)(x) with y = x and identify the three stable fixed points. What are the slopes of f^(3)(x) at these points? Then decrease r to r = 0.957107, the (approximate) value of r below which the system is chaotic. Draw the line y = x and determine the number of intersections with f^(3)(x). Note that at this value of r, the curve y = f^(3)(x) is tangent to the diagonal line at the three stable fixed points. For this reason, this type of transition is called a tangent bifurcation. Note that there is also an unstable fixed point at x ≈ 0.76.

(c) Plot xn+1 = f^(3)(xn) versus n for r = 0.9571, a value of r just below the onset of period 3 behavior. How would you describe the behavior of the trajectory? This type of chaotic motion is an example of intermittency; that is, nearly periodic behavior interrupted by occasional irregular bursts.

(d) To understand the mechanism for the intermittent behavior, we need to "zoom in" on the values of x near the stable fixed points that you found in part (c). To do so, change the arguments of the setPreferredMinMax method. You will see a narrow channel between the diagonal line y = x and the plot of f^(3)(x) near each fixed point. The trajectory can require many iterations to squeeze through the channel, and we see apparent period 3 behavior during this time. Eventually, the trajectory escapes from the channel and bounces around until it again enters a channel at some unpredictable later time.
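A bare-bones iteration loop is all that time-series explorations like Problem 6.5(c) require. The following minimal sketch (the class name is ours, not one from the text) iterates the logistic map (6.5) and prints the trajectory; with r = 0.9571 the intermittent behavior is visible in the output.

public class LogisticMapApp {
  public static void main(String[] args) {
    double r = 0.9571; // just below the period 3 window
    double x = 0.5;    // seed
    for (int n = 0; n < 500; n++) {
      x = 4*r*x*(1 - x); // the logistic map (6.5)
      System.out.println(n + "\t" + x);
    }
  }
}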
6.4 Universal Properties and Self-Similarity

In Sections 6.2 and 6.3 we found that the trajectory of the logistic map has remarkable properties as a function of the control parameter r. In particular, we found a sequence of period doublings accumulating in a chaotic trajectory of infinite period at r = r∞. For most values of r > r∞, the trajectory is very sensitive to the initial conditions. We also found "windows" of period 3, 6, 12, ... embedded in the range of chaotic behavior. How typical is this type of behavior? In the following, we will find further numerical evidence that the general behavior of the logistic map is independent of the details of the form (6.5) of f(x).

You might have noticed that the range of r between successive bifurcations becomes smaller as the period increases (see Table 6.1). For example, b2 − b1 = 0.112398, b3 − b2 = 0.023624, and b4 − b3 = 0.00508. A good guess is that the decrease in bk − bk−1 is geometric; that is, the ratio (bk − bk−1)/(bk+1 − bk) is a constant. You can check that this ratio is not exactly constant, but converges to a constant with increasing k.

k    b_k
1    0.750000
2    0.862372
3    0.886023
4    0.891102
5    0.892190
6    0.892423
7    0.892473
8    0.892484

Table 6.1: Values of the control parameter r = bk for the onset of the kth bifurcation. Six decimal places are shown.

This behavior suggests that the sequence of values of bk has a limit and follows a geometric progression:

\[ b_k \approx r_\infty - C\delta^{-k} , \tag{6.8} \]

where δ is known as the Feigenbaum number and C is a constant. From (6.8) it is easy to show that δ is given by the ratio

\[ \delta = \lim_{k\to\infty} \frac{b_k - b_{k-1}}{b_{k+1} - b_k} . \tag{6.9} \]

Figure 6.5: The first few bifurcations of the logistic equation showing the scaling of the maximum distance Mk between the asymptotic values of x describing the bifurcation.
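Before plotting δk in Problem 6.6 below, it may help to see how little code (6.9) requires. The following sketch (the class name is ours) tabulates the successive ratios directly from the entries of Table 6.1; the ratios approach approximately 4.67 until the six-decimal rounding of bk limits the accuracy at large k.

public class FeigenbaumDeltaApp {
  public static void main(String[] args) {
    // bifurcation values b_1 ... b_8 from Table 6.1
    double[] b = {0.750000, 0.862372, 0.886023, 0.891102,
                  0.892190, 0.892423, 0.892473, 0.892484};
    for (int k = 1; k < b.length - 1; k++) {
      // the ratio in (6.9), using 0-based array indices
      double delta = (b[k] - b[k-1])/(b[k+1] - b[k]);
      System.out.println("k = " + (k + 1) + "  delta = " + delta);
    }
  }
}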
We can associate another number with the series of “pitchfork” bifurcations. From Figures 6.3 and 6.5, we see that each pitchfork bifurcation gives birth to “twins” with the new generation more densely packed than the previous generation. One measure of this density is the maximum distance M_k between the values of x describing the bifurcation (see Figure 6.5). The disadvantage of using M_k is that the transient behavior of the trajectory is very long at the boundary between two different periodic behaviors. A more convenient measure of the distance is the quantity d_k = x_k* − 1/2, where x_k* is the value of the fixed point nearest to the fixed point x* = 1/2. The first two values of d_k are shown in Figure 6.6, with d_1 ≈ 0.3090 and d_2 ≈ −0.1164. The next value is d_3 ≈ 0.0460. Note that the fixed point nearest to x = 1/2 alternates from one side of x = 1/2 to the other. We define the quantity α by the ratio

α = lim_{k→∞} (−d_k/d_{k+1}).    (6.11)

Figure 6.6: The quantity d_k is the distance from x* = 1/2 to the nearest element of the attractor of period 2^k. It is convenient to use this quantity to determine the exponent α. (The axes are r and y, with the distances d_1 and d_2 marked.)

The ratios α = 0.3090/0.1164 = 2.65 for k = 1 and α = 0.1164/0.0460 = 2.53 for k = 2 are consistent with the asymptotic value α = 2.5029078750958928485... .

We now give qualitative arguments that suggest that the general behavior of the logistic map in the period doubling regime is independent of the detailed form of f(x). As we have seen, period doubling is characterized by self-similarities; for example, the period doublings look similar except for a change of scale. We can demonstrate these similarities by comparing f(x) for r = s_1 = 0.5 for the superstable trajectory with period 1 to the function f^(2)(x) for r = s_2 ≈ 0.809017 for the superstable trajectory of period 2 (see Figure 6.7). The function f(x, r = s_1) has unstable fixed points at x = 0 and x = 1 and a stable fixed point at x = 1/2. Similarly, the function f^(2)(x, r = s_2) has a stable fixed point at x = 1/2 and an unstable fixed point at x ≈ 0.69098. Note the similar shape but different scale of the curves in the square boxes in part (a) and part (b) of Figure 6.7. This similarity is an example of scaling. That is, if we scale f^(2) and change (renormalize) the value of r, we can compare f^(2) to f. (See Chapter 12 for a discussion of scaling and renormalization in another context.)

Figure 6.7: Comparison of f(x, r) for r = s_1 with the second iterate f^(2)(x) for r = s_2. (a) The function f(x, r = s_1) has unstable fixed points at x = 0 and x = 1 and a stable fixed point at x = 1/2. (b) The function f^(2)(x, r = s_2) has a stable fixed point at x = 1/2. The unstable fixed point of f^(2)(x) nearest to x = 1/2 occurs at x ≈ 0.69098, where the curve f^(2)(x) intersects the line y = x. The upper right-hand corner of the square box in (b) is located at this point, and the center of the box is at (1/2, 1/2). Note that if we reflect this square about the point (1/2, 1/2), the shape of the reflected graph in the square box is nearly the same as it is in part (a) but on a smaller scale.

This graphical comparison is meant only to be suggestive. A precise approach shows that if we continue the comparison of the higher-order iterates, for example, f^(4)(x) to f^(2)(x), etc., the superposition of functions converges to a universal function that is independent of the form of the original function f(x).
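The quantities d_k can be computed directly, because the superstable cycle at r = s_{k+1} contains x = 1/2. A minimal plain Java sketch (using the values of s_2 and s_3 quoted above) iterates from x = 1/2, finds the cycle element nearest to 1/2, and forms the ratio in (6.11):

public class AlphaSketch {
  public static void main(String[] args) {
    double[] s = {0.809017, 0.874640}; // s_2 and s_3 from the text
    double[] d = new double[2];
    for(int k = 0; k<2; k++) {
      double r = s[k];
      int period = 2<<k; // the superstable cycle at s_{k+1} has period 2^(k+1): 2, then 4
      double x = 0.5, nearest = 2; // 2 is farther from 1/2 than any iterate
      for(int i = 1; i<period; i++) { // the cycle returns to 1/2 at step 'period'
        x = 4*r*x*(1-x);
        if(Math.abs(x-0.5)<Math.abs(nearest-0.5)) {
          nearest = x;
        }
      }
      d[k] = nearest-0.5; // the signed distance d_k
      System.out.println("d_"+(k+1)+" = "+d[k]);
    }
    System.out.println("alpha estimate = "+(-d[0]/d[1])); // eq. (6.11) with k = 1
  }
}

The output, d_1 ≈ 0.309 and d_2 ≈ −0.116, reproduces the values quoted above and gives α ≈ 2.65.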
Problem 6.7. Further determinations of the exponents α and δ

(a) Determine the appropriate scaling factor and superimpose f and the rescaled form of f^(2) shown in Figure 6.7.

(b) Use arguments similar to those discussed in the text and in Figure 6.7 and compare the behavior of f^(4)(x, r = s_3) in the square about x = 1/2 with f^(2)(x, r = s_2) in its square about x = 1/2. The sizes of the squares are determined by the unstable fixed point nearest to x = 1/2. Find the appropriate scaling factor and superimpose f^(2) and the rescaled form of f^(4).

∗Problem 6.8. Other one-dimensional maps

It is easy to modify your programs to consider other one-dimensional maps (a minimal iteration sketch follows this problem). Determine the qualitative properties of the one-dimensional maps

f(x) = x e^{r(1−x)}    (6.12)
f(x) = r sin πx.    (6.13)

Do they also exhibit the period doubling route to chaos? The map in (6.12) has been used by ecologists (cf. May) to study a population that is limited at high densities by the effect of epidemics. Although it is more complicated than (6.5), its advantage is that the population remains positive no matter what (positive) value is taken for the initial population. There are no restrictions on the maximum value of r, but if r becomes sufficiently large, x eventually becomes effectively zero. What is the behavior of the time series of (6.12) for r = 1.5, 2, and 2.7? Describe the qualitative behavior of f(x). Does it have a maximum? The sine map (6.13) with 0 < r ≤ 1 and 0 ≤ x ≤ 1 has no special significance, except that it is nonlinear. If time permits, determine the approximate value of δ for both maps. What limits the accuracy of your determination of δ?
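A convenient way to organize Problem 6.8 is to pass the map itself as a parameter. The following minimal plain Java sketch does so with a lambda; the choices r = 2.7 and x_0 = 0.1 are arbitrary illustrative values:

import java.util.function.DoubleUnaryOperator;

public class OtherMapsSketch {
  public static void main(String[] args) {
    double r = 2.7;
    DoubleUnaryOperator ecologyMap = x -> x*Math.exp(r*(1-x)); // eq. (6.12)
    // DoubleUnaryOperator sineMap = x -> r*Math.sin(Math.PI*x); // eq. (6.13), requires 0 < r <= 1
    double x = 0.1;
    for(int n = 0; n<100; n++) {
      x = ecologyMap.applyAsDouble(x);
      System.out.println(n+"\t"+x);
    }
  }
}

Swapping in sineMap (with an appropriate value of r) changes the map without touching the iteration loop, which makes it easy to compare the two period doubling sequences.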
The above qualitative arguments and numerical results suggest that the quantities α and δ are universal; that is, independent of the detailed form of f(x). In contrast, the values of the accumulation point r_∞ and the constant C in (6.8) depend on the detailed form of f(x). Feigenbaum has shown that the period doubling route to chaos and the values of δ and α are universal properties of maps that have a quadratic maximum; that is, f′(x)|_{x=x_m} = 0 and f″(x)|_{x=x_m} < 0.

Why is the universality of period doubling and the numbers δ and α more than a curiosity? The reason is that because this behavior is independent of the details, there might exist realistic systems whose underlying dynamics yield the same behavior as the logistic map. Of course, most physical systems are described by differential rather than difference equations. Can these systems exhibit period doubling behavior? Several workers (cf. Testa et al.) have constructed nonlinear RLC circuits driven by an oscillatory source voltage. The output voltage shows bifurcations, and the measured values of the exponents δ and α are consistent with the predictions of the logistic map.

Of more general interest is the nature of turbulence in fluid systems. Consider a stream of water flowing past several obstacles. We know that at low flow speeds, the water flows past obstacles in a regular and time-independent fashion called laminar flow. As the flow speed is increased (as measured by a dimensionless parameter called the Reynolds number), some swirls develop, but the motion is still time independent. As the flow speed is increased still further, the swirls break away and start moving downstream. The flow pattern as viewed from the bank becomes time-dependent. For still larger flow speeds, the flow pattern becomes very complex and looks random. We say that the flow pattern has made a transition from laminar flow to turbulent flow.

This qualitative description of the transition to chaos in fluid systems is superficially similar to the description of the logistic map. Can fluid systems be analyzed in terms of the simple models of the type we have discussed here? In a few instances, such as turbulent convection in a heated saucepan, period doubling and other types of transitions to turbulence have been observed. The type of theory and analysis we have discussed has suggested new concepts and approaches, and the study of turbulent flow is a subject of much current interest.

6.5 Measuring Chaos

How do we know if a system is chaotic? The most important characteristic of chaos is sensitivity to initial conditions. In Problem 6.3, for example, we found that the trajectories starting from x_0 = 0.5 and x_0 = 0.5001 for r = 0.91 become very different after a small number of iterations. Because computers only store floating point numbers to a certain number of digits, the implication of this result is that our numerical predictions of the trajectories of chaotic systems are restricted to small time intervals. That is, sensitivity to initial conditions implies that even though the logistic map is deterministic, our ability to make numerical predictions of its trajectory is limited.

Figure 6.8: The evolution of the difference ∆x_n between the trajectories of the logistic map at r = 0.91 for x_0 = 0.5 and x_0 = 0.5001. The separation |∆x_n| (plotted logarithmically between 10^{−8} and 10^{−2}) increases with n, the number of iterations, if n is not too large. (Note that |∆x_1| ∼ 10^{−8} and that the trend is not monotonic.)

How can we quantify this lack of predictability? In general, if we start two identical dynamical systems from slightly different initial conditions, we expect that the difference between the trajectories will increase as a function of n. In Figure 6.8 we show a plot of the difference |∆x_n| versus n for the same conditions as in Problem 6.3a. We see that, roughly speaking, ln|∆x_n| is a linearly increasing function of n. This result indicates that the separation between the trajectories grows exponentially if the system is chaotic. This divergence of the trajectories can be described by the Lyapunov exponent λ, which is defined by the relation

|∆x_n| = |∆x_0| e^{λn},    (6.14)

where ∆x_n is the difference between the trajectories at time n. If the Lyapunov exponent λ is positive, then nearby trajectories diverge exponentially. Chaotic behavior is characterized by the exponential divergence of nearby trajectories.

A naive way of measuring the Lyapunov exponent λ is to run the same dynamical system twice with slightly different initial conditions and measure the difference of the trajectories as a function of n. We used this method to generate Figure 6.8. Because the rate of separation of the trajectories might depend on the choice of x_0, a better method would be to compute the rate of separation for many values of x_0. This method would be tedious because we would have to fit the separation to (6.14) for each value of x_0 and then determine an average value of λ. A more important limitation of the naive method is that because the trajectory is restricted to the unit interval, the separation |∆x_n| ceases to increase when n becomes sufficiently large.
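A minimal sketch of this naive measurement for the logistic map (plain Java; r = 0.91 and the two seeds match the conditions of Figure 6.8):

public class NaiveLyapunovSketch {
  public static void main(String[] args) {
    double r = 0.91;
    double x = 0.5, xt = 0.5001; // two nearby initial conditions
    double dx0 = xt-x;
    for(int n = 1; n<=50; n++) {
      x = 4*r*x*(1-x);
      xt = 4*r*xt*(1-xt);
      // ln|dx_n/dx_0| should grow roughly linearly with slope lambda
      System.out.println(n+"\t"+Math.log(Math.abs((xt-x)/dx0)));
    }
  }
}

A least squares fit of the early, roughly linear part of this output to (6.14) gives an estimate of λ; for large n the separation saturates because the trajectory is confined to the unit interval.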
Fortunately, there is a better way of determining λ. We take the natural logarithm of both sides of (6.14) and write λ as

λ = (1/n) ln|∆x_n/∆x_0|.    (6.15)

Because we want to use the data from the entire trajectory after the transient behavior has ended, we use the fact that

∆x_n/∆x_0 = (∆x_1/∆x_0)(∆x_2/∆x_1) ··· (∆x_n/∆x_{n−1}).    (6.16)

Hence, we can express λ as

λ = (1/n) Σ_{i=0}^{n−1} ln|∆x_{i+1}/∆x_i|.    (6.17)

The form (6.17) implies that we can interpret x_i for any i as the initial condition. We see from (6.17) that the problem of computing λ has been reduced to finding the ratio ∆x_{i+1}/∆x_i. Because we want to make the initial difference between the two trajectories as small as possible, we are interested in the limit ∆x_i → 0. The idea of the more sophisticated procedure is to compute dx_{i+1}/dx_i from the equation of motion at the same time that the equation of motion is being iterated. We use the logistic map as an example. From (6.5) we have

dx_{i+1}/dx_i = f′(x_i) = 4r(1 − 2x_i).    (6.18)

We can consider x_i for any i as the initial condition and the ratio dx_{i+1}/dx_i as a measure of the rate of change of x_i. Hence, we can iterate the logistic map as before and use the values of x_i and the relation (6.18) to compute f′(x_i) = dx_{i+1}/dx_i at each iteration. The Lyapunov exponent is given by

λ = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} ln|f′(x_i)|,    (6.19)

where we begin the sum in (6.19) after the transient behavior is finished. We have explicitly included the limit n → ∞ in (6.19) to remind ourselves to choose n sufficiently large. Note that this procedure weights the points on the attractor correctly; that is, if a particular region of the attractor is not visited often by the trajectory, it does not contribute much to the sum in (6.19).

Figure 6.9: The Lyapunov exponent calculated using the method in (6.19) as a function of the control parameter r. Compare the behavior of λ to the bifurcation diagram in Figure 6.2. Note that λ < 0 for r < 3/4 and approaches zero at a period doubling bifurcation. A negative spike corresponds to a superstable trajectory. The onset of chaos is visible near r = 0.892, where λ first becomes positive. For r > 0.892, λ generally increases except for dips below zero whenever a periodic window occurs, for example, the dip due to the period 3 window near r = 0.96. For each value of r, the first 1000 iterations were discarded, and 10^5 values of ln|f′(x_n)| were used to determine λ.
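A minimal sketch of this procedure for the logistic map (plain Java; the range of r, the transient of 1000 iterations, and the 10^5 retained terms follow the parameters quoted in the caption of Figure 6.9):

public class LyapunovSketch {
  public static void main(String[] args) {
    for(int j = 0; j<25; j++) {
      double r = 0.76+0.01*j; // r from 0.76 to 1.0
      double x = 0.6; // arbitrary seed
      for(int i = 0; i<1000; i++) {
        x = 4*r*x*(1-x); // discard the transient
      }
      double sum = 0;
      int n = 100000;
      for(int i = 0; i<n; i++) {
        sum += Math.log(Math.abs(4*r*(1-2*x))); // ln|f'(x_i)|, eq. (6.18)
        x = 4*r*x*(1-x);
      }
      System.out.println(r+"\t"+sum/n); // estimate of lambda, eq. (6.19)
    }
  }
}

Plotting this output reproduces the qualitative features of Figure 6.9: negative λ in the periodic regime, zeros at the bifurcations, and positive λ beyond the onset of chaos.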
Problem 6.9. Lyapunov exponent for the logistic map

(a) Modify IterateMapApp to compute the Lyapunov exponent λ for the logistic map using the naive approach. Choose r = 0.91, x_0 = 0.5, and ∆x_0 = 10^{−6}, and plot ln|∆x_n/∆x_0| versus n. What happens to ln|∆x_n/∆x_0| for large n? Determine λ for r = 0.91, r = 0.97, and r = 1.0. Does your result for λ for each value of r depend significantly on your choice of x_0 or ∆x_0?

(b) Modify BifurcateApp to compute λ using the algorithm discussed in the text for r = 0.76 to r = 1.0 in steps of ∆r = 0.01. What is the sign of λ if the system is not chaotic? Plot λ versus r and explain your results in terms of the behavior of the bifurcation diagram shown in Figure 6.2. Compare your results for λ with those shown in Figure 6.9. How does the sign of λ correlate with the behavior of the system as seen in the bifurcation diagram? For what value of r is λ a maximum?

(c) In Problem 6.3b we saw that roundoff errors in the chaotic regime make the computation of individual trajectories meaningless. That is, if the system’s behavior is chaotic, then small roundoff errors are amplified exponentially in time, and the actual numbers we compute for the trajectory starting from a given initial value are not “real.” Repeat your calculation of λ for r = 1 by changing the roundoff error as you did in Problem 6.3b. Does your computed value of λ change? How meaningful is your computation of the Lyapunov exponent? We will encounter a similar question in Chapter 8, where we compute the trajectories of chaotic systems of many particles. We will find that although the “true” trajectories cannot be computed for long times, averages over the trajectories yield meaningful results.

We have found that nearby trajectories diverge if λ > 0. For λ < 0, the two trajectories converge and the system is not chaotic. What happens for λ = 0? In this case we will see that the trajectories diverge algebraically; that is, as a power of n. In some cases a dynamical system is at the “edge of chaos” where the Lyapunov exponent vanishes. Such systems are said to exhibit weak chaos to distinguish their behavior from the strongly chaotic behavior (λ > 0) that we have been discussing. If we define z ≡ |∆x_n|/|∆x_0|, then z will satisfy the differential equation

dz/dn = λz.    (6.20)

For weak chaos we do not find an exponential divergence, but instead a divergence that is algebraic and is given by

dz/dn = λ_q z^q,    (6.21)

where q is a parameter that needs to be determined. The solution to (6.21) is

z = [1 + (1 − q)λ_q n]^{1/(1−q)},    (6.22)

which can be checked by substituting (6.22) into (6.21). In the limit q → 1, we recover the usual exponential dependence.

We can determine the type of chaos using the crude approach of choosing a large number of initial values of x_0 and x_0 + ∆x_0 and plotting the average of ln z versus n. If we do not obtain a straight line, then the system does not exhibit strong chaos. How can we check for the behavior shown in (6.22)? The easiest way is to plot the quantity

(z^{1−q} − 1)/(1 − q)    (6.23)

versus n, which will equal nλ_q if (6.22) is applicable. We explore these ideas in the following problem.

∗Problem 6.10. Measuring weak chaos

(a) Write a program that plots ln z if q = 1, or the quantity (6.23) if q ≠ 1, as a function of n. Your program should have q, |∆x_0|, the number of seeds, and the number of iterations as input parameters. To compare with work by Añaños and Tsallis, use a variation of the logistic map given by

x_{n+1} = 1 − a x_n²,    (6.24)

where |x_n| ≤ 1 and 0 ≤ a ≤ 2. The seeds x_0 should be equally spaced in the interval |x_0| < 1.

(b) Consider strong chaos at a = 2. Choose q = 1, 50 iterations, at least 1000 values of x_0, and |∆x_0| = 10^{−6} (a minimal console sketch follows this problem). Do you obtain a straight line for ln z versus n? Does z_n eventually stop increasing as a function of n? If so, why? Try |∆x_0| = 10^{−12}. How do your results differ and how are they the same? Also iterate ∆x directly:

∆x_{n+1} = x_{n+1} − x̃_{n+1} = −a(x_n² − x̃_n²) = −a(x_n − x̃_n)(x_n + x̃_n) = −a ∆x_n (x_n + x̃_n),    (6.25)

where x_n is the iterate starting at x_0, and x̃_n is the iterate starting at x_0 + ∆x_0. Show that straight lines are not obtained for your plot if q ≠ 1.

(c) The edge of chaos for this map is at a = 1.401155189. Repeat part (a) for this value of a and various values of q. Simulations with 10^5 values of x_0 show that linear behavior is obtained for q ≈ 0.36.
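A minimal sketch of the strong chaos test of part (b) (plain Java; console output in place of the plot, with the seeds equally spaced in |x_0| < 1 as specified in part (a)):

public class WeakChaosSketch {
  public static void main(String[] args) {
    double a = 2.0, dx0 = 1.0e-6; // strong chaos case of Problem 6.10(b)
    int nIter = 50, nSeeds = 1000;
    double[] avgLnZ = new double[nIter+1];
    for(int s = 0; s<nSeeds; s++) {
      double x = -1+2.0*(s+0.5)/nSeeds; // seeds equally spaced in |x_0| < 1
      double xt = x+dx0;                // the perturbed twin trajectory
      for(int n = 1; n<=nIter; n++) {
        x = 1-a*x*x;   // eq. (6.24)
        xt = 1-a*xt*xt;
        avgLnZ[n] += Math.log(Math.abs(xt-x)/dx0); // ln z for this seed
      }
    }
    for(int n = 1; n<=nIter; n++) {
      System.out.println(n+"\t"+avgLnZ[n]/nSeeds); // average of ln z
    }
  }
}

For q ≠ 1 the same loop applies with ln z replaced by the quantity (6.23).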
A system of fixed energy (and number of particles and volume) has an equal probability of being in any microstate specified by the positions and velocities of the particles (see Sec. 15.2). One way of measuring the ability of a system to be in any state is to measure its entropy, defined by

S = −Σ_i p_i ln p_i,    (6.26)

where the sum is over all states, and p_i is the probability or relative frequency of being in the ith state. For example, if the system is always in only one state, then S = 0, the smallest possible entropy. If the system explores all states equally, then S = ln Ω, where Ω is the number of possible states. (You can show this result by letting p_i = 1/Ω.)

∗Problem 6.11. Entropy of the logistic map

(a) Write a program to compute S for the logistic map (see the sketch following Problem 6.12). Divide the interval [0,1] into bins or subintervals of width ∆x = 0.01 and determine the relative number of times the trajectory falls into each bin. At each value of r in the range 0.7 ≤ r ≤ 1, the map should be iterated for a fixed number of steps, for example, n = 1000. What happens to the entropy when the trajectory is chaotic?

(b) Repeat part (a) with n = 10,000. For what values of r does the entropy change significantly? Decrease ∆x to 0.001 and repeat. Does this decrease make a difference?

(c) Plot p_i as a function of x for r = 1. For what value(s) of x is the plot a maximum?

We can also measure the (generalized) entropy as a function of time. As we will see in Problem 6.12, S(n) for strong chaos increases linearly with n until all the possible states are visited. However, for weak chaos this behavior is not found. In the latter case we can generalize the entropy to a q-dependent function defined by

S_q = (1 − Σ_i p_i^q)/(q − 1).    (6.27)

In the limit q → 1, S_q → S. The following problem discusses measuring the entropy for the same system as in Problem 6.10.

∗Problem 6.12. Entropy of weak and strong chaotic systems

(a) Write a program that iterates the map (6.24) and plots S if q = 1, or S_q if q ≠ 1, as a function of n. The input parameters should be q, the number of bins, the number of random seeds in a single bin, and n, the number of iterations. At each iteration compute the entropy. Then average S over the randomly chosen values of the seeds.

(b) Consider strong chaos at a = 2. Choose q = 1, n = 20, ∆x ≤ 0.001, and ten randomly chosen seeds per bin. Do you obtain a straight line for S versus n? Does the curve eventually stop growing? If you decrease ∆x, how do your results differ and how are they the same? Show that S is not a linear function of n if q ≠ 1.

(c) Repeat part (a) with a = 1.401155189 and various values of q. Simulations with 10^5 bins show that linear behavior is obtained for q ≈ 0.36, the same value as for the measurements in Problem 6.10.
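A minimal sketch of the binning computation in Problem 6.11(a) for a single value of r (plain Java; the seed, the transient of 100 iterations, and r = 1.0 are arbitrary illustrative choices):

public class EntropySketch {
  public static void main(String[] args) {
    double r = 1.0;  // a chaotic value of the control parameter
    int nBins = 100; // bins of width 0.01 on [0,1]
    int n = 1000;    // number of retained iterations
    int[] counts = new int[nBins];
    double x = 0.4;
    for(int i = 0; i<100; i++) {
      x = 4*r*x*(1-x); // discard the transient
    }
    for(int i = 0; i<n; i++) {
      x = 4*r*x*(1-x);
      int bin = Math.min((int) (x*nBins), nBins-1);
      counts[bin]++;
    }
    double S = 0;
    for(int c : counts) {
      if(c>0) {
        double p = (double) c/n; // relative frequency of the bin
        S -= p*Math.log(p);      // eq. (6.26)
      }
    }
    System.out.println("S = "+S+", ln Omega = "+Math.log(nBins));
  }
}

Repeating the computation inside a loop over r yields the entropy curve asked for in parts (a) and (b).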
6.6 *Controlling Chaos

The dream of classical physics was that if the initial conditions and all the forces acting on a system were known, then we could predict the future with as much precision as we desire. The existence of chaos has shattered that dream. However, if a system is chaotic, we can still control its behavior with small, but carefully chosen, perturbations of the system. We will illustrate the method for the logistic map. The application of the method to other one-dimensional systems is straightforward, but the extension to higher-dimensional systems is more complicated (cf. Ott, Lai).

Suppose that we want the trajectory to be periodic even though the parameter r is in the chaotic regime. How can we make the trajectory have periodic behavior without drastically changing r or imposing an external perturbation that is so large that the internal dynamics of the map become irrelevant? The key idea is that for any value of r in the chaotic regime, there is an infinite number of trajectories that have unstable periods. This property of the chaotic regime means that if we choose the value of the seed x_0 to be precisely equal to a point on an unstable trajectory with period p, the subsequent trajectory will have this period. However, if we choose a value of x_0 that differs ever so slightly from this special value, the trajectory will not be periodic. Our goal is to make slight perturbations to the system to keep it on the desired unstable periodic trajectory.

The first step is to find the values of x(i), i = 1 to p, that constitute the unstable periodic trajectory. It is an interesting numerical problem to find the values of x(i), and we consider this problem first. To find a fixed point of the map f^(p), we need to find the value of x* such that

g^(p)(x*) ≡ f^(p)(x*) − x* = 0.    (6.28)

The algorithms for finding the solution to (6.28) are called root-finding algorithms. You might have heard of Newton’s method, which we describe in Appendix 6B. Here we use the simplest root-finding algorithm, the bisection method. The algorithm works as follows:

(i) Choose two values x_left and x_right, with x_left < x_right, such that the product g^(p)(x_left) g^(p)(x_right) < 0. Because this product is negative, there must be a value of x such that g^(p)(x) = 0 in the interval [x_left, x_right].

(ii) Choose the midpoint, x_mid = x_left + (1/2)(x_right − x_left) = (1/2)(x_left + x_right), as the guess for x*.

(iii) If g^(p)(x_mid) has the same sign as g^(p)(x_left), then replace x_left by x_mid; otherwise, replace x_right by x_mid. The interval for the location of the root is now reduced.

(iv) Repeat steps (ii) and (iii) until the desired level of precision is achieved.

The following program implements this algorithm for the logistic map. An alternative implementation named FixedPointApp that does not use recursion is not listed, but is available in the ch06 package. One possible problem is that some of the roots of g^(p)(x) = 0 are also roots of g^(p′)(x) = 0 for p′ equal to a factor of p. (For example, if p = 6, then 2 and 3 are factors.) As p increases, it might become more difficult to find a root that is part of a period p trajectory and not part of a shorter period p′ trajectory.

Listing 6.4: The RecursiveFixedPointApp program finds stable and unstable periodic trajectories with the given period using the bisection root-finding algorithm.

package org.opensourcephysics.sip.ch06;
import org.opensourcephysics.controls.*;

public class RecursiveFixedPointApp extends AbstractCalculation {
  double r; // control parameter
  int period;
  double xleft, xright;
  double gleft, gright;

  public void reset() {
    control.setValue("r", 0.8);             // control parameter r
    control.setValue("period", 2);          // period
    control.setValue("epsilon", 0.0000001); // desired precision
    control.setValue("xleft", 0.01);        // guess for xleft
    control.setValue("xright", 0.99);       // guess for xright
  }

  public void calculate() {
    double epsilon = control.getDouble("epsilon"); // desired precision
    r = control.getDouble("r");
    period = control.getInt("period");
    xleft = control.getDouble("xleft");
    xright = control.getDouble("xright");
    gleft = map(xleft, r, period)-xleft;
    gright = map(xright, r, period)-xright;
    if(gleft*gright<0) {
      while(Math.abs(xleft-xright)>epsilon) {
        bisection();
      }
      double x = 0.5*(xleft+xright);
      control.println("explicit search for period "+period+" behavior");
      control.println(0+"\t"+x); // result
      for(int i = 1; i<=2*period+1; i++) {
        x = map(x, r, 1);
        control.println(i+"\t"+x);
      }
    } else {
      control.println("range does not enclose a root");
    }
  }

  public void bisection() {
    // midpoint between xleft and xright
    double xmid = 0.5*(xleft+xright);
    double gmid = map(xmid, r, period)-xmid;
    if(gmid*gleft>0) {
      xleft = xmid; // change xleft
      gleft = gmid;
    } else {
      xright = xmid; // change xright
      gright = gmid;
    }
  }

  double map(double x, double r, double period) {
    if(period>1) {
      double y = map(x, r, period-1);
      return 4*r*y*(1-y);
    } else {
      return 4*r*x*(1-x);
    }
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new RecursiveFixedPointApp());
  }
}

Problem 6.13. Unstable periodic trajectories for the logistic map

(a) Test RecursiveFixedPointApp for values of r for which the logistic map has a stable period with p = 1 and p = 2. Set the desired precision equal to 10^{−7}. Initially use x_left = 0.01 and x_right = 0.99. Calculate the stable attractor analytically and compare the results of your program with the analytic results.

(b) Set r = 0.95 and find the periodic trajectories for p = 1, 2, 5, 6, 7, 12, 13, and 19.

(c) Modify RecursiveFixedPointApp so that n_b, the number of bisections needed to obtain the unstable trajectory, is listed. Choose three of the cases considered in part (b) and compute n_b for the precision ε = 0.01, 0.001, 0.0001, and 0.00001. Determine the functional dependence of n_b on ε.

Now that we know how to find the values of the unstable periodic trajectories, we discuss an algorithm for stabilizing this period. Suppose that we wish to stabilize the unstable trajectory of period p for a choice of r = r_0. The idea is to make small adjustments of r = r_0 + ∆r at each iteration so that the difference between the actual trajectory and the target periodic trajectory is small. If the actual trajectory is x_n and we wish the trajectory to be at x(i), we make the next iterate x_{n+1} equal to x(i + 1) by expanding the difference x_{n+1} − x(i + 1) in a Taylor series and setting the difference to zero to first order. We have x_{n+1} − x(i + 1) = f(x_n, r) − f(x(i), r_0). If we expand f(x_n, r) about (x(i), r_0), we have to first order

x_{n+1} − x(i + 1) = [∂f(x,r)/∂x][x_n − x(i)] + [∂f(x,r)/∂r]∆r = 0.    (6.29)

The partial derivatives in (6.29) are evaluated at x = x(i) and r = r_0. The result is

4r_0[1 − 2x(i)][x_n − x(i)] + 4x(i)[1 − x(i)]∆r = 0,    (6.30)

and the solution of (6.30) for ∆r can be written as

∆r = −r_0 [1 − 2x(i)][x_n − x(i)] / (x(i)[1 − x(i)]).    (6.31)

The procedure is to iterate the logistic map at r = r_0 until x_n is sufficiently close to an x(i). The nature of chaotic systems is that the trajectory is guaranteed to eventually come close to the desired unstable trajectory. Then we use (6.31) to change the value of r so that the next iteration is closer to x(i + 1). We summarize the algorithm for controlling chaos as follows (a minimal sketch for the simplest case appears after the list):

1. Find the unstable periodic trajectory x(1), x(2), ..., x(p) for the desired value of r_0.

2. Iterate the map with r = r_0 until x_n is within ε of x(i). Then use (6.31) to determine r.

3. Turn off the control by setting r = r_0.
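The following minimal sketch implements this loop for the simplest case, p = 1 (plain Java). For the logistic map the period 1 fixed point x* = 1 − 1/(4r_0) can be written down directly, so no bisection search is needed; the values r_0 = 0.95 and ε = 0.02 are taken from Problem 6.14, and the seed is arbitrary.

public class ControlChaosSketch {
  public static void main(String[] args) {
    double r0 = 0.95;            // control parameter in the chaotic regime
    double xstar = 1-1.0/(4*r0); // unstable period 1 fixed point of 4rx(1-x)
    double eps = 0.02;
    double x = 0.3;              // arbitrary seed
    boolean controlOn = true;
    for(int n = 0; n<200; n++) {
      double r = r0; // default: no perturbation
      if(controlOn && Math.abs(x-xstar)<eps) {
        // eq. (6.31) with x(i) = x*
        r = r0 - r0*(1-2*xstar)*(x-xstar)/(xstar*(1-xstar));
      }
      x = 4*r*x*(1-x);
      System.out.println(n+"\t"+x);
    }
  }
}

Once the trajectory wanders within ε of x*, the small adjustments of r hold it there; setting controlOn to false lets the trajectory escape and become chaotic again.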
Problem 6.14. Controlling chaos

(a) Write a program that allows the user to turn the control on and off. The trajectory can be seen by plotting x_n versus n. The program should incorporate as input the desired unstable periodic trajectory x(i), the period p, the value of r_0, and the parameter ε.

(b) Test your program with r_0 = 0.95 and the periods p = 1, 5, and 13. Use ε = 0.02.

(c) Modify your program so that the values of r as well as the values of x_n are shown. How does r change if we vary ε? Try ε = 0.05, 0.01, and 0.005.

(d) Add a method to compute n_ε, the number of iterations necessary for the trajectory x_n to be within ε of x(1) when the control is on. Find ⟨n_ε⟩, the average value of n_ε, by starting with 100 random values of x_0. Compute ⟨n_ε⟩ as a function of ε for ε = 0.05, 0.005, 0.0005, and 0.00005. What is the functional dependence of ⟨n_ε⟩ on ε?

6.7 Higher-Dimensional Models

So far we have discussed the logistic map as a mathematical model that has some remarkable properties and produces some interesting computer graphics. In this section we discuss some two- and three-dimensional systems that also might seem to have little to do with realistic physical systems. However, as we will see in Sections 6.8 and 6.9, similar behavior is found in realistic physical systems under the appropriate conditions. We begin with a two-dimensional map and consider the sequence of points (x_n, y_n) generated by

x_{n+1} = y_n + 1 − a x_n²    (6.32a)
y_{n+1} = b x_n.    (6.32b)

The map (6.32) was proposed by Hénon, who was motivated by the relevance of this dynamical system to the behavior of asteroids and satellites.

Problem 6.15. The Hénon map

(a) Write a program to iterate (6.32) for a = 1.4 and b = 0.3 and plot 10^4 iterations starting from x_0 = 0, y_0 = 0. Make sure you compute the new value of y using the old value of x and not the new value of x (see the sketch after this problem). Do not plot the initial transient. Look at the trajectory in the region defined by |x| ≤ 1.5 and |y| ≤ 0.45. Make a similar plot beginning from the second initial condition, x_0 = 0.63135448, y_0 = 0.18940634. Compare the shape of the two plots. Is the shape of the two curves independent of the initial conditions?

(b) Increase the scale of your plot so that all points in the region 0.50 ≤ x ≤ 0.75 and 0.15 ≤ y ≤ 0.21 are shown. Begin from the second initial condition and increase the number of computed points to 10^5. Then make another plot showing all points in the region 0.62 ≤ x ≤ 0.64 and 0.185 ≤ y ≤ 0.191. If time permits, make an additional enlargement and plot all points within the box defined by 0.6305 ≤ x ≤ 0.6325 and 0.1889 ≤ y ≤ 0.1895. You will have to increase the number of computed points to order 10^6. What is the structure of the curves within each box? Does the attractor appear to have a similar structure on smaller and smaller length scales? The region of points from which the points cannot escape is the basin of the Hénon attractor. The attractor is the set of points to which all points in the basin are attracted. That is, two trajectories that begin from different conditions will eventually lie on the attractor.

(c) Determine if the system is chaotic; that is, sensitive to initial conditions. Start two points very close to each other and watch their trajectories for a fixed time. Choose different colors for the two trajectories.

(d)∗ It is straightforward in principle to extend the method for computing the Lyapunov exponent that we used for a one-dimensional map to higher-dimensional maps. The idea is to linearize the difference (or differential) equations and replace dx_n by the corresponding vector quantity dr_n. This generalization yields the Lyapunov exponent corresponding to the divergence along the fastest growing direction. If a system has f degrees of freedom, it has a set of f Lyapunov exponents. A method for computing all f exponents is discussed in Project 6.24.
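A minimal sketch of the iteration loop for Problem 6.15(a) (plain Java; console output stands in for the plot). Note the temporary variables that prevent the new value of x from contaminating the update of y:

public class HenonSketch {
  public static void main(String[] args) {
    double a = 1.4, b = 0.3;
    double x = 0, y = 0;
    for(int n = 0; n<10000; n++) {
      double xNew = y+1-a*x*x; // eq. (6.32a)
      double yNew = b*x;       // eq. (6.32b), uses the old value of x
      x = xNew;
      y = yNew;
      if(n>100) { // skip the initial transient
        System.out.println(x+"\t"+y);
      }
    }
  }
}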
One of the earliest indications of chaotic behavior was in an atmospheric model developed by Lorenz. His goal was to describe the motion of a fluid layer that is heated from below. The result is convective rolls, where the warm fluid at the bottom rises, cools off at the top, and then falls back down. Lorenz simplified the description by restricting the motion to two spatial dimensions. This situation has been realized experimentally and is known as a Rayleigh–Bénard cell. The equations that Lorenz obtained are

dx/dt = −σx + σy    (6.33a)
dy/dt = −xz + rx − y    (6.33b)
dz/dt = xy − bz,    (6.33c)

where x is a measure of the fluid flow velocity circulating around the cell, y is a measure of the temperature difference between the rising and falling fluid regions, and z is a measure of the difference in the temperature profile between the bottom and the top from the normal equilibrium temperature profile. The dimensionless parameters σ, r, and b are determined by various fluid properties, the size of the Rayleigh–Bénard cell, and the temperature difference in the cell. Note that the variables x, y, and z have nothing to do with the spatial coordinates, but are measures of the state of the system. Although it is not expected that you will understand the relation of the Lorenz equations to convection, we have included these equations here to reinforce the idea that simple sets of equations can exhibit chaotic behavior.
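Before turning to the Open Source Physics implementation, here is a self-contained sketch that integrates (6.33) with the classical fourth-order Runge–Kutta algorithm (plain Java; the parameters, initial condition, and ∆t = 0.0025 follow the caption of Figure 6.10 below):

public class LorenzSketch {
  static final double SIGMA = 10, B = 8.0/3.0, R = 28;

  // right-hand side of the Lorenz equations (6.33)
  static double[] rate(double[] s) {
    return new double[] {
      SIGMA*(s[1]-s[0]),      // dx/dt
      -s[0]*s[2]+R*s[0]-s[1], // dy/dt
      s[0]*s[1]-B*s[2]        // dz/dt
    };
  }

  // returns the state advanced by h along the slope k
  static double[] step(double[] s, double[] k, double h) {
    double[] out = new double[3];
    for(int i = 0; i<3; i++) {
      out[i] = s[i]+h*k[i];
    }
    return out;
  }

  public static void main(String[] args) {
    double[] s = {1, 1, 20}; // initial condition x0 = 1, y0 = 1, z0 = 20
    double dt = 0.0025;
    for(int n = 0; n<8000; n++) { // total time t = 20
      double[] k1 = rate(s);
      double[] k2 = rate(step(s, k1, dt/2));
      double[] k3 = rate(step(s, k2, dt/2));
      double[] k4 = rate(step(s, k3, dt));
      for(int i = 0; i<3; i++) {
        s[i] += dt*(k1[i]+2*k2[i]+2*k3[i]+k4[i])/6;
      }
      if(n%4 == 0) { // print at intervals of 0.01
        System.out.println(s[0]+"\t"+s[1]+"\t"+s[2]);
      }
    }
  }
}

Plotting any pair of the output columns against each other reproduces the projections shown in Figure 6.10.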
LorenzApp displays the solution to (6.33) using the Open Source Physics 3D drawing framework and is available in the ch06 package. To make three-dimensional plots, we use the Display3DFrame class; the only argument of its constructor is the title for the plot. The following code fragment sets up the plot.

Display3DFrame frame = new Display3DFrame("Lorenz attractor");
Lorenz lorenz = new Lorenz();
frame.setPreferredMinMax(-15.0, 15.0, -15.0, 15.0, 0.0, 50.0);
frame.setDecorationType(VisualizationHints.DECORATION_AXES);
frame.addElement(lorenz); // lorenz is a 3D element

Figure 6.10: A trajectory of the Lorenz model with σ = 10, b = 8/3, and r = 28 and the initial condition x_0 = 1, y_0 = 1, z_0 = 20. A time interval of t = 20 is shown with points plotted at intervals of 0.01. The fourth-order Runge–Kutta algorithm was used with ∆t = 0.0025. (The six panels show x, y, and z versus t and the projections onto the xy, xz, and yz planes.)

Housekeeping methods such as reset and initialize are similar to methods in other simulations and are not shown. The class Lorenz draws the attractor in the three-dimensional (x,y,z) space defined by (6.33). The state of the system is shown as a red ball in this 3D space, and the state’s trajectory is shown as a trail. An easy way to show the time evolution is to extend the 3D Group class and create the ball and the trail inside the group. When points are added to the group, the trail is extended and the position of the ball is set. The Lorenz class imports org.opensourcephysics.display3d.simple3d.*. The ball and trail are then instantiated and added to the group (in the constructor) as follows:

public class Lorenz extends Group implements ODE {
  ElementEllipsoid ball = new ElementEllipsoid();
  ElementTrail trail = new ElementTrail();

  public Lorenz() {
    addElement(trail); // adds trail to Lorenz group
    addElement(ball);  // adds ball to Lorenz group
    ...
  }
  ...
}

The properties of the ball and trail objects are set by

ball.setSizeXYZ(1, 1, 1); // sets size of ball in world coordinates
ball.getStyle().setFillColor(java.awt.Color.RED);

To plot each part of the trajectory through state space, we use the method trail.addPoint(x,y,z) to add to the trail and ball.setXYZ(x,y,z) to show the current state. The user can project onto two dimensions using the frame’s menu or rotate the three-dimensional plot using the mouse because these capabilities are built into the frame. The getRate and getState methods model (6.33) by implementing the ODE interface.

Problem 6.16. The Lorenz model

(a) Use a Runge–Kutta algorithm such as RK4 or RK45 (see Appendix 3A) to obtain a numerical solution of the Lorenz equations (6.33). Generate three-dimensional plots using Display3DFrame. Explore the basin of the attractor with σ = 10, b = 8/3, and r = 28.

(b) Determine qualitatively the sensitivity to initial conditions. Start two points very close to each other and watch their trajectories for approximately 10^4 time steps.

(c) Let z_m denote the value of z where z is a relative maximum for the mth time. You can determine the value of z_m by finding the average of the two values of z when the right-hand side of (6.33c) changes sign. Plot z_{m+1} versus z_m and describe what you find. This procedure is one way that a continuous system can be mapped onto a discrete map. What is the slope of the z_{m+1} versus z_m curve? Is its magnitude always greater than unity? If so, then this behavior is an indication of chaos. Why?

The application of the Lorenz equations to weather prediction has led to a popular metaphor known as the butterfly effect. This metaphor is made even more meaningful by inspection of Figure 6.10. The “butterfly effect” is often ascribed to Lorenz (see Hilborn). In a 1963 paper he remarked: “One meteorologist remarked that if the theory were correct, one flap of a seagull’s wings would be enough to alter the course of the weather forever.” By 1972, the seagull had evolved into the more poetic butterfly, and the title of his talk was “Predictability: Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?”
6.8 Forced Damped Pendulum

We now consider the dynamics of nonlinear systems described by classical mechanics. The general problem in classical mechanics is the determination of the positions and velocities of a system of particles subjected to certain forces. For example, we considered in Chapter 5 the celestial two-body problem and were able to predict the motion at any time. We will find that we cannot make long-time predictions for the trajectories of nonlinear classical systems when these systems exhibit chaos.

A familiar example of a nonlinear mechanical system is the simple pendulum (see Chapter 3). To make its dynamics more interesting, we assume that there is a linear damping term present and that the pivot is forced to move vertically up and down. Newton’s second law for this system is (cf. McLaughlin or Percival and Richards)

d²θ/dt² = −γ dθ/dt − [ω_0² + 2A cos ωt] sin θ,    (6.34)

where θ is the angle the pendulum makes with the vertical axis, γ is the damping coefficient, ω_0² = g/L is the square of the natural frequency of the pendulum, and ω and A are the frequency and amplitude of the external force. The effect of the vertical acceleration of the pivot is equivalent to a time-dependent gravitational field, because we can write the total vertical force due to gravity, −mg, plus the pivot motion f(t) as −mg(t), where g(t) ≡ g − f(t)/m.

How do we expect the driven, damped simple pendulum to behave? Because there is damping present, we expect that if there is no external force, the pendulum would come to rest. That is, (θ = 0, dθ/dt = 0) is a stable attractor. As A is increased from zero, this attractor remains stable for sufficiently small A. At a value of A equal to A_c, this attractor becomes unstable. How does the driven nonlinear oscillator behave as we increase A?

It is difficult to determine whether the pendulum has some kind of underlying periodic behavior by plotting only its position or even plotting its trajectory in phase space. We expect that if it does, the period will be related to the period of the external time-dependent force. Thus, we analyze the motion by plotting a point in phase space after every cycle of the external force. Such a phase space plot is called a Poincaré map. Hence, we will plot dθ/dt versus θ for values of t equal to nT for n = 1, 2, 3, ... . If the system has a period T, then the Poincaré map consists of a single point. If the period of the system is nT, there will be n points.

PoincareApp uses the fourth-order Runge–Kutta algorithm to compute θ(t) and the angular velocity dθ(t)/dt for the pendulum described by (6.34). This equation is modeled in the DampedDrivenPendulum class, but is not shown here because it is similar to other ODE implementations. A phase diagram for dθ(t)/dt versus θ(t) is shown in one frame. In the other frame, the Poincaré map is represented by drawing a small box at the point (θ, dθ/dt) at time t = nT. If the system has period 1, that is, if the same values of (θ, dθ/dt) are drawn at t = nT, we would see only one box; otherwise, we would see several boxes. Because the first few values of (θ, dθ/dt) show the transient behavior, it is desirable to clear the display and draw a new Poincaré map without changing A, θ, or dθ/dt.

Listing 6.5: PoincareApp plots a phase diagram and a Poincaré map for the damped driven pendulum.

package org.opensourcephysics.sip.ch06;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.PlotFrame;
import org.opensourcephysics.numerics.RK4;

public class PoincareApp extends AbstractSimulation {
  final static double PI = Math.PI; // defined for brevity
  PlotFrame phaseSpace = new PlotFrame("theta", "angular velocity", "Phase space plot");
  PlotFrame poincare = new PlotFrame("theta", "angular velocity", "Poincare plot");
  int nstep = 100; // # iterations between Poincare plot
  DampedDrivenPendulum pendulum = new DampedDrivenPendulum();
  RK4 odeMethod = new RK4(pendulum);

  public PoincareApp() {
    // angular frequency of external force equals two and hence
    // period of external force equals pi
    odeMethod.setStepSize(PI/nstep); // dt = PI/nstep
    phaseSpace.setMarkerShape(0, 6); // second argument indicates a pixel
    // smaller size gives better resolution
    poincare.setMarkerSize(0, 2);
    poincare.setMarkerColor(0, java.awt.Color.RED);
    phaseSpace.setMessage("t = "+0);
  }

  public void reset() {
    control.setValue("theta", 0.2);
    control.setValue("angular velocity", 0.6);
    control.setValue("gamma", 0.2); // damping constant
    control.setValue("A", 0.85);    // amplitude
  }

  public void doStep() {
    double state[] = pendulum.getState();
    for(int istep = 0; istep<nstep; istep++) {
      odeMethod.step();
      if(state[0]>PI) {
        state[0] = state[0]-2.0*PI;
      } else if(state[0]<-PI) {
        state[0] = state[0]+2.0*PI;
      }
      phaseSpace.append(0, state[0], state[1]);
    }
    poincare.append(0, state[0], state[1]);
    phaseSpace.setMessage("t = "+decimalFormat.format(state[2]));
    poincare.setMessage("t = "+decimalFormat.format(state[2]));
    if(phaseSpace.isShowing()) {
      phaseSpace.render();
    }
    if(poincare.isShowing()) {
      poincare.render();
    }
  }

  public void initialize() {
    double theta = control.getDouble("theta"); // initial angle
    // initial angular velocity
    double omega = control.getDouble("angular velocity");
    pendulum.gamma = control.getDouble("gamma"); // damping constant
    // amplitude of external force
    pendulum.A = control.getDouble("A");
    pendulum.initializeState(new double[] {theta, omega, 0});
    clear();
  }

  public void clear() {
    phaseSpace.clearData();
    poincare.clearData();
    phaseSpace.render();
    poincare.render();
  }

  public static void main(String[] args) {
    SimulationControl control = SimulationControl.createApp(new PoincareApp());
    control.addButton("clear", "Clear");
  }
}

Problem 6.17. Dynamics of a driven, damped simple pendulum

(a) Use PoincareApp to simulate the driven, damped simple pendulum. In the program, ω = 2 so that the period T of the external force equals π. The program also assumes that ω_0 = 1. Use γ = 0.2 and A = 0.85 and compute the phase space trajectory. After the transient, how many points do you see in the Poincaré plot? What is the period of the pendulum? Vary the initial values of θ and dθ/dt. Is the attractor independent of the initial conditions? Remember to ignore the transient behavior.

(b) Modify PoincareApp so that it plots θ and dθ/dt as a function of t. Describe the qualitative relation between the Poincaré plot, the phase space plot, and the t dependence of θ and dθ/dt.

(c) The amplitude A plays the role of the control parameter for the dynamics of the system. Use the behavior of the Poincaré plot to find the value A = A_c at which the (0,0) attractor becomes unstable. Start with A = 0.1 and continue increasing A until the (0,0) attractor becomes unstable.

(d) Find the period for A = 0.1, 0.25, 0.5, 0.7, 0.75, 0.85, 0.95, 1.00, 1.02, 1.031, 1.033, 1.036, and 1.05. Note that for small A, the period of the oscillator is twice that of the external force. The steady state period is 2π for A_c < A < 0.71, π for 0.72 < A < 0.79, and then 2π again.
(e) The first period doubling occurs for A ≈ 0.79. Find the approximate values of A for further period doubling and use these values of A to compute the exponent δ defined by (6.10). Compare your result for δ with the result found for the one-dimensional logistic map. Are your results consistent with those that you found for the logistic map? An analysis of this system can be found in the article by McLaughlin.

(f) Sometimes a trajectory does not approach a steady state even after a very long time, but a slight perturbation causes the trajectory to move quickly onto a steady state attractor. Consider A = 0.62 and the initial condition (θ = 0.3, dθ/dt = 0.3). Describe the behavior of the trajectory in phase space. During the simulation, change θ by 0.1. Does the trajectory move onto a steady state trajectory? Do similar simulations for other values of A and other initial conditions.

(g) Repeat the calculations of parts (b)–(d) for γ = 0.05. What can you conclude about the effect of damping?

(h) Replace the fourth-order Runge–Kutta algorithm by the lower-order Euler–Richardson algorithm. Which algorithm gives the better trade-off between accuracy and speed?

Problem 6.18. The basin of an attractor

(a) For γ = 0.2 and A > 0.79, the pendulum rotates clockwise or counterclockwise in the steady state. Each of these two rotations is an attractor. The set of initial conditions that lead to a particular attractor is called the basin of the attractor. Modify PoincareApp so that the program draws the basin of the attractor with dθ/dt > 0. For example, your program might simulate the motion for about 20 periods and then determine the sign of dθ/dt. If dθ/dt > 0 in the steady state, then the program plots a point in phase space at the coordinates of the initial condition. The program repeats this process for many initial conditions. Describe the basin of attraction for A = 0.85 and increments of the initial values of θ and dθ/dt equal to π/10.

(b) Repeat part (a) using increments of the initial values of θ and dθ/dt equal to π/20 or as small as possible given your computer resources. Does the boundary of the basin of attraction appear smooth or rough? Is the basin of the attractor a single region or is it disconnected into more than one piece?

(c) Repeat parts (a) and (b) for other values of A, including values near the onset of chaos and in the chaotic regime. Is there a qualitative difference between the basins of periodic and chaotic attractors? For example, can you always distinguish the boundaries of the basin?

6.9 *Hamiltonian Chaos

Hamiltonian systems are a very important class of dynamical systems. The most familiar are mechanical systems without friction, and the most important of these is the solar system. The linear harmonic oscillator and the simple pendulum that we considered in Chapter 3 are two simple examples. Many other systems can be included in the Hamiltonian framework, for example, the motion of charged particles in electric and magnetic fields and ray optics. The Hamiltonian dynamics of charged particles is particularly relevant to confinement issues in particle accelerators, storage rings, and plasmas. In each case a function of all the coordinates and momenta called the Hamiltonian is formed. For many systems this function can be identified with the total energy. The Hamiltonian for a particle in a potential V(x,y,z) is

H = (1/2m)(p_x² + p_y² + p_z²) + V(x,y,z).    (6.35)
Typically we write (6.35) using the notation

H = Σ_i p_i²/2m + V({q_i}),    (6.36)

where p_1 ≡ p_x, q_1 ≡ x, etc. This notation emphasizes that the p_i and the q_i are generalized coordinates. For example, in some systems p can represent the angular momentum and q can represent an angle. For a system of N particles in three dimensions, the sum in (6.36) runs from 1 to 3N, where 3N is the number of degrees of freedom.

The methods for constructing the generalized momenta and the Hamiltonian are described in standard classical mechanics texts. The time dependence of the generalized momenta and coordinates is given by

ṗ_i ≡ dp_i/dt = −∂H/∂q_i    (6.37a)
q̇_i ≡ dq_i/dt = ∂H/∂p_i.    (6.37b)

Check that (6.37) leads to the usual form of Newton’s second law by considering the simple example of a single particle in a potential U(x), where q = x and p = mẋ.

As we found in Chapter 4, an important property of conservative systems is preservation of areas in phase space. Consider a set of initial conditions of a dynamical system that form a closed surface in phase space. For example, if phase space is two-dimensional, this surface would be a one-dimensional loop. As time evolves, this surface in phase space will typically change its shape. For Hamiltonian systems, the volume (area for a two-dimensional phase space) enclosed by this surface remains constant in time. For dissipative systems, this volume will decrease, and hence dissipative systems are not described by a Hamiltonian. One consequence of the constant phase space volume is that Hamiltonian systems do not have phase space attractors.

In general, the motion of Hamiltonian systems is very complex. In some systems the motion is regular, and there is a constant of the motion (a quantity that does not change with time) for each degree of freedom. Such a system is said to be integrable. For time-independent systems, an obvious constant of the motion is the total energy. The total momentum and angular momentum are other examples. There may be others as well. If there are more degrees of freedom than constants of the motion, then the system can be chaotic. When the number of degrees of freedom becomes large, the possibility of chaotic behavior becomes more likely. An important example that we will consider in Chapter 8 is a system of interacting particles. Their chaotic motion is essential for the system to be described by the methods of statistical mechanics.

For regular motion the change in shape of a closed surface in phase space is uninteresting. For chaotic motion, nearby trajectories must exponentially diverge from each other, but are confined to a finite region of phase space. Hence, there will be local stretching of the surface accompanied by repeated folding to ensure confinement. There is another class of systems whose behavior is in between; that is, the system behaves regularly for some initial conditions and chaotically for others. We will study these mixed systems in this section.

Consider the Hamiltonian for a system of N particles. If the system is integrable, there are 3N constants of the motion. It is natural to identify the generalized momenta with these constants. The coordinates that are associated with each of these constants will vary linearly with time. If the system is confined in phase space, then the coordinates must be periodic. If we have just one coordinate, we can think of the motion as a point moving on a circle in phase space.
In two dimensions the motion is a point moving in two circles at once; that is, a point moving on the surface of a torus. In three dimensions we can imagine a generalized torus with three circles, and so on. If the period of motion along each circle is a rational fraction of the period of all the other circles, then the torus is called a resonant torus, and the motion in phase space is periodic. If the periods are not rational fractions of each other, then the torus is called nonresonant.

If we take an integrable Hamiltonian and change it slightly, what happens to these tori? A partial answer is given by a theorem due to Kolmogorov, Arnold, and Moser (KAM), which states that, under certain circumstances, the tori will remain. When the perturbation of the Hamiltonian becomes sufficiently large, these KAM tori are destroyed.

Figure 6.11: Model of a kicked rotor consisting of a rigid rod with moment of inertia I, subjected to a periodic impulse. Gravity and friction at the pivot are ignored. The motion of the rotor is given by the standard map in (6.41).

To understand the basic ideas associated with mixed systems, we consider a simple model of a rotor known as the standard map (see Figure 6.11). The rod has a moment of inertia I and length L and is fastened at one end to a frictionless pivot. The other end is subjected to a vertical periodic impulsive force of strength k/L applied at times t = 0, τ, 2τ, ... . Gravity is ignored. The motion of the rotor can be described by the angle θ and the corresponding angular momentum p_θ. The Hamiltonian for this system can be written as

H(θ, p_θ, t) = p_θ²/2I + k cos θ Σ_n δ(t − nτ).    (6.38)

The term δ(t − nτ) is zero everywhere except at t = nτ; its integral over time is unity if t = nτ is within the limits of integration. If we use (6.37) and (6.38), it is easy to show that the corresponding equations of motion are given by

dp_θ/dt = k sin θ Σ_n δ(t − nτ)    (6.39a)
dθ/dt = p_θ/I.    (6.39b)

From (6.39) we see that p_θ is constant between kicks (remember that gravity is assumed to be absent), but changes discontinuously at each kick. The angle θ varies linearly with t between kicks and is continuous at each kick. It is convenient to know the values of θ and p_θ at times just after the kick. We let θ_n and p_n be the values of θ(t) and p_θ(t) at times t = nτ + 0⁺, where 0⁺ is an infinitesimally small positive number. If we integrate (6.39a) from t = (n + 1)τ − 0⁺ to t = (n + 1)τ + 0⁺, we obtain

p_{n+1} − p_n = k sin θ_{n+1}.    (6.40a)

(Remember that p is constant between kicks and the delta function contributes to the integral only when t = (n + 1)τ.) From (6.39b) we have

θ_{n+1} − θ_n = (τ/I) p_n.    (6.40b)

If we choose units such that τ/I = 1, we obtain the standard map

θ_{n+1} = (θ_n + p_n) modulo 2π    (6.41a)
p_{n+1} = p_n + k sin θ_{n+1}.  (standard map)    (6.41b)

We have added the requirement in (6.41a) that the value of the angle θ is restricted to be between zero and 2π.

Before we iterate (6.41), let us check that (6.41) represents a Hamiltonian system; that is, the area in q-p space is constant as n increases. (Here q corresponds to θ.) Suppose we start with a rectangle of points of length dq_n and dp_n. After one iteration, this rectangle will be deformed into a parallelogram of sides dq_{n+1} and dp_{n+1}. From (6.41) we have

dq_{n+1} = dq_n + dp_n    (6.42a)
dp_{n+1} = dp_n + k cos q_{n+1} dq_{n+1}.    (6.42b)

If we substitute (6.42a) in (6.42b), we obtain

dp_{n+1} = (1 + k cos q_{n+1}) dp_n + k cos q_{n+1} dq_n.    (6.43)
To find the area of the parallelogram, we take the magnitude of the cross product of the two vectors that form its sides, which from (6.42) and (6.43) are (dq_n, k cos q_{n+1} dq_n) and (dp_n, (1 + k cos q_{n+1}) dp_n). The result is dq_n dp_n, and hence the area in phase space has not changed. The standard map is an example of an area-preserving map.

The qualitative properties of the standard map are explored in Problem 6.19. You will find that for k = 0, the rod rotates with a fixed angular velocity determined by the momentum p_n = p_0 = constant. If p_0 is a rational number times 2π, then the trajectory in phase space consists of a sequence of isolated points lying on a horizontal line (resonant tori). Can you see why? If p_0 is not a rational number times 2π or if your computer does not have sufficient precision, then after a long time, the trajectory will consist of a horizontal line in phase space. As we increase k, these horizontal lines are deformed into curves that run from q = 0 to q = 2π, and the isolated points of the resonant tori are converted into closed loops. For some initial conditions, the trajectories will become chaotic after the transient behavior has ended.

Problem 6.19. The standard map

(a) Write a program to iterate the standard map and plot its trajectory in phase space (a minimal iteration loop is sketched after this problem). Use different colors so that several trajectories can be shown at the same time for the same value of the parameter k. Choose a set of initial conditions that form a rectangle (see Problem 4.10). Does the shape of this area change with time? What happens to the total area?

(b) Begin with k = 0 and choose an initial value of p that is a rational number times 2π. What types of trajectories do you obtain? If you obtain trajectories consisting of isolated points, do these points appear to shift due to numerical roundoff errors? How can you tell? What happens if p_0 is an irrational number times 2π? Remember that a computer can only approximate an irrational number.

(c) Consider k = 0.2 and explore the nature of the phase space trajectories. What structures appear that do not appear for k = 0? Discuss the motion of the rod corresponding to some of the typical trajectories that you find.

(d) Increase k until you first find several chaotic trajectories. How can you tell that they are chaotic? Do these chaotic trajectories fill all of phase space? If there is one trajectory that is chaotic at a particular value of k, are all trajectories chaotic? What is the approximate value for k_c above which chaotic trajectories appear?
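A minimal sketch of the iteration loop for Problem 6.19(a) (plain Java; console output in place of the phase space plot, with k = 0.8 and a few equally spaced initial momenta as arbitrary illustrative choices):

public class StandardMapSketch {
  public static void main(String[] args) {
    double k = 0.8;
    double twoPi = 2*Math.PI;
    for(int traj = 0; traj<8; traj++) { // several trajectories
      double theta = 1.0;
      double p = twoPi*traj/8.0; // equally spaced initial momenta
      for(int n = 0; n<500; n++) {
        theta = (theta+p)%twoPi; // eq. (6.41a)
        if(theta<0) {
          theta += twoPi; // keep theta in [0, 2 pi)
        }
        p = p+k*Math.sin(theta); // eq. (6.41b)
        System.out.println(traj+"\t"+theta+"\t"+p);
      }
    }
  }
}

Plotting p versus theta, with one color per value of traj, shows the mixture of smooth curves, loops, and chaotic regions described above.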
We now discuss a discrete map that models the rings of Saturn (see Fröyland). The model assumes that the rings of Saturn are due to perturbations produced by Mimas. There are two important forces acting on objects near Saturn. The force due to Saturn can be incorporated as follows. We know that each time Mimas completes an orbit, it traverses a total angle of 2π. Hence, the angle θ of any other moon of Saturn relative to Mimas can be expressed as

θ_{n+1} = θ_n + 2π (σ/r_n)^{3/2}   (6.44)

where r_n is the radius of the orbit after n revolutions, and σ = 185.7 × 10³ km is the mean distance of Mimas from Saturn. The other important force is due to Mimas and causes the radial distance r_n to change. A discrete approximation to the radial acceleration dv_r/dt is (see (3.16))

Δv_r/Δt ≈ [r(t + Δt) − 2r(t) + r(t − Δt)]/(Δt)².   (6.45)

The acceleration equals the radial force due to Mimas. If we average over a complete period, then a reasonable approximation for the change in r_n due to Mimas is

r_{n+1} − 2r_n + r_{n−1} = f(r_n, θ_n)   (6.46)

where f(r_n, θ_n) is proportional to the radial force. (We have absorbed the factor of (Δt)² and the mass into f.) In general, the form of f(r_n, θ_n) is very complicated. We make a major simplifying assumption and take f to be proportional to −(r_n − σ)^{−2} and to be periodic in θ_n. This form for the force incorporates the fact that for large r_n, the force has the usual form for the gravitational force. For simplicity, we express this periodicity in the simplest possible way, that is, as cos θ_n. We also want the map to be area conserving. These considerations lead to the following two-dimensional map:

θ_{n+1} = θ_n + 2π (σ/r_n)^{3/2}   (6.47a)
r_{n+1} = 2r_n − r_{n−1} − a cos θ_n/(r_n − σ)².   (6.47b)

The constant a for Saturn's rings is approximately 2 × 10¹² km³. We can show, using a similar technique as before, that the area in (r, θ) space is preserved, and hence (6.47) is a Hamiltonian map. The purpose of the above discussion was only to motivate, and not to derive, the form of the map (6.47). In Problem 6.20 we investigate how the map (6.47) yields the qualitative structure of Saturn's rings. In particular, what happens to the values of r_n if the period of a moon is related to the period of Mimas by the ratio of two integers?

Problem 6.20. A simple model of the rings of Saturn

(a) Write a program to implement the map (6.47). Be sure to save the last two values of r so that the values of r_n are updated correctly (see the sketch after this problem). The radius of Saturn is 60.4 × 10³ km. Express all lengths in units of 10³ km. In these units a = 2000. Plot the points (r_n cos θ_n, r_n sin θ_n). Choose initial values for r between the radius of Saturn and σ, the distance of Mimas from Saturn, and find the bands of r_n values where stable trajectories are found.

(b) What is the effect of changing the value of a? Try a = 200 and a = 20,000 and compare your results with part (a).

(c) Vary the force function. Replace cos θ by other trigonometric functions. How do your results change? If the changes are small, does that give you some confidence that the model has something to do with Saturn's rings?
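A minimal sketch of the update loop for Problem 6.20(a) follows. The key points are that r_{n−1} must be saved (here in rLast) and that (6.47b) uses θ_n before θ is advanced by (6.47a). The initial orbit radius and the number of iterations are illustrative assumptions; some initial values lead to unstable trajectories, which is the point of the problem.

import org.opensourcephysics.frames.PlotFrame;

public class SaturnMapApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("x", "y", "Model of Saturn's rings");
    double sigma = 185.7, a = 2000;          // lengths in units of 10^3 km
    double r = 120, rLast = 120, theta = 0;  // sample initial orbit (an assumption)
    for(int n = 0; n<5000; n++) {
      // (6.47b) uses the old theta and the last two values of r
      double rNew = 2*r-rLast-a*Math.cos(theta)/((r-sigma)*(r-sigma));
      theta += 2*Math.PI*Math.pow(sigma/r, 1.5);   // (6.47a) uses the old r
      rLast = r;
      r = rNew;
      frame.append(0, r*Math.cos(theta), r*Math.sin(theta));
    }
    frame.setVisible(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}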
A more realistic dynamical system is the double pendulum, a system that can be demonstrated in the laboratory. This system consists of two equal point masses m, with one suspended from a fixed support by a rigid weightless rod of length L and the other suspended from the first by a similar rod (see Figure 6.12). Because there is no friction, this system is an example of a Hamiltonian system.

Figure 6.12: The double pendulum.

The four rectangular coordinates x₁, y₁, x₂, and y₂ of the two masses can be expressed in terms of the two generalized coordinates θ₁, θ₂:

x₁ = L sin θ₁   (6.48a)
y₁ = 2L − L cos θ₁   (6.48b)
x₂ = L sin θ₁ + L sin θ₂   (6.48c)
y₂ = 2L − L cos θ₁ − L cos θ₂.   (6.48d)

The kinetic energy is given by

K = ½m(ẋ₁² + ẋ₂² + ẏ₁² + ẏ₂²) = ½mL²[2θ̇₁² + θ̇₂² + 2θ̇₁θ̇₂ cos(θ₁ − θ₂)],   (6.49)

and the potential energy is given by

U = mgL(3 − 2 cos θ₁ − cos θ₂).   (6.50)

For convenience, U has been defined so that its minimum value is zero. To use Hamilton's equations of motion (6.37), we need to express the sum of the kinetic energy and potential energy in terms of the generalized momenta and coordinates. In rectangular coordinates the momenta are equal to p_i = ∂K/∂q̇_i, where, for example, q_i = x₁ and p_i is the x-component of mv₁. This relation works for generalized momenta as well, and the generalized momentum corresponding to θ₁ is given by p₁ = ∂K/∂θ̇₁. If we calculate the appropriate derivatives, we can show that the generalized momenta can be written as

p₁ = mL²[2θ̇₁ + θ̇₂ cos(θ₁ − θ₂)]   (6.51a)
p₂ = mL²[θ̇₂ + θ̇₁ cos(θ₁ − θ₂)].   (6.51b)

The Hamiltonian or total energy becomes

H = [p₁² + 2p₂² − 2p₁p₂ cos(q₁ − q₂)] / {2mL²[1 + sin²(q₁ − q₂)]} + mgL(3 − 2 cos q₁ − cos q₂)   (6.52)

where q₁ = θ₁ and q₂ = θ₂. The equations of motion can be found by using (6.52) and (6.37).

Figure 6.13: Poincaré plot for the double pendulum with p₁ plotted versus q₁ for q₂ = 0 and p₂ > 0. Two sets of initial conditions, (q₁, q₂, p₁) = (0, 0, 0) and (1.1, 0, 0), respectively, were used to create the plot. The initial value of p₂ is found from (6.52) by requiring that E = 15.

Figure 6.13 shows a Poincaré map for the double pendulum. The coordinate p₁ is plotted versus q₁ for the same total energy E = 15, but for two different initial conditions. The map includes the points in the trajectory for which q₂ = 0 and p₂ > 0. Note the resemblance between Figure 6.13 and plots for the standard map above the critical value of k; that is, there is a regular trajectory and a chaotic trajectory for the same parameters, but different initial conditions.

Problem 6.21. Double pendulum

(a) Use either the fourth-order Runge–Kutta algorithm (with Δt = 0.003) or the second-order Euler–Richardson algorithm (with Δt = 0.001) to simulate the double pendulum (a sketch of one implementation follows this problem). Choose m = 1, L = 1, and g = 9.8. The input parameter is the total energy E. The initial values of q₁ and q₂ can be chosen either randomly within the interval |q_i| < π or by the user. Then set the initial p₁ = 0 and solve for p₂ using (6.52) with H = E. First explore the pendulum's behavior by plotting the generalized coordinates and momenta as a function of time in four windows. Consider the energies E = 1, 5, 10, 15, and 40. Try a few initial conditions for each value of E. Visually determine whether the steady state behavior is regular or appears to be chaotic. Are there some values of E for which all the trajectories appear regular? Are there values of E for which all trajectories appear chaotic? Are there values of E for which both types of trajectories occur?

(b) Repeat part (a), but plot the phase space diagrams p₁ versus q₁ and p₂ versus q₂. Are these plots more useful for determining the nature of the trajectories than those drawn in part (a)?

(c) Draw the Poincaré plot with p₁ plotted versus q₁ only when q₂ = 0 and p₂ > 0. Overlay trajectories from different initial conditions but with the same total energy on the same plot. Duplicate the plot shown in Figure 6.13. Then produce Poincaré plots for the values of E given in part (a) with at least five different initial conditions for each energy. Describe the different types of behavior.

(d) Is there a critical value of the total energy at which some chaotic trajectories first occur?

(e) Animate the double pendulum, showing the two masses moving back and forth. Describe how the motion of the pendulum is related to the behavior of the Poincaré plot.
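Below is a sketch of one way to generate points for a plot like Figure 6.13, assuming the Open Source Physics ODE interface and RK4 solver introduced in Chapter 3. The rate equations are Hamilton's equations obtained by differentiating (6.52) with m = L = 1; the initial value of p₂ was worked out by hand so that E ≈ 15 for (q₁, q₂, p₁) = (1.1, 0, 0). The crossing test is a simple sign change in q₂ and does not interpolate onto the section.

import org.opensourcephysics.numerics.ODE;
import org.opensourcephysics.numerics.RK4;

public class DoublePendulumPoincareApp implements ODE {
  double g = 9.8;                           // m = L = 1
  double[] state = {1.1, 0, 0, 2.7746, 0};  // {q1, q2, p1, p2, t}; p2 gives E ~ 15

  public double[] getState() {
    return state;
  }

  public void getRate(double[] s, double[] rate) {
    double q1 = s[0], q2 = s[1], p1 = s[2], p2 = s[3];
    double cos = Math.cos(q1-q2), sin = Math.sin(q1-q2);
    double den = 1+sin*sin;
    double A = p1*p2*sin/den;
    double B = (p1*p1+2*p2*p2-2*p1*p2*cos)*sin*cos/(den*den);
    rate[0] = (p1-p2*cos)/den;        // dq1/dt = dH/dp1
    rate[1] = (2*p2-p1*cos)/den;      // dq2/dt = dH/dp2
    rate[2] = -2*g*Math.sin(q1)-A+B;  // dp1/dt = -dH/dq1
    rate[3] = -g*Math.sin(q2)+A-B;    // dp2/dt = -dH/dq2
    rate[4] = 1;                      // time
  }

  public static void main(String[] args) {
    DoublePendulumPoincareApp ode = new DoublePendulumPoincareApp();
    RK4 solver = new RK4(ode);
    solver.setStepSize(0.003);
    double q2Old = ode.state[1];
    for(int i = 0; i<1000000; i++) {
      solver.step();
      // record a Poincare point when q2 crosses zero with p2 > 0
      if(q2Old<0&&ode.state[1]>=0&&ode.state[3]>0) {
        System.out.println(ode.state[0]+" "+ode.state[2]);  // (q1, p1)
      }
      q2Old = ode.state[1];
    }
  }
}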
Hamiltonian chaos has important applications to physical systems such as the solar system, the motion of galaxies, and plasmas. It also has helped us understand the foundation of statistical mechanics. One of the most fascinating applications has been to quantum mechanics, which has its roots in the Hamiltonian formulation of classical mechanics. A current area of interest is the quantum analogue of classical Hamiltonian chaos. The meaning of this analogue is not obvious, because well-defined trajectories do not exist in quantum mechanics. Moreover, Schrödinger's equation is linear and can be shown to have only periodic and quasiperiodic solutions.

6.10 Perspective

As the many books and review articles on chaos can attest, it is impossible to discuss all aspects of chaos in a single chapter. We will revisit chaotic systems in Chapter 13, where we introduce the concept of fractals. We will find that one of the characteristics of chaotic dynamics is that the resulting attractors often have an intricate geometrical structure. The most general ideas that we have discussed in this chapter are that simple systems can exhibit complex behavior and that chaotic systems exhibit extreme sensitivity to initial conditions. We have also learned that computers allow us to explore the behavior of dynamical systems and visualize the numerical output. However, the simulation of a system does not automatically lead to understanding. If you are interested in learning more about the phenomena of chaos and the associated theory, the suggested readings at the end of the chapter are a good place to start. We also invite you to explore chaotic phenomena in more detail in the following projects.

6.11 Projects

The first several projects are on various aspects of the logistic map. These projects do not exhaust the possible investigations of the properties of the logistic map.

Project 6.22. A more accurate determination of δ and α

We have seen that it is difficult to determine δ accurately by finding the sequence of values of b_k at which the trajectory bifurcates for the kth time. A better way to determine δ is to compute it from the sequence s_m of superstable trajectories of period 2^{m−1}. We already have found that s₁ = 1/2, s₂ ≈ 0.80902, and s₃ ≈ 0.87464. The parameters s₁, s₂, ... can be computed directly from the equation

f^{(2^{m−1})}(x = 1/2) = 1/2.   (6.53)

For example, s₂ satisfies the relation f^{(2)}(x = 1/2) = 1/2. This relation, together with the analytic form for f^{(2)}(x) given in (6.7), yields

8r²(1 − r) − 1 = 0.   (6.54)

If we wish to solve (6.54) numerically for r = s₂, we need to be careful not to find the irrelevant solutions corresponding to a lower period. In this case we can factor out the solution r = 1/2 and solve the resulting quadratic equation analytically to find s₂ = (1 + √5)/4. Clearly r = s₁ = 1/2 solves (6.54) with period 1, because from (6.53), f^{(1)}(x = 1/2) = 4r(1/2)(1 − 1/2) = r, which equals 1/2 only for r = 1/2.

(a) It is straightforward to adapt the bisection method discussed in Section 6.6. Adapt the class RecursiveFixedPointApp to find the numerical solutions of (6.53); a minimal version of the idea is sketched after this project. Good starting values for the left-most and right-most values of r that bracket s_{m+1} are easy to obtain. The right-most value is r = r_∞ ≈ 0.8925. If we already know the sequence s₁, s₂, ..., s_m, then we can determine δ by

δ_m = (s_{m−1} − s_{m−2})/(s_m − s_{m−1}).   (6.55)

We use this determination of δ_m to find the left-most value of r:

r^{(m+1)}_left = s_m + (s_m − s_{m−1})/δ_m.   (6.56)

We choose the desired precision to be 10⁻¹⁶. A summary of our results is given in Table 6.2. Verify these results and determine δ.

(b) Use your values of s_m to obtain a more accurate determination of α and δ.

m   Period   s_m
1   1        0.500 000 000
2   2        0.809 016 994
3   4        0.874 640 425
4   8        0.888 660 216
5   16       0.891 666 845
6   32       0.892 310 883
7   64       0.892 448 823
8   128      0.892 478 091

Table 6.2: Values of the control parameter s_m for the superstable trajectories of period 2^{m−1}. Nine decimal places are shown.
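A minimal sketch of the idea behind part (a) follows: bisect on g(r) = f^{(2^{m−1})}(1/2) − 1/2. The bracket below was chosen by hand for m = 2 and would be replaced by the values from (6.55) and (6.56) for larger m; the class name is illustrative.

public class SuperstableApp {
  static double iterate(double r, int n) {  // returns f^(n)(1/2) for the logistic map
    double x = 0.5;
    for(int i = 0; i<n; i++) {
      x = 4*r*x*(1-x);
    }
    return x;
  }

  public static void main(String[] args) {
    int m = 2;                         // find s_2, the superstable value of period 2
    int n = 1<<(m-1);                  // number of iterations of f
    double left = 0.75, right = 0.85;  // bracket chosen by hand (an assumption)
    for(int i = 0; i<60; i++) {        // bisect on g(r) = f^(n)(1/2) - 1/2
      double mid = 0.5*(left+right);
      double gLeft = iterate(left, n)-0.5;
      double gMid = iterate(mid, n)-0.5;
      if(gLeft*gMid<=0) right = mid; else left = mid;
    }
    System.out.println("s_"+m+" ~ "+0.5*(left+right));  // (1+sqrt(5))/4 = 0.809017...
  }
}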
Project 6.23. From chaos to order

The bifurcation diagram of the logistic map (see Figure 6.2) has many interesting features that we have not explored. For example, you might have noticed that there are several smooth dark bands in the chaotic region for r > r_∞. Use BifurcateApp to generate the bifurcation diagram for r_∞ ≤ r ≤ 1. If we start at r = 1.0 and decrease r, we see that there is a band that narrows and eventually splits into two parts at r ≈ 0.9196. If you look closely, you will see that the band splits into four parts at r ≈ 0.899. If you look even more closely, you will see many more bands. What type of change occurs near the splitting (merging) of these bands? Use IterateMap to look at the time series of (6.5) for r = 0.9175. You will notice that although the trajectory looks random, it oscillates back and forth between two bands. This behavior can be seen more clearly if you look at the time series of x_{n+1} = f^{(2)}(x_n). A detailed discussion of the splitting of the bands can be found in Peitgen et al.

Project 6.24. Calculation of the Lyapunov spectrum

In Section 6.5 we discussed the calculation of the Lyapunov exponent for the logistic map. If a dynamical system has a multidimensional phase space, for example, the Hénon map and the Lorenz model, there is a set of Lyapunov exponents, called the Lyapunov spectrum, that characterize the divergence of the trajectory. As an example, consider a set of initial conditions that forms a filled sphere in phase space for the (three-dimensional) Lorenz model. If we iterate the Lorenz equations, then the set of phase space points will deform into another shape. If the system has a fixed point, this shape contracts to a single point. If the system is chaotic, then the sphere will typically diverge in one direction but become smaller in the other two directions. In this case we can define three Lyapunov exponents to measure the deformation in three mutually perpendicular directions. These three directions generally will not correspond to the axes of the original variables. Instead, we must use a Gram–Schmidt orthogonalization procedure. The algorithm for finding the Lyapunov spectrum is as follows (a sketch of the orthogonalization step follows this project):

(i) Linearize the dynamical equations. If r is the f-component vector containing the dynamical variables, then define Δr as the linearized difference vector. For example, the linearized Lorenz equations are

dΔx/dt = −σΔx + σΔy   (6.57a)
dΔy/dt = −xΔz − zΔx + rΔx − Δy   (6.57b)
dΔz/dt = xΔy + yΔx − bΔz.   (6.57c)

(ii) Define f orthonormal initial values for Δr, for example, Δr₁(0) = (1, 0, 0), Δr₂(0) = (0, 1, 0), and Δr₃(0) = (0, 0, 1). Because these vectors appear in a linearized equation, they do not have to be small in magnitude.

(iii) Iterate the original and linearized equations of motion. One iteration yields a new vector from the original equation of motion and f new vectors Δr_α from the linearized equations.

(iv) Find the orthonormal vectors Δr′_α from the Δr_α using the Gram–Schmidt procedure. That is,

Δr′₁ = Δr₁/|Δr₁|   (6.58a)
Δr′₂ = [Δr₂ − (Δr′₁ · Δr₂)Δr′₁] / |Δr₂ − (Δr′₁ · Δr₂)Δr′₁|   (6.58b)
Δr′₃ = [Δr₃ − (Δr′₁ · Δr₃)Δr′₁ − (Δr′₂ · Δr₃)Δr′₂] / |Δr₃ − (Δr′₁ · Δr₃)Δr′₁ − (Δr′₂ · Δr₃)Δr′₂|.   (6.58c)

It is straightforward to generalize the method to higher-dimensional models.

(v) Set the Δr_α(t) equal to the orthonormal vectors Δr′_α(t).

(vi) Accumulate the running sum S_α as S_α → S_α + log|Δr_α(t)|, where |Δr_α(t)| is the norm of the orthogonalized vector before it is normalized in step (v), that is, the denominator in (6.58).

(vii) Repeat steps (iii)–(vi) and periodically output the approximate Lyapunov exponents λ_α = (1/n)S_α, where n is the number of iterations. To obtain a result for the Lyapunov spectrum that represents the steady state attractor, include only data after the transient behavior has ended.

(a) Compute the Lyapunov spectrum for the Lorenz model for σ = 16, b = 4, and r = 45.92. Try other values of the parameters and compare your results.

(b) Linearize the equations for the Hénon map and find the Lyapunov spectrum for a = 1.4 and b = 0.3 in (6.32).
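Here is a sketch of steps (iv)–(vi) for f = 3, assuming the tangent vectors and the running sums are stored in plain arrays. The projections use the already normalized earlier vectors, as in (6.58), and the log of each norm is accumulated before the vector is rescaled to unit length.

public class GramSchmidtStep {
  // dr is a 3x3 array: dr[a] is tangent vector alpha; sum[a] accumulates S_alpha
  public static void orthogonalizeAndAccumulate(double[][] dr, double[] sum) {
    for(int a = 0; a<3; a++) {
      for(int b = 0; b<a; b++) {  // subtract projections on earlier (unit) vectors
        double dot = 0;
        for(int i = 0; i<3; i++) dot += dr[b][i]*dr[a][i];
        for(int i = 0; i<3; i++) dr[a][i] -= dot*dr[b][i];
      }
      double norm = 0;
      for(int i = 0; i<3; i++) norm += dr[a][i]*dr[a][i];
      norm = Math.sqrt(norm);
      sum[a] += Math.log(norm);   // step (vi): use the pre-normalization norm
      for(int i = 0; i<3; i++) dr[a][i] /= norm;  // step (v): reset to unit vectors
    }
  }

  public static void main(String[] args) {
    double[][] dr = {{1, 0, 0}, {0, 2, 0}, {1, 1, 1}};  // sample tangent vectors
    double[] sum = new double[3];
    orthogonalizeAndAccumulate(dr, sum);
    System.out.println(sum[0]+" "+sum[1]+" "+sum[2]);   // log 1, log 2, log 1
  }
}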
Project 6.25. A spinning magnet

Consider a compass needle that is free to rotate in a periodically reversing magnetic field that is perpendicular to the axis of the needle. The equation of motion of the needle is given by

d²φ/dt² = −(μB₀/I) cos ωt sin φ   (6.59)

where φ is the angle of the needle with respect to a fixed axis along the field, μ is the magnetic moment of the needle, I its moment of inertia, and B₀ and ω are the amplitude and the angular frequency of the magnetic field, respectively. Choose an appropriate numerical method for solving (6.59), and plot the Poincaré map at times t = 2πn/ω. Verify that if the parameter λ = 2B₀μ/(Iω²) > 1, then the motion of the needle is chaotic. Briggs (see references) discusses how to construct the corresponding laboratory system and other nonlinear physical systems.

Project 6.26. Billiard models

Consider a two-dimensional planar geometry in which a particle moves with constant velocity along straight line orbits until it elastically reflects off the boundary. This straight line motion occurs in various "billiard" systems. A simple example of such a system is a particle moving with fixed speed within a circle. For this geometry the angle between the particle's momentum and the tangent to the boundary at a reflection is the same for all points.

Suppose that we divide the circle into two equal parts and connect them by straight lines of length L as shown in Figure 6.14a. This geometry is called a stadium billiard. How does the motion of a particle in the stadium compare to the motion in the circle? In both cases we can find the trajectory of the particle by geometrical considerations. The stadium billiard model and a similar geometry known as the Sinai billiard model (see Figure 6.14b) have been used as model systems for exploring the foundations of statistical mechanics. There is also much interest in relating the behavior of a classical particle in various billiard models to the solution of Schrödinger's equation for the same geometries.

Figure 6.14: (a) Geometry of the stadium billiard model. (b) Geometry of the Sinai billiard model.

(a) Write a program to simulate the stadium billiard model. Use the radius r of the semicircles as the unit of length. The algorithm for determining the path of the particle is as follows (a sketch of the reflection step follows this list):

(i) Begin with an initial position (x₀, y₀) and momentum (p_{x0}, p_{y0}) of the particle such that |p₀| = 1.

(ii) Determine which of the four sides the particle will hit. The possibilities are the top and bottom line segments and the right and left semicircles.

(iii) Calculate the next position of the particle from the intersection of the straight line defined by the current position and momentum, and the equation for the segment where the next reflection occurs.

(iv) Determine the new momentum (p′_x, p′_y) of the particle after reflection such that the angle of incidence equals the angle of reflection. For reflection off the line segments we have (p′_x, p′_y) = (p_x, −p_y). For reflection off a circle we have

p′_x = [y² − (x − x_c)²] p_x − 2(x − x_c)y p_y   (6.60a)
p′_y = −2(x − x_c)y p_x + [(x − x_c)² − y²] p_y   (6.60b)

where (x_c, 0) is the center of the circle. (Note that the unprimed momentum p_x, rather than p′_x, appears on the right-hand side of (6.60b). Remember that all lengths are scaled by the radius of the circle.)

(v) Repeat steps (ii)–(iv).
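A sketch of the circular reflection in step (iv) follows; the method implements (6.60), which is the mirror reflection p′ = p − 2(p · n̂)n̂ written out for the unit normal n̂ = (x − x_c, y) on a circle of unit radius. The class and method names are illustrative.

public class CircleReflection {
  // returns the reflected momentum for a collision at (x, y) on a unit circle
  // centered at (xc, 0); implements (6.60)
  public static double[] reflect(double x, double y, double xc, double px, double py) {
    double c = x-xc;  // normal components (c, y); unit length because r = 1
    return new double[] {
      (y*y-c*c)*px-2*c*y*py,   // (6.60a)
      -2*c*y*px+(c*c-y*y)*py   // (6.60b)
    };
  }

  public static void main(String[] args) {
    double[] p = reflect(1.0, 0.0, 0.0, 1.0, 0.3);  // hit at (1, 0), normal along x
    System.out.println(p[0]+" "+p[1]);              // expect (-1.0, 0.3)
  }
}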
(b) Determine if the particle dynamics is chaotic by estimating the largest Lyapunov exponent. One way to do so is to start two particles with almost identical positions and/or momenta (varying by say 10⁻⁵). Compute the difference Δs of the two phase space trajectories as a function of the number of reflections n, where Δs is defined by

Δs = [|r₁ − r₂|² + |p₁ − p₂|²]^{1/2}.   (6.61)

Choose L = 1 and r = 1. The Lyapunov exponent can be found from a semilog plot of Δs versus n. Repeat your calculation for different initial conditions, and average your values of Δs before plotting. Repeat the calculation for L = 0.5 and 2.0 and determine if your results depend on L.

(c) Another test for the existence of chaos is the reversibility of the motion. Reverse the momentum after the particle has made n reflections, and let the drawing color equal the background color so that the path can be erased. What limitation does roundoff error place on your results? Repeat this simulation for L = 1 and L = 0.

(d) Place a small hole of diameter d in one of the circular sections of the stadium so that the particle can escape. Choose L = 1 and set d = 0.02. Give the particle a random position and momentum, and record the time when the particle escapes through the hole. Repeat for at least 10⁴ particles and compute the fraction of particles S(n) remaining after a given number of reflections n. The function S(n) will decay with n. Determine the functional dependence of S on n, and calculate the characteristic decay time if S(n) decays exponentially. Repeat for L = 0.1, 0.5, and 2.0. Is the decay time a function of L? Does S(n) decay exponentially for the circular billiard model (L = 0) (see Bauer and Bertsch)?

(e) Choose an arbitrary initial position for the particle in a stadium with L = 1 and a small hole as in part (d). Choose at least 5000 values of the initial value p_{x0} uniformly distributed between 0 and 1. Choose p_{y0} so that |p| = 1. Plot the escape time versus p_{x0} and describe the visual pattern of the trajectories. Then choose 5000 values of p_{x0} in a smaller interval centered about the value of p_{x0} for which the escape time was greatest. Plot these values of the escape time versus p_{x0}. Do you see any evidence of self-similarity?

(f) Repeat steps (a)–(e) for the Sinai billiard geometry.
Project 6.27. The circle map and mode locking

The driven, damped pendulum can be approximated by a one-dimensional difference equation for a range of amplitudes and frequencies of the driving force. This difference equation is known as the circle map and is given by

θ_{n+1} = θ_n + Ω − (K/2π) sin 2πθ_n.  (modulo 1)   (6.62)

The variable θ represents an angle, and Ω represents a frequency ratio, the ratio of the natural frequency of the pendulum to the frequency of the periodic driving force. The parameter K is a measure of the strength of the nonlinear coupling of the pendulum to the external force. An important quantity is the winding number, which is defined as

W = lim_{m→∞} (1/m) Σ_{n=0}^{m−1} Δθ_n   (6.63)

where Δθ_n = Ω − (K/2π) sin 2πθ_n.

(a) Consider the linear case K = 0. Choose Ω = 0.4 and θ₀ = 0.2 and determine W (a sketch of the computation follows this project). Verify that if Ω is a ratio of two integers, then W = Ω and the trajectory is periodic. What is the value of W if Ω = √2/2, an irrational number? Verify that W = Ω and that the trajectory comes arbitrarily close to any particular value of θ. Does θ_n ever return exactly to its initial value? This type of behavior of the trajectory is termed quasiperiodic.

(b) For K > 0, we will find that W ≠ Ω and that W "locks" into rational frequency ratios for a range of values of K and Ω. This type of behavior is called mode locking. For K < 1, the trajectory is either periodic or quasiperiodic. Determine the value of W for K = 1/2 and values of Ω in the range 0 < Ω ≤ 1. The widths in Ω of the various mode-locked regions where W is fixed increase with K. Consider other values of K, and draw a diagram in the K-Ω plane (0 ≤ K, Ω ≤ 1) so that those areas corresponding to frequency locking are shaded. These shaded regions are called Arnold tongues.

(c) For K = 1, all trajectories are frequency-locked periodic trajectories. Fix K at K = 1 and determine the dependence of W on Ω. The plot of W versus Ω for K = 1 is called the Devil's staircase.
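A minimal sketch for part (a) follows, assuming that m is large enough for the limit in (6.63) to be approximated by a finite sum. Note that the winding number is accumulated from the unwrapped increments Δθ_n, while θ itself is wrapped modulo 1 as in (6.62).

public class CircleMapApp {
  public static void main(String[] args) {
    double K = 0, Omega = 0.4, theta = 0.2;  // set K > 0 to explore mode locking
    int m = 100000;
    double sum = 0;
    for(int n = 0; n<m; n++) {
      double dTheta = Omega-K/(2*Math.PI)*Math.sin(2*Math.PI*theta);
      sum += dTheta;                 // accumulate the unwrapped increment
      theta = (theta+dTheta)%1.0;    // (6.62), modulo 1
      if(theta<0) theta += 1.0;
    }
    System.out.println("W ~ "+sum/m);
  }
}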
Project 6.28. Chaotic scattering

In Chapter 5 we discussed the classical scattering of particles off a fixed target and found that the differential cross section for a variety of interactions is a smoothly varying function of the scattering angle. That is, a small change in the impact parameter b leads to a small change in the scattering angle θ. Here we consider examples where small changes in b lead to large changes in θ. Such a phenomenon is called chaotic scattering because of the sensitivity to initial conditions that is characteristic of chaos. The study of chaotic scattering is relevant to the design of electronic nanostructures, because many experimental structures exhibit this type of scattering.

A typical scattering model consists of a target composed of a group of fixed hard disks and a scatterer consisting of a point particle. The goal is to compute the path of the scatterer as it bounces off the disks and to measure θ and the time of flight as a function of the impact parameter b. If a particle bounces inside the target region before leaving, the time of flight can be very long. There are even some trajectories for which the particle never leaves the target region. Because it is difficult to monitor a trajectory that bounces back and forth between the hard disks, we consider instead a two-dimensional map that contains the key features of chaotic scattering (see Yalcinkaya and Lai for further discussion). The map is given by

x_{n+1} = a[x_n − ¼(x_n + y_n)²]   (6.64a)
y_{n+1} = (1/a)[y_n + ¼(x_n + y_n)²]   (6.64b)

where a is a parameter. The target region is centered at the origin. In an actual scattering experiment, the relation between (x_{n+1}, y_{n+1}) and (x_n, y_n) would be much more complicated, but the map (6.64) captures most of the important features of realistic chaotic scattering experiments. The iteration number n is analogous to the number of collisions of the scattered particle off the disks. When x_n or y_n is significantly different from zero, the scatterer has left the target region.

(a) Write a program to iterate the map (6.64); a minimal version is sketched after this project. Let a = 8.0 and y₀ = −0.3. Choose 10⁴ initial values of x₀ uniformly distributed in the interval 0 < x₀ < 0.1. Determine the time T(x₀), the number of iterations until x_n ≤ −5.0. After this time, x_n rapidly moves to −∞. Plot T(x₀) versus x₀. Then choose 10⁴ initial values in a smaller interval centered about a value of x₀ for which T(x₀) > 7. Plot these values of T(x₀) versus x₀. Do you see any evidence of self-similarity?

(b) A trajectory is said to be uncertain if a small change ε in x₀ leads to a change in T(x₀). We expect that the number of uncertain trajectories N(ε) will depend on a power of ε; that is, N ∼ ε^α. Determine N(ε) for ε = 10^{−p} with p = 2 to 7 using the values of x₀ in part (a). Then determine the uncertainty dimension 1 − α from a log-log plot of N versus ε. Repeat these measurements for other values of a. Does α depend on a?

(c) Choose 4 × 10⁴ initial conditions in the same interval as in part (a) and determine the number of trajectories S(n) that have not yet reached x_n = −5 as a function of the number of iterations n. Plot ln S(n) versus n and determine if the decay is exponential. It is possible to obtain algebraic decay for values of a less than approximately 6.5.

(d) Let a = 4.1 and choose 100 initial conditions uniformly distributed in the region 1.0 < x₀ < 1.05 and 0.60 < y₀ < 0.65. Are there any trajectories that are periodic and hence have infinite escape times? Due to the accumulation of roundoff error, it is possible to find only finite, but very long, escape times. These periodic trajectories form closed curves, and the regions enclosed by them are called KAM surfaces.
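The following is a minimal sketch of the escape-time computation in part (a). The cap on the number of iterations guards against the (nearly) trapped trajectories of part (d) and is an arbitrary choice.

public class ScatterApp {
  public static void main(String[] args) {
    double a = 8.0, y0 = -0.3;
    int trials = 10000;
    for(int trial = 0; trial<trials; trial++) {
      double x0 = 0.1*(trial+1)/trials;  // x0 uniform in (0, 0.1]
      double x = x0, y = y0;
      int T = 0;
      while(x>-5.0&&T<1000) {   // iterate (6.64) until the scatterer escapes
        double s = 0.25*(x+y)*(x+y);
        double xNew = a*(x-s);  // (6.64a)
        y = (y+s)/a;            // (6.64b), using the old x and y through s
        x = xNew;
        T++;
      }
      System.out.println(x0+" "+T);  // plot T(x0) versus x0
    }
  }
}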
Project 6.29. Chemical reactions

In Project 4.17 we discussed how chemical oscillations can occur when the reactants are continuously replenished. In this project we introduce a set of chemical reactions that exhibits the period doubling route to chaos. Consider the following reactions (see Peng et al.):

P → A   (6.65a)
P + C → A + C   (6.65b)
A → B   (6.65c)
A + 2B → 3B   (6.65d)
B → C   (6.65e)
C → D.   (6.65f)

Each of the above reactions has an associated rate constant. The time dependence of the concentrations of A, B, and C is given by

dA/dt = k₁P + k₂PC − k₃A − k₄AB²   (6.66a)
dB/dt = k₃A + k₄AB² − k₅B   (6.66b)
dC/dt = k₅B − k₆C.   (6.66c)

We assume that P is held constant by replenishment from an external source. We also assume the chemicals are well mixed so that there is no spatial dependence. In Section 7.8 we discuss the effects of spatial inhomogeneities due to molecular diffusion. Equations (6.66) can be written in the dimensionless form

dX/dτ = c₁ + c₂Z − X − XY²   (6.67a)
c₃ dY/dτ = X + XY² − Y   (6.67b)
c₄ dZ/dτ = Y − Z   (6.67c)

where the c_i are constants, τ = k₃t, and X, Y, and Z are proportional to A, B, and C, respectively.

(a) Write a program to solve the coupled differential equations in (6.67). Use a fourth-order Runge–Kutta algorithm with an adaptive step size. Plot ln Y versus the time τ.

(b) Set c₁ = 10, c₃ = 0.005, and c₄ = 0.02. The constant c₂ is the control parameter. Consider c₂ = 0.10 to 0.16 in steps of 0.005. What is the period of ln Y for each value of c₂?

(c) Determine the values of c₂ at which the period doublings occur for as many period doublings as you can determine. Compute the constant δ [see (6.9)] and compare its value to the value of δ for the logistic map.

(d) Make a bifurcation diagram by taking the values of ln Y from the Poincaré plot at X = Z and plotting them versus the control parameter c₂. Do you see a sequence of period doublings?

(e) Use three-dimensional graphics to plot the trajectory of (6.67) with ln X, ln Y, and ln Z as the three axes. Describe the attractors for some of the cases considered in part (a).

Appendix 6A: Stability of the Fixed Points of the Logistic Map

In the following, we derive analytic expressions for the fixed points of the logistic map. The fixed-point condition is given by

x* = f(x*).   (6.68)

From (6.5) this condition yields the two fixed points

x* = 0  and  x* = 1 − 1/(4r).   (6.69)

Because x is restricted to be positive, the only fixed point for r < 1/4 is x = 0. To determine the stability of x*, we let

x_n = x* + ε_n   (6.70a)

and

x_{n+1} = x* + ε_{n+1}.   (6.70b)

Because |ε_n| ≪ 1, we have

x_{n+1} = f(x* + ε_n) ≈ f(x*) + ε_n f′(x*) = x* + ε_n f′(x*).   (6.71)

If we compare (6.70b) and (6.71), we obtain

ε_{n+1}/ε_n = f′(x*).   (6.72)

If |f′(x*)| > 1, the trajectory will diverge from x* because |ε_{n+1}| > |ε_n|. The opposite is true for |f′(x*)| < 1. Hence, the local stability criteria for a fixed point x* are

1. |f′(x*)| < 1, x* is stable;
2. |f′(x*)| = 1, x* is marginally stable;
3. |f′(x*)| > 1, x* is unstable.

If x* is marginally stable, the second derivative f″(x) must be considered, and the trajectory approaches x* with deviations from x* inversely proportional to the square root of the number of iterations. For the logistic map, the derivatives at the fixed points are, respectively,

f′(x = 0) = d/dx [4rx(1 − x)] |_{x=0} = 4r   (6.73)

and

f′(x = x*) = d/dx [4rx(1 − x)] |_{x=1−1/4r} = 2 − 4r.   (6.74)

It is straightforward to use (6.73) and (6.74) to find the range of r for which x* = 0 and x* = 1 − 1/4r are stable.

If a trajectory has period two, then f^{(2)}(x) = f(f(x)) has two fixed points. If you are interested, you can solve for these fixed points analytically. As we found in Problem 6.2, these two fixed points become unstable at the same value of r. We can derive this property of the fixed points using the chain rule of differentiation:

d/dx f^{(2)}(x) |_{x=x₀} = d/dx f(f(x)) |_{x=x₀} = f′(f(x₀)) f′(x₀).   (6.75)

If we substitute x₁ = f(x₀), we can write

d/dx f(f(x)) |_{x=x₀} = f′(x₁) f′(x₀).   (6.76)

In the same way, we can show that

d/dx f^{(2)}(x) |_{x=x₁} = f′(x₀) f′(x₁).   (6.77)

We see that if x₀ becomes unstable, then |f^{(2)}′(x₀)| > 1, as does |f^{(2)}′(x₁)|. Hence, x₁ is also unstable at the same value of r, and we conclude that both fixed points of f^{(2)}(x) bifurcate at the same value of r, leading to a trajectory of period 4.

From (6.74) we see that f′(x = x*) = 0 when r = 1/2 and x* = 1/2. Such a fixed point is said to be superstable because, as we found in Problem 6.4, convergence to the fixed point is relatively rapid. Superstable trajectories occur whenever one of the fixed points is at x* = 1/2.
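The stability criterion is easy to verify numerically. The following sketch iterates the logistic map near x* for r = 0.4, where (6.74) gives f′(x*) = 0.4, and prints the ratio ε_{n+1}/ε_n of (6.72); the class name and the size of the initial deviation are illustrative choices.

public class FixedPointCheck {
  public static void main(String[] args) {
    double r = 0.4;
    double xStar = 1-1/(4*r);  // the fixed point (6.69); here x* = 0.375
    double x = xStar+1e-4;     // start close to the fixed point
    double epsOld = x-xStar;
    for(int n = 0; n<5; n++) {
      x = 4*r*x*(1-x);
      double eps = x-xStar;
      // each ratio should be close to f'(x*) = 2 - 4r = 0.4
      System.out.println("eps(n+1)/eps(n) = "+eps/epsOld+", f'(x*) = "+(2-4*r));
      epsOld = eps;
    }
  }
}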
Appendix 6B: Finding the Roots of a Function

The roots of a function f(x) are the values of the variable x for which the function f(x) is zero. Even an apparently simple equation such as

f(x) = tan x − x − c = 0   (6.78)

where c is a constant cannot be solved analytically for x.

Regardless of the function and the approach to root finding, the first step should be to learn as much as possible about the function. For example, plotting the function will help us to determine the approximate locations of the roots.

Newton's (or the Newton–Raphson) method is based on replacing the function by the first two terms of the Taylor expansion of f(x) about the root x. If our initial guess for the root is x₀, we can write f(x) ≈ f(x₀) + (x − x₀)f′(x₀). If we set f(x) equal to zero and solve for x, we find x = x₀ − f(x₀)/f′(x₀). If we have made a good choice for x₀, the resultant value of x should be closer to the root than x₀. The general procedure is to calculate successive approximations as follows:

x_{n+1} = x_n − f(x_n)/f′(x_n).   (6.79)

If this series converges, it converges very quickly. However, if the initial guess is poor or if the function has closely spaced multiple roots, the series may not converge. The successive iterations of Newton's method are another example of a map. Newton's method also works with complex functions, as we will see in the following problem.

Problem 6.30. Cube roots

Consider the function f(z) = z³ − 1, where z = x + iy, and f′(z) = 3z². Map the range of convergence of (6.79) in the region [−2 < x < 2, −2 < y < 2] of the complex plane. Color the starting value of z red, green, or blue depending on the root to which the initial guess converges. If the trajectory does not converge, color the starting point black. For more insight, add a mouse handler to your program so that if you click on your plot, the sequence of iterations starting from the point where you clicked will be shown.

The following problem discusses a situation that typically arises in courses on quantum mechanics.

Problem 6.31. Energy levels in a finite square well

The quantum mechanical energy levels in the one-dimensional finite square well can be found by solving the relation

tan ε = √(ρ² − ε²)   (6.80)

where ε = √(mEa²/2ℏ²) and ρ = √(mV₀a²/2ℏ²) are defined in terms of the particle mass m, the particle energy E, the width of the well a, and the depth of the well V₀. The function tan ε has zeros at ε = 0, π, 2π, ... and asymptotes at ε = π/2, 3π/2, 5π/2, ... The function √(ρ² − ε²) is a quarter circle of radius ρ. Write a program to plot these two functions with ρ = 3, and then use Newton's method to determine the roots of (6.80); a sketch of the iteration is given after Listing 6.6. Find the value of ρ, and thus V₀, such that below this value there is only one energy level and above this value there is more than one. At what value of ρ do three energy levels first appear?

In Section 6.6 we introduced the bisection root-finding algorithm. This algorithm is implemented in the Root class in the numerics package. It can be used with any function.

Listing 6.6: The bisection method defined in the Root class in the numerics package.

public static double bisection(final Function f, double x1, double x2, final double tolerance) {
   int count = 0;
   int maxCount = (int) (Math.log(Math.abs(x2-x1)/tolerance)/Math.log(2));
   maxCount = Math.max(MAX_ITERATIONS, maxCount)+2;
   double y1 = f.evaluate(x1), y2 = f.evaluate(x2);
   if(y1*y2>0) {         // y1 and y2 must have opposite sign
      return Double.NaN; // interval does not contain a root
   }
   while(count<maxCount) {
      double x = (x1+x2)/2;
      double y = f.evaluate(x);
      if(Math.abs(y)<tolerance) {
         return x;
      }
      if(y*y1>0) { // replace the endpoint that has the same sign
         x1 = x;
         y1 = y;
      } else {
         x2 = x;
         y2 = y;
      }
      count++;
   }
   return Double.NaN; // did not converge in max iterations
}
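As an illustration of (6.79), the following sketch applies Newton's method to g(ε) = tan ε − √(ρ² − ε²) from Problem 6.31 with ρ = 3. The starting guess of 1.2, chosen by inspecting the plot, lies on the first branch of the tangent function; the class name is illustrative.

public class SquareWellApp {
  public static void main(String[] args) {
    double rho = 3.0, eps = 1.2;  // initial guess from the plot of the two functions
    for(int i = 0; i<8; i++) {
      double root = Math.sqrt(rho*rho-eps*eps);
      double g = Math.tan(eps)-root;
      double gPrime = 1/Math.pow(Math.cos(eps), 2)+eps/root;  // g'(eps)
      eps = eps-g/gPrime;  // Newton's method (6.79)
    }
    System.out.println("lowest root: eps = "+eps);
  }
}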
The bisection algorithm is guaranteed to converge if you can find an interval where the function changes sign. However, it is slow. Newton's algorithm is very fast but may not converge. We develop an algorithm in the following problem that combines these two approaches.

Problem 6.32. Finding roots

Modify Newton's algorithm to keep track of the interval between the minimum and the maximum of x while iterating (6.79). If the iterate x_{n+1} jumps outside this interval, interrupt Newton's method and use the bisection algorithm for one iteration. Test the root at the end of the iterative process to check that the algorithm actually found a root. Test your algorithm on the function in (6.78).

References and Suggestions for Further Reading

Books

Ralph H. Abraham and Christopher D. Shaw, Dynamics – The Geometry of Behavior, 2nd ed. (Addison–Wesley, 1992). The authors use an abundance of visual representations.

Hao Bai-Lin, Chaos II (World Scientific, 1990). A collection of reprints on chaotic phenomena. The following papers were cited in the text: James P. Crutchfield, J. Doyne Farmer, Norman H. Packard, and Robert S. Shaw, "Chaos," Sci. Am. 255 (6), 46–57 (1986); Mitchell J. Feigenbaum, "Quantitative universality for a class of nonlinear transformations," J. Stat. Phys. 19, 25–52 (1978); M. Hénon, "A two-dimensional mapping with a strange attractor," Commun. Math. Phys. 50, 69–77 (1976); Robert M. May, "Simple mathematical models with very complicated dynamics," Nature 261, 459–467 (1976); Robert Van Buskirk and Carson Jeffries, "Observation of chaotic dynamics of coupled nonlinear oscillators," Phys. Rev. A 31, 3332–3357 (1985).

G. L. Baker and J. P. Gollub, Chaotic Dynamics: An Introduction, 2nd ed. (Cambridge University Press, 1995). A good introduction to chaos with special emphasis on the forced damped nonlinear harmonic oscillator. Several programs are given.

Predrag Cvitanović, Universality in Chaos, 2nd ed. (Adam Hilger, 1989). A collection of reprints on chaotic phenomena, including the articles by Hénon and May also reprinted in the Bai-Lin collection, and the chaos classic, Mitchell J. Feigenbaum, "Universal behavior in nonlinear systems," Los Alamos Sci. 1, 4–27 (1980).

Robert Devaney, A First Course in Chaotic Dynamical Systems (Addison–Wesley, 1992). This text is a good introduction to the more mathematical ideas behind chaos and related topics.

Jan Fröyland, Introduction to Chaos and Coherence (Institute of Physics Publishing, 1992). See Chapter 7 for a simple model of Saturn's rings.

Martin C. Gutzwiller, Chaos in Classical and Quantum Mechanics (Springer–Verlag, 1990). A good introduction to problems in quantum chaos for the more advanced student.

Robert C. Hilborn, Chaos and Nonlinear Dynamics (Oxford University Press, 1994). An excellent pedagogically oriented text.

Douglas R. Hofstadter, Metamagical Themas (Basic Books, 1985). A shorter version is given in his article, "Metamagical themas," Sci. Am. 245 (11), 22–43 (1981).

E. Atlee Jackson, Perspectives of Nonlinear Dynamics, Vols. 1 and 2 (Cambridge University Press, 1989, 1991). An advanced text that is a joy to read.

R. V. Jensen, "Chaotic scattering, unstable periodic orbits, and fluctuations in quantum transport," Chaos 1, 101–109 (1991).
This paper discusses the quantum version of systems similar to those discussed in Projects 6.28 and 6.26.

Francis C. Moon, Chaotic and Fractal Dynamics: An Introduction for Applied Scientists and Engineers (Wiley–VCH, 1992). An engineering oriented text with a section on how to build devices that demonstrate chaotic dynamics.

Edward Ott, Chaos in Dynamical Systems (Cambridge University Press, 1993). An excellent textbook on chaos at the upper undergraduate to graduate level. See also E. Ott, "Strange attractors and chaotic motions of dynamical systems," Rev. Mod. Phys. 53, 655–671 (1981).

Edward Ott, Tim Sauer, and James A. Yorke, editors, Coping with Chaos (John Wiley & Sons, 1994). A reprint volume emphasizing the analysis of experimental time series from chaotic systems.

Heinz-Otto Peitgen, Hartmut Jürgens, and Dietmar Saupe, Fractals for the Classroom, Part II (Springer–Verlag, 1992). A delightful book with many beautiful illustrations. Chapter 11 discusses the nature of the bifurcation diagram of the logistic map.

Ian Percival and Derek Richards, Introduction to Dynamics (Cambridge University Press, 1982). An advanced undergraduate text that introduces phase trajectories and the theory of stability. A derivation of the Hamiltonian for the driven damped pendulum considered in Section 6.4 is given in Chapter 5, Example 5.7.

Ivars Peterson, Newton's Clock: Chaos in the Solar System (W. H. Freeman, 1993). A historical survey of our understanding of the motion of bodies within the solar system with a focus on chaotic motion.

Stuart L. Pimm, The Balance of Nature (The University of Chicago Press, 1991). An introductory treatment of ecology with a chapter on applications of chaos to real biological systems. The author contends that much of the difficulty in assessing the importance of chaos is that ecological studies are too short.

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). Chapter 9 discusses various root-finding methods.

S. Neil Rasband, Chaotic Dynamics of Nonlinear Systems (Wiley–Interscience, 1990). A clear presentation of the most important topics in classical chaos theory.

M. Lakshmanan and S. Rajaseekar, Nonlinear Dynamics (Springer–Verlag, 2003). Although this text is for advanced students, many parts are accessible.

Robert Shaw, The Dripping Faucet as a Model Chaotic System (Aerial Press, 1984).

Steven Strogatz, Nonlinear Dynamics and Chaos with Applications to Physics, Biology, Chemistry and Engineering (Addison–Wesley, 1994). Another outstanding text.

Anastasios A. Tsonis, Chaos: From Theory to Applications (Plenum Press, 1992). Of particular interest is the discussion of applications to nonlinear time series forecasting.

Nicholas B. Tufillaro, Tyler Abbott, and Jeremiah Reilly, Nonlinear Dynamics and Chaos (Addison–Wesley, 1992). See also N. B. Tufillaro and A. M. Albano, "Chaotic dynamics of a bouncing ball," Am. J. Phys. 54, 939–944 (1986). The authors describe an undergraduate level experiment on a bouncing ball subject to repeated impacts with a vibrating table. See also the article by Warr et al.

Articles

Garin F. J. Añaños and Constantino Tsallis, "Ensemble averages and nonextensivity of one-dimensional maps," Phys. Rev. Lett. 93, 020601 (2004).

Gregory L. Baker, "Control of the chaotic driven pendulum," Am. J. Phys. 63 (9), 832–838 (1995).

W. Bauer and G. F. Bertsch, "Decay of ordered and chaotic systems," Phys. Rev. Lett. 65, 2213 (1990).
See also the comment by Olivier Legrand and Didier Sornette, "First return, transient chaos, and decay in chaotic systems," Phys. Rev. Lett. 66, 2172 (1991), and the reply by Bauer and Bertsch on the following page. The dependence of the decay laws on chaotic behavior is very general and has been considered in various contexts, including room acoustics and the chaotic scattering of microwaves in an "elbow" cavity. Chaotic behavior is a sufficient but not necessary condition for exponential decay.

Keith Briggs, "Simple experiments in chaotic dynamics," Am. J. Phys. 55, 1083–1089 (1987).

S. N. Coppersmith, "A simpler derivation of Feigenbaum's renormalization group equation for the period-doubling bifurcation sequence," Am. J. Phys. 67 (1), 52–54 (1999).

J. P. Crutchfield, J. D. Farmer, and B. A. Huberman, "Fluctuations and simple chaotic dynamics," Phys. Repts. 92, 45–82 (1982).

Robert DeSerio, "Chaotic pendulum: The complete attractor," Am. J. Phys. 71 (3), 250–257 (2003).

William L. Ditto and Louis M. Pecora, "Mastering chaos," Sci. Am. 262 (8), 78–82 (1993).

J. C. Earnshaw and D. Haughey, "Lyapunov exponents for pedestrians," Am. J. Phys. 61, 401 (1993).

Daniel J. Gauthier, "Resource letter: CC-1: Controlling chaos," Am. J. Phys. 71 (8), 750–759 (2003). The article includes a bibliography of materials on controlling chaos.

Wayne Hayes, "Computer simulations, exact trajectories, and the gravitational N-body problem," Am. J. Phys. 72 (9), 1251–1257 (2004). The article discusses the concept of shadowing, which is used in the simulation of chaotic systems.

Robert C. Hilborn, "Sea gulls, butterflies, and grasshoppers: A brief history of the butterfly effect in nonlinear dynamics," Am. J. Phys. 72 (4), 425–427 (2004).

Robert C. Hilborn and Nicholas B. Tufillaro, "Resource letter: ND-1: Nonlinear dynamics," Am. J. Phys. 65 (9), 822–834 (1997).

Ying-Cheng Lai, "Controlling chaos," Computers in Physics 8, 62 (1994). Section 6.6 is based on this article.

R. B. Levien and S. M. Tan, "Double pendulum: An experiment in chaos," Am. J. Phys. 61 (11), 1038–1044 (1993).

V. Lopac and V. Danani, "Energy conservation and chaos in the gravitationally driven Fermi oscillator," Am. J. Phys. 66 (10), 892–902 (1998).

J. B. McLaughlin, "Period-doubling bifurcations and chaotic motion for a parametrically forced pendulum," J. Stat. Phys. 24, 375–388 (1981).

Sergio De Souza-Machado, R. W. Rollins, D. T. Jacobs, and J. L. Hartman, "Studying chaotic systems using microcomputer simulations and Lyapunov exponents," Am. J. Phys. 58 (4), 321–329 (1990).

Bo Peng, Stephen K. Scott, and Kenneth Showalter, "Period doubling and chaos in a three-variable autocatalator," J. Phys. Chem. 94, 5243–5246 (1990).

Bo Peng, Valery Petrov, and Kenneth Showalter, "Controlling chemical chaos," J. Phys. Chem. 95, 4957–4959 (1991).

Troy Shinbrot, Celso Grebogi, Jack Wisdom, and James A. Yorke, "Chaos in a double pendulum," Am. J. Phys. 60 (6), 491–499 (1992).

Niraj Srivastava, Charles Kaufman, and Gerhard Müller, "Hamiltonian chaos," Computers in Physics 4, 549–553 (1990); ibid. 5, 239–243 (1991); ibid. 6, 84–88 (1992).

Todd Timberlake, "A computational approach to teaching conservative chaos," Am. J. Phys. 72 (8), 1002–1007 (2004).

Jan Tobochnik and Harvey Gould, "Quantifying chaos," Computers in Physics 3 (6), 86 (1989).
There is a typographical error in this paper in the equations for step (3) of the algorithm for computing the Lyapunov spectrum. The correct equations are given in Project 6.24.

S. Warr, W. Cooke, R. C. Ball, and J. M. Huntley, "Probability distribution functions for a single particle vibrating in one dimension: Experimental study and theoretical analysis," Physica A 231, 551–574 (1996). This paper and the book by Tufillaro, Abbott, and Reilly consider the motion of a ball bouncing on a periodically vibrating table. This nonlinear dynamical system exhibits fixed points, periodic and strange attractors, and period-doubling bifurcations to chaos, similar to the logistic map. Simulations of this system are very interesting, but not straightforward.

Tolga Yalcinkaya and Ying-Cheng Lai, "Chaotic scattering," Computers in Physics 9, 511–518 (1995). Project 6.28 is based on a draft of this article. The map (6.64) is discussed in more detail in Yun-Tung Lau, John M. Finn, and Edward Ott, "Fractal dimension in nonhyperbolic chaotic scattering," Phys. Rev. Lett. 66, 978 (1991).

Chapter 7

Random Processes

Random processes are introduced in the context of several simple physical systems, including random walks on a lattice, polymers, and diffusion-controlled chemical reactions. The generation of random number sequences is also discussed.

7.1 Order to Disorder

In Chapter 6 we saw several examples of how, under certain conditions, the behavior of a nonlinear deterministic system can appear to be random. In this chapter we will see some examples of how chance can generate statistically predictable outcomes. For example, we know that if we bet often on the outcome of a game for which the probability of winning is less than 50%, we will lose money eventually.

We first discuss an example that illustrates the tendency of systems of many particles to evolve to a well-defined state. Imagine a closed box that is divided into two parts of equal volume (see Figure 7.1). The left half contains a gas of N identical particles, and the right half is initially empty. We then make a small hole in the partition between the two halves. What happens? We know that after some time, the average number of particles in each half of the box will become N/2, and we say that the system has reached equilibrium.

How can we simulate this process? One way is to give each particle an initial velocity and position and adopt a deterministic model of the motion of the particles. For example, we could assume that each particle moves in a straight line until it hits a wall of the box or another particle and undergoes an elastic collision. We will consider similar deterministic models in Chapter 8. Instead, we first simulate a probabilistic model based on a random process. The basic assumptions of this model are that the motion of the particles is random and that the particles do not interact with one another. Hence, the probability per unit time that a particle goes through the hole in the partition is the same for all N particles, regardless of the number of particles in either half. We also assume that the size of the hole is such that only one particle can pass through at a time.

We first model the motion of a particle passing through the hole by choosing one of the N particles at random and moving it to the other side. For visualization purposes, we will use arrays to specify the position of each particle. We then randomly generate an integer i between 0 and N − 1 and change the arrays appropriately.
A more efficient Monte Carlo algorithm is discussed in Problem 7.2b. The tool we need to simulate this random process is a random number generator.

Figure 7.1: A box is divided into two equal halves by a partition. After a small hole is opened in the partition, one particle can pass through the hole per unit time.

It is counterintuitive that we can use a deterministic computer to generate sequences of random numbers. In Section 7.9 we discuss some of the methods for computing a set of numbers that appear statistically random but are in fact generated by a deterministic algorithm. These algorithms are sometimes called pseudorandom number generators to distinguish their output from intrinsically random physical processes, such as the time between clicks in a Geiger counter near a radioactive sample.

For the present we will be content to use the random number generator supplied with Java, although the random number generators included with various programming languages vary in quality. The method Math.random() produces a random number r that is uniformly distributed in the interval 0 ≤ r < 1. To generate a random integer i between 0 and N − 1, we write:

int i = (int) (N*Math.random());

The effect of the (int) cast is to eliminate the decimal digits from a floating point number. For example, (int)(5.7) = 5.

The algorithm for simulating the evolution of the model can be summarized by the following steps:

1. Use a random number generator to choose a particle at random.
2. Move this particle to the other side of the box.
3. Give the particle a random position on the new side of the box. This step is for visualization purposes only.
4. Increase the "time" by unity. Note that this definition of time is arbitrary.

Class Box implements this algorithm, and class BoxApp plots the evolution of the number of particles on the left half of the box.

Listing 7.1: Class Box for the simulation of the approach to equilibrium.

package org.opensourcephysics.sip.ch07;
import java.awt.*;
import org.opensourcephysics.display.*;

public class Box implements Drawable {
  public double x[], y[];
  public int N, nleft, time;

  public void initialize() {
    // location of particles (for visualization purposes only)
    x = new double[N];
    y = new double[N];
    nleft = N; // start with all particles on the left
    time = 0;
    for(int i = 0; i<N; i++) {
      ...
    }
  }
  ...
}

The class WalkerApp accumulates the averages ⟨x⟩ and ⟨x²⟩ for a one-dimensional random walk and histograms the displacement x:
setValue ( "Probability p of step to right" , 0 . 5 ) ; control . setValue ( "Number of steps N" , 100); } public s t a t i c void main ( String [ ] args ) { SimulationControl . createApp (new WalkerApp ( ) ) ; } } CHAPTER 7. RANDOM PROCESSES 206 Problem 7.5. Random walks in one dimension (a) In class Walker the steps are of unit length so that a = 1. Use Walker and WalkerApp to estimate the number of trials needed to obtain ∆x2 for N = 20 and p = 1/2 with an accuracy of approximately 5%. Compare your result for ∆x2 to the exact answer in (7.10). Approximately how many trials do you need to obtain the same relative accuracy for N = 100? (b) Is x exactly zero in your simulations? Explain the difference between the analytic result and the results of your simulations. Note that we have used the same notation ... to denote the exact average calculated analytically and the approximate average computed by averaging over many trials. The distinction between the two averages should be clear from the context. (c) How do your results for x and ∆x2 change for p q? Choose p = 0.7 and determine the N dependence of x and ∆x2. (d)∗ Determine ∆x2 for N = 1 to N = 5 by enumerating all the possible walks. For simplicity, choose p = 1/2 so that x = 0. For N = 1 there are two possible walks: one step to the right and one step to the left. In both cases x2 = 1, and hence x1 2 = 1. For N = 2 there are four possible walks with the same probability: (i) two steps to the right, (ii) two steps to the left, (iii) first step to the right and second step to the left, and (iv) first step to the left and second step to the right. The value of x2 2 for these walks is 4, 4, 0, and 0, respectively, and hence x2 2 = (4 + 4 + 0 + 0)/4 = 2. Write a program that enumerates all the possible walks of a given number of steps and compute the various averages of interest exactly. The class WalkerApp displays the distribution of values of the displacement x after N steps. One way of determining the number of times that the variable x has a certain value would be to define a one-dimensional array, probability, and let probability[x] += 1; In this case because x takes only integer values, the array index of probability is the same as x itself. However, the above statement does not work in Java because x can be negative as well as positive. What we need is a way of mapping the value x to a bin or index number. The HistogramFrame class, which is part of the Open Source Physics display package, does this mapping automatically using the Java Hashtable class. In simple data structures, data is accessed by an index that indicates the location of the data in the data structure. Hashtable data is accessed by a key, which in our case is the value of x. A hashing function converts the key to an index. The append method of the HistogramFrame class takes a value, finds the index using a hashing function, and then increments the data associated with that key. The HistogramFrame class also draws itself. The HistogramFrame class is very useful for taking a quick look at the distribution of values in a data set. You do not need to know how to group the data into bins or the range of values of the data. The default bin width is unity, but the bin width can be set using the setBinWidth method. See WalkerApp for an example of the use of the HistogramFrame class. Frequently, we wish to use the histogram data to compute other quantities. You can collect the data using the Data Table menu item in HistogramFrame and copy the data to a file. 
The class WalkerApp displays the distribution of values of the displacement x after N steps. One way of determining the number of times that the variable x has a certain value would be to define a one-dimensional array, probability, and let

probability[x] += 1;

In this case, because x takes only integer values, the array index of probability is the same as x itself. However, the above statement does not work in Java because x can be negative as well as positive. What we need is a way of mapping the value of x to a bin or index number. The HistogramFrame class, which is part of the Open Source Physics display package, does this mapping automatically using the Java Hashtable class. In simple data structures, data is accessed by an index that indicates the location of the data in the data structure. Hashtable data is accessed by a key, which in our case is the value of x. A hashing function converts the key to an index. The append method of the HistogramFrame class takes a value, finds the index using a hashing function, and then increments the data associated with that key. The HistogramFrame class also draws itself.

The HistogramFrame class is very useful for taking a quick look at the distribution of values in a data set. You do not need to know how to group the data into bins or the range of values of the data. The default bin width is unity, but the bin width can be set using the setBinWidth method. See WalkerApp for an example of the use of the HistogramFrame class.

Frequently, we wish to use the histogram data to compute other quantities. You can collect the data using the Data Table menu item in HistogramFrame and copy the data to a file. Another option is to include additional code in your program to analyze the data. The following statements assume that a HistogramFrame object called histogram has been created and data entered into it.

// creates array entries of data from histogram
java.util.Map.Entry[] entries = histogram.entries();
for(int i = 0, length = entries.length; i<length; i++) {
   Integer binNumber = (Integer) entries[i].getKey();   // gets bin number
   Double occurrences = (Double) entries[i].getValue(); // gets number of occurrences for bin number i
   // gets value of left edge of bin
   double value = histogram.getLeftMostBinPosition(binNumber.intValue());
   value += 0.5*histogram.getBinWidth(); // sets value to middle of bin
   double number = occurrences.doubleValue(); // convert from Double class to double data type
   // use value and number in your analysis
}

Problem 7.6. Nature of the probability distribution

(a) Compute P_N(x), the probability that the displacement of the walker from the origin is x after N steps. What is the difference between the histogram, that is, the number of occurrences, and the probability? Consider N = 10 and N = 40 and at least 1000 trials. Does the qualitative form of P_N(x) change as the number of trials increases? What is the approximate width of P_N(x) and the value of P_N(x) at its maximum for each value of N?

(b) What is the approximate shape of the envelope of P_N(x)? Does the shape change as N is increased?

(c) Fit the envelope of P_N(x) for sufficiently large N to the continuous function

C (1/√(2π⟨Δx²⟩)) e^{−(x − ⟨x⟩)²/2⟨Δx²⟩}.   (7.11)

The form of (7.11) is the standard form of the Gaussian distribution with C = 1. The easiest way to do this fit is to plot your results for P_N(x) and the form (7.11) on the same graph, using your results for ⟨x⟩ and ⟨Δx²⟩ as input parameters. Visually choose the constant C to obtain a reasonable fit. What are the possible values of x for a given value of N? What is the minimum difference between these values? How does this difference compare to your value for C?

Problem 7.7. More random walks in one dimension

(a) Suppose that the probability of a step to the right is p = 0.7. Compute ⟨x⟩ and ⟨Δx²⟩ for N = 4, 8, 16, and 32. What is the interpretation of ⟨x⟩ in this case? What is the qualitative dependence of ⟨Δx²⟩ on N?

(b) An interesting property of random walks is the mean number D_N of distinct lattice sites visited during the course of an N step walk. Do a Monte Carlo simulation of D_N and determine its N dependence.

We can consider either a large number of successive walks, as in Problem 7.7, or a large number of noninteracting walkers moving at the same time, as in Problem 7.8.

Figure 7.2: An example of a 6 × 6 square lattice. Note that each site or node has four nearest neighbors.

Problem 7.8. A random walk in two dimensions

(a) Consider a collection of walkers initially at the origin of a square lattice (see Figure 7.2). At each unit of time, each of the walkers moves at random with equal probability in one of the four possible directions. Create a drawable class, Walker2D, which contains the positions of M walkers moving in two dimensions and draws their locations, and modify WalkerApp. Unlike WalkerApp, this new class need not specify the maximum number of steps. Instead, the number of walkers should be specified.
Figure 7.2: An example of a 6 × 6 square lattice. Note that each site or node has four nearest neighbors.

Problem 7.8. A random walk in two dimensions

(a) Consider a collection of walkers initially at the origin of a square lattice (see Figure 7.2). At each unit of time, each of the walkers moves at random with equal probability in one of the four possible directions. Create a drawable class, Walker2D, which contains the positions of M walkers moving in two dimensions and draws their locations, and modify WalkerApp accordingly. Unlike WalkerApp, this new class need not specify the maximum number of steps. Instead, the number of walkers should be specified.

(b) Run your application with the number of walkers M ≥ 1000 and allow the walkers to take at least 500 steps. If each walker represents a bee, what is the qualitative nature of the shape of the swarm of bees? Describe the qualitative nature of the surface of the swarm as a function of the number of steps N. Is the surface jagged or smooth?

(c) Compute the quantities $\langle x \rangle$, $\langle y \rangle$, $\langle \Delta x^2 \rangle$, and $\langle \Delta y^2 \rangle$ as a function of N. The average is over the M walkers. Also compute the mean square displacement $\langle R^2 \rangle$ given by

\[
\langle R^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 + \langle y^2 \rangle - \langle y \rangle^2 = \langle \Delta x^2 \rangle + \langle \Delta y^2 \rangle. \tag{7.12}
\]

What is the dependence of each quantity on N? (As before, we will frequently write $\langle R^2 \rangle$ instead of $\langle R_N^2 \rangle$.)

(d) Estimate $\langle R^2 \rangle$ for N = 8, 16, 32, and 64 by averaging over a large number of walkers for each value of N. Assume that $R = \sqrt{\langle R^2 \rangle}$ has the asymptotic N dependence

\[
R \sim N^{\nu} \qquad (N \gg 1), \tag{7.13}
\]

and estimate the exponent ν from a log-log plot of $\langle R^2 \rangle$ versus N. We will see in Chapter 13 that the exponent 1/ν is related to how a random walk fills space. If ν ≈ 1/2, estimate the magnitude of the self-diffusion coefficient D from the relation $\langle R^2 \rangle = 4DN$.

(e) Do a Monte Carlo simulation of $\langle R^2 \rangle$ on a triangular lattice (see Figure 8.5) and estimate ν. Can you conclude that the exponent ν is independent of the symmetry of the lattice? Does D depend on the symmetry of the lattice? If so, give a qualitative explanation for this dependence.

(f)∗ Enumerate all the random walks on a square lattice for N = 4 and obtain exact results for $\langle x \rangle$, $\langle y \rangle$, and $\langle R^2 \rangle$. Assume that all four directions are equally probable. Verify your program by comparing the Monte Carlo and exact enumeration results.

Figure 7.3: Examples of the random path of a raindrop to the ground. The step probabilities are given in Problem 7.9. The walker starts at x = 0, y = h.

Problem 7.9. The fall of a raindrop

Consider a random walk that starts at a site a distance y = h above a horizontal line (see Figure 7.3). If the probability of a step down is greater than the probability of a step up, we expect that the walker will eventually reach a site on the horizontal line. This walk is a simple model of the fall of a raindrop in the presence of a random swirling breeze. Do a Monte Carlo simulation to determine the mean time τ for the walker to reach any site on the line y = 0, and find the functional dependence of τ on h. Is it possible to define a velocity in the vertical direction? Because the walker does not always move vertically, it suffers a net displacement x in the horizontal direction. How does $\langle \Delta x^2 \rangle$ depend on h and τ? Reasonable values for the step probabilities are 0.1, 0.6, 0.15, and 0.15, corresponding to up, down, right, and left, respectively. A sketch of one possible implementation follows.
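A minimal sketch of the raindrop walk, assuming the step probabilities suggested above (the class name is ours). Because the step probabilities are left–right symmetric, $\langle x \rangle = 0$ and $\langle \Delta x^2 \rangle$ reduces to $\langle x^2 \rangle$:

    // Mean first passage time to the line y = 0 for the raindrop walk
    // of Problem 7.9, using the step probabilities suggested in the text.
    public class RaindropWalk {
      public static void main(String[] args) {
        int h = 20, trials = 10000;
        double sumTime = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          int x = 0, y = h;
          long t = 0;
          while(y>0) {
            double r = Math.random();
            if(r<0.1) y++;            // up with probability 0.1
            else if(r<0.7) y--;       // down with probability 0.6
            else if(r<0.85) x++;      // right with probability 0.15
            else x--;                 // left with probability 0.15
            t++;
          }
          sumTime += t;
          sumX2 += (double) x*x;
        }
        System.out.println("tau = "+sumTime/trials+"  <x^2> = "+sumX2/trials);
      }
    }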
7.3 Modified Random Walks

So far we have considered random walks on one- and two-dimensional lattices where the walker has no “memory” of the previous step. What happens if the walker remembers the nature of the previous steps? What happens if there are multiple random walkers, with the condition that no double occupancy is allowed? We explore these and other variations of the simple random walk in this section. All these variations have applications to physical systems, but the applications are more difficult to understand than the models themselves.

The fall of a raindrop considered in Problem 7.9 is an example of a restricted random walk, that is, a walk in the presence of a boundary. In the following problem, we discuss in a more general context the effects of various types of restrictions or boundaries on random walks. Other examples of a restricted random walk are given in Problems 7.17 and 7.23.

Problem 7.10. Restricted random walks

(a) Consider a one-dimensional lattice with trap sites at x = 0 and x = L (L > 0). A walker begins at site $x_0$ (0 < $x_0$ < L) and takes unit steps to the left and right with equal probability. When the walker arrives at a trap site, it can no longer move. Do a Monte Carlo simulation and verify that the mean number of steps τ for the particle to be trapped (the mean first passage time) is given by

\[
\tau = (2D)^{-1} x_0 (L - x_0), \tag{7.14}
\]

where D is the self-diffusion coefficient in the absence of traps [see (7.29)]. A sketch of such a simulation is given after this problem.

(b) Random walk models in the presence of traps have had an important role in condensed matter physics. For example, consider the following idealized model of energy transport in solids. The solid is represented as a lattice with two types of sites: hosts and traps. An incident photon is absorbed at a host site and excites the host molecule or atom. The excitation energy or exciton is transferred at random to one of the host's nearest neighbors, and the original excited molecule returns to its ground state. In this way the exciton wanders through the lattice until it reaches a trap site, at which point a chemical reaction occurs. A simple version of this energy transport model is given by a one-dimensional lattice with traps placed on a periodic sublattice. Because the traps are placed at regular intervals, we can replace the random walk on an infinite lattice by a random walk on a circular ring. Consider a lattice of N host or nontrapping sites and one trap site. If a walker has an equal probability of starting from any host site and an equal probability of a step to each nearest neighbor site, what is the N dependence of the mean survival time τ (the mean number of steps taken before a trap site is reached)? Use the results of part (a) rather than doing a simulation.

(c) Consider a one-dimensional lattice with reflecting sites at x = −L and x = L. For example, if a walker reaches the reflecting site at x = L, it is reflected at the next step to x = L − 1. At t = 0 the walker starts at x = 0 and steps with equal probability to the left and right. Write a Monte Carlo program to determine $P_N(x)$, the probability that the walker is at site x after N steps. Compare the form of $P_N(x)$ with and without the presence of the reflecting sites. Can you distinguish the two probability distributions if N is of the order of L? At what value of N can you first distinguish the two distributions?
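The following minimal sketch measures τ for part (a) of Problem 7.10 (the class name is ours). For a walk with unit steps taken once per unit time, (7.29) gives D = 1/2, so (7.14) predicts $\tau = x_0(L - x_0)$:

    // Mean first passage time for a walker started at x0 on a line with
    // traps at x = 0 and x = L (Problem 7.10a sketch).
    public class TrappedWalker {
      public static void main(String[] args) {
        int L = 20, x0 = 10, trials = 10000;
        double sum = 0;
        for(int trial = 0; trial<trials; trial++) {
          int x = x0;
          long steps = 0;
          while(x>0 && x<L) {
            x += (Math.random()<0.5) ? 1 : -1; // unbiased unit step
            steps++;
          }
          sum += steps;
        }
        // compare with tau = x0*(L - x0)/(2D), with D = 1/2
        System.out.println("measured tau = "+sum/trials
            +", predicted "+x0*(L - x0));
      }
    }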
Problem 7.11. A persistent random walk

(a) In a persistent random walk, the transition or jump probability depends on the previous step. Consider a walk on a one-dimensional lattice, and suppose that step N − 1 has been made. Then step N is made in the same direction with probability α; a step in the opposite direction occurs with probability 1 − α. Write a program to do a Monte Carlo simulation of the persistent random walk in one dimension. Estimate $\langle x \rangle$, $\langle \Delta x^2 \rangle$, and $P_N(x)$. Note that it is necessary to specify both the initial position and an initial direction of the walker. What is the α = 1/2 limit of the persistent random walk?

(b) Consider α = 0.25 and α = 0.75 and determine $\langle \Delta x^2 \rangle$ for N = 8, 64, 256, and 512. Assume that $\langle \Delta x^2 \rangle \sim N^{2\nu}$ for large N, and estimate the value of ν from a log-log plot of $\langle \Delta x^2 \rangle$ versus N for large N. Does ν depend on α? If ν ≈ 1/2, determine the self-diffusion coefficient D for α = 0.25 and 0.75. In general, D is given by

\[
D = \frac{1}{2d} \lim_{N \to \infty} \frac{\langle \Delta x^2 \rangle}{N}, \tag{7.15}
\]

where d is the dimension of space. That is, D is given by the asymptotic behavior of the mean square displacement. (For the simple random walk considered in Section 7.2, $\langle \Delta x^2 \rangle \propto N$ for all N.) Give a physical argument for why $D(\alpha > 0.5)$ is greater, and $D(\alpha < 0.5)$ smaller, than $D(\alpha = 0.5)$.

(c) You might have expected that the persistent random walk yields a nonzero value for $\langle x \rangle$. Verify that $\langle x \rangle = 0$, and explain why this result is exact. How does the persistent random walk differ from the biased random walk for which $p \neq q$?

(d) A persistent random walk can be considered as an example of a multistate walk in which the state of the walk is defined by the last transition. The walker is in one of two states; at each step the probabilities of remaining in the same state or switching states are α and 1 − α, respectively. One of the earliest applications of a two-state random walk was to the study of diffusion in a chromatographic column. Suppose that a molecule in a chromatographic column can be either in a mobile phase (constant velocity v) or in a trapped phase (zero velocity). Instead of each step changing the position by ±1, the position at each step changes by +v or 0. A quantity of experimental interest is the probability $P_N(x)$ that a molecule has moved a distance x in N steps. Choose v = 1 and α = 0.75 and determine the qualitative behavior of $P_N(x)$.

Problem 7.12. Synchronized random walks

(a) Randomly place two walkers on a one-dimensional lattice of L sites, so that the walkers are not at the same site. At each time step randomly choose whether the walkers move to the left or to the right; both walkers move in the same direction. If a walker cannot move in the chosen direction because it is at a boundary, then this walker remains at the same site for this time step. A trial ends when both walkers are at the same site. Write a program to determine the mean time and the mean square fluctuations of the time for two walkers to reach the same site. This model is relevant to a method of doing cryptography using neural networks (see Rutter et al.).

(b) Change your program so that you use biased random walkers for which $p \neq q$. How does this change affect your results?

Problem 7.13. Random walk on a continuum

One of the first continuum models of a random walk was proposed by Rayleigh in 1919. In this model the length a of each step is a random variable and the direction of each step is uniformly random. In this case the variable of interest is R, the distance of the walker from the origin after N steps. The model is known as the freely jointed chain in polymer physics (see Section 7.7), in which case R is the end-to-end distance of the polymer. For simplicity, we first consider a walker in two dimensions with steps of equal (unit) length at a random angle.

(a) Write a Monte Carlo program to compute $\langle R \rangle$ and determine its dependence on N (see the sketch following this problem).

(b) Because R is a continuous variable, we need to compute $p_N(R)\,\Delta R$, the probability that R is between R and R + ∆R after N steps. The quantity $p_N(R)$ is the probability density. Because the area of the ring between R and R + ∆R is $\pi(R + \Delta R)^2 - \pi R^2 = 2\pi R\,\Delta R + \pi(\Delta R)^2 \approx 2\pi R\,\Delta R$, we see that $p_N(R)\,\Delta R$ is proportional to $R\,\Delta R$. Verify that for sufficiently large N, $p_N(R)\,\Delta R$ has the form

\[
p_N(R)\,\Delta R \propto 2\pi R\, \Delta R\, e^{-(R - \langle R \rangle)^2 / 2 \langle \Delta R^2 \rangle}, \tag{7.16}
\]

where $\langle \Delta R^2 \rangle = \langle R^2 \rangle - \langle R \rangle^2$.
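A minimal sketch for Problem 7.13a, assuming unit step lengths and a uniformly distributed step angle (the class name is ours):

    // Freely jointed chain in two dimensions: N unit steps at random
    // angles. Reports <R> for a few values of N.
    public class ContinuumWalk {
      public static void main(String[] args) {
        int trials = 10000;
        for(int N = 8; N<=64; N *= 2) {
          double sumR = 0;
          for(int trial = 0; trial<trials; trial++) {
            double x = 0, y = 0;
            for(int step = 0; step<N; step++) {
              double theta = 2*Math.PI*Math.random(); // uniform direction
              x += Math.cos(theta);
              y += Math.sin(theta);
            }
            sumR += Math.sqrt(x*x + y*y);
          }
          System.out.println("N = "+N+"  <R> = "+sumR/trials);
        }
      }
    }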
Problem 7.14. Random walks with steps of variable length

(a) Consider a random walk in one dimension with jumps of all lengths. The probability that the length of a single step is between a and a + ∆a is $f(a)\,\Delta a$, where $f(a)$ is the probability density. If the form of $f(a)$ is given by $f(a) = Ce^{-a}$ for a > 0, with the normalization condition $\int_0^\infty f(a)\,da = 1$, the code needed to generate step lengths according to this probability density is given by (see Section 11.5)

    stepLength = -Math.log(1 - Math.random());

Modify Walker and WalkerApp to simulate walks of variable length with this probability density. Consider N ≥ 100 and visualize the motion of the walker (a sketch of the walk itself follows this problem). Generate many walks of N steps and determine $p(x)\,\Delta x$, the probability that the displacement is between x and x + ∆x after N steps. Plot p(x) versus x and confirm that the form of p(x) is consistent with a Gaussian distribution. Note that the bin width ∆a is one of the input parameters.

(b) Assume that the probability density $f(a)$ is given by $f(a) = C/a^2$ for a ≥ 1. Determine the normalization constant C using the condition $C \int_1^\infty a^{-2}\,da = 1$. In this case we will learn in Section 11.5 that the statement

    stepLength = 1.0/(1.0 - Math.random());

generates values of a according to this form of $f(a)$. Do a Monte Carlo simulation as in part (a) and determine $p(x)\,\Delta x$. Is the form of p(x) a Gaussian? This type of random walk, for which $f(a)$ decreases as a power law $a^{-1-\alpha}$, is known as a Lévy flight for α ≤ 2.
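A sketch of the variable step length walk of part (a), using the exponential step lengths generated above; the direction of each step is chosen with equal probability (the class name is ours):

    // One-dimensional walk with exponentially distributed step lengths
    // (Problem 7.14a sketch).
    public class VariableStepWalk {
      public static void main(String[] args) {
        int N = 100, trials = 10000;
        double sumX = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          double x = 0;
          for(int step = 0; step<N; step++) {
            double stepLength = -Math.log(1 - Math.random()); // f(a) = e^{-a}
            x += (Math.random()<0.5) ? stepLength : -stepLength;
          }
          sumX += x;
          sumX2 += x*x;
        }
        double mean = sumX/trials;
        System.out.println("<x> = "+mean
            +"  <dx^2> = "+(sumX2/trials - mean*mean));
      }
    }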
Problem 7.15. Exploring the central limit theorem

Consider a continuous random variable x with probability density f(x). That is, $f(x)\,\Delta x$ is the probability that x has a value between x and x + ∆x. The mth moment of f(x) is defined as

\[
\langle x^m \rangle = \int x^m f(x)\,dx. \tag{7.17}
\]

The mean value $\langle x \rangle$ is given by (7.17) with m = 1. The variance $\sigma_x^2$ of f(x) is defined as

\[
\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2. \tag{7.18}
\]

Consider the sum $y_n$ corresponding to the average of n values of x:

\[
y = y_n = \frac{1}{n}(x_1 + x_2 + \cdots + x_n). \tag{7.19}
\]

Suppose that we make many measurements of y. We know that the values of y will not be identical but will be distributed according to a probability density p(y), where $p(y)\,\Delta y$ is the probability that the measured value of y is in the range y to y + ∆y. The main quantities of interest are $\langle y \rangle$, p(y), and an estimate of the probable variability of y in a series of measurements.

(a) Suppose that f(x) is uniform in the interval [−1, 1]. Calculate $\langle x \rangle$, $\langle x^2 \rangle$, and $\sigma_x$ analytically.

(b) Write a program to make a sufficient number of measurements of y and determine $\langle y \rangle$ and $p(y)\,\Delta y$. Use the HistogramFrame class to determine and plot $p(y)\,\Delta y$. Choose at least $10^4$ measurements of y for n = 4, 16, 32, and 64. What is the qualitative form of p(y)? Does the qualitative form of p(y) change as the number of measurements of y is increased for a given value of n? Does the qualitative form of p(y) change as n is increased?

(c) Each value of y can be considered to be a measurement. How much does the value of y vary (on the average) from one measurement to another? Make a rough estimate of this variability by comparing several measurements of y for a given value of n. Increase n by a factor of four and estimate the variability of y again. Does the variability from one measurement to another decrease (on the average) as n is increased?

(d) The sample variance $\tilde{\sigma}^2$ is given by

\[
\tilde{\sigma}^2 = \frac{\sum_{i=1}^{n} [y_i - \langle y \rangle]^2}{n - 1}. \tag{7.20}
\]

The reason for the factor of n − 1 rather than n in (7.20) is that to compute $\tilde{\sigma}^2$, we need to use the n values of x to compute the mean y, and thus, loosely speaking, we have only n − 1 independent values of x remaining to calculate $\tilde{\sigma}^2$. Show that if $n \gg 1$, then $\tilde{\sigma}^2 \approx \sigma_y^2$, where $\sigma_y^2$ is given by

\[
\sigma_y^2 = \langle y^2 \rangle - \langle y \rangle^2. \tag{7.21}
\]

(e) The quantity $\tilde{\sigma}$ is known as the standard deviation of the mean. That is, $\tilde{\sigma}$ gives a measure of how much variation we expect to find if we make repeated measurements of y. How does the value of $\tilde{\sigma}$ compare with your estimate of the variability in part (b)?

(f) What is the qualitative shape of the probability density p(y) that you obtained in part (b)? What is the order of magnitude of the width of the probability?

(g) Verify from your results that $\tilde{\sigma} \approx \sigma_y \approx \sigma_x/\sqrt{n-1} \approx \sigma_x/\sqrt{n}$.

(h) To test the generality of your results, consider the exponential probability density

\[
f(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & x < 0. \end{cases} \tag{7.22}
\]

Calculate $\langle x \rangle$ and $\sigma_x$ analytically. Modify your Monte Carlo program and estimate $\langle y \rangle$, $\tilde{\sigma}$, $\sigma_y$, and p(y). How are $\tilde{\sigma}$, $\sigma_y$, and $\sigma_x$ related for a given value of n? Plot p(y) and discuss its qualitative form and its dependence on n and on the number of measurements of y.

Problem 7.15 illustrates the central limit theorem, which states that the probability distribution of a sum of random variables, the random variable y, is a Gaussian centered at $\langle y \rangle$ with a standard deviation approximately given by $1/\sqrt{n}$ times the standard deviation of f(x). The requirements are that f(x) has finite first and second moments, that the measurements of y are statistically independent, and that n is large. What is the relation of the central limit theorem to the calculations of the probability distribution in the random walk models that we have considered?

Problem 7.16. Generation of the Gaussian distribution

Consider the sum

\[
y = \sum_{i=1}^{12} r_i, \tag{7.23}
\]

where $r_i$ is a uniform random number in the unit interval. Make many measurements of y and show that the probability distribution of y approximates the Gaussian distribution with mean value 6 and variance 1 (a sketch follows). What is the relation of this result to the central limit theorem? Discuss how to use this result to generate a Gaussian distribution with arbitrary mean and variance. This way of generating a Gaussian distribution is particularly useful when a “quick and dirty” approximation is appropriate. A better method for generating a sequence of random numbers distributed according to the Gaussian distribution is discussed in Section 11.5.
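A minimal sketch of Problem 7.16 (the class name is ours). Each uniform random number has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1:

    // Approximate Gaussian variates as the sum of 12 uniform random
    // numbers (Problem 7.16 sketch).
    public class TwelveUniforms {
      public static void main(String[] args) {
        int measurements = 100000;
        double sum = 0, sum2 = 0;
        for(int i = 0; i<measurements; i++) {
          double y = 0;
          for(int j = 0; j<12; j++) {
            y += Math.random();
          }
          sum += y;
          sum2 += y*y;
        }
        double mean = sum/measurements;
        double variance = sum2/measurements - mean*mean;
        System.out.println("mean = "+mean+"  variance = "+variance);
      }
    }

A Gaussian with arbitrary mean μ and standard deviation σ can then be obtained from the shifted and scaled variable μ + σ(y − 6).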
Many of the problems we have considered have revealed the slow convergence of Monte Carlo simulations and the difficulty of obtaining quantitative results for asymptotic quantities. We conclude this section with a cautionary note and consider a “simple” problem for which straightforward Monte Carlo methods give misleading asymptotic results.

∗Problem 7.17. Random walk on lattices containing random traps

(a) In Problem 7.10 we considered the mean survival time of a one-dimensional random walker in the presence of a periodic distribution of traps. Now suppose that the trap sites are distributed at random on a one-dimensional lattice with density ρ = N/L. For example, if ρ = 0.01, the probability that a site is a trap site is 1%. (A site is a trap site if r ≤ ρ, where, as usual, r is uniformly distributed in the interval 0 ≤ r < 1.) If a walker is placed at random at any nontrapping site, determine its mean survival time τ, that is, the mean number of steps before a trap site is reached. Assume that the walker has an equal probability of moving to a nearest neighbor site at each step, and use periodic boundary conditions; that is, the lattice sites are located on a ring. The major complication is that it is necessary to perform three averages: over the distribution of traps, over the origin of the walker, and over the different walks for a given trap distribution and origin. Choose reasonable values for the number of trials associated with each average and do a Monte Carlo simulation to estimate the mean survival time τ. If τ exhibits a power law dependence on ρ, for example $\tau \approx \tau_0\, \rho^{-z}$, estimate the exponent z.

(b) A seemingly straightforward extension of part (a) is to estimate the survival probability $S_N$ after N steps. Choose ρ = 0.5 and do a Monte Carlo simulation of $S_N$ for N as large as possible. (Published results are for $N = 3 \times 10^4$, on lattices large enough that a walker doesn't reach the boundary, with about 54,000 trials.) Assume that the asymptotic form of $S_N$ for large N is given by

\[
S_N \sim e^{-bN^{\alpha}}, \tag{7.24}
\]

where the exponent α is the quantity of interest, and b is a constant that depends on ρ. Are your results consistent with this form? Is it possible to make a meaningful estimate of the exponent α?

(c) It has been proven that the asymptotic N dependence of $S_N$ has the form (7.24) with α = 1/3. Are your Monte Carlo results consistent with this value of α? The object of part (b) is to convince you that it is not possible to use simple Monte Carlo methods directly to obtain the correct asymptotic behavior of $S_N$. The difficulty is that we are trying to estimate $S_N$ in the asymptotic region where $S_N$ is very small, and the small number of trials in this region prevents us from obtaining meaningful results.

Figure 7.4: Example of the exact enumeration of walks on a given configuration of traps. The filled and empty squares denote regular and trap sites, respectively. At step N = 0, a walker is placed at each regular site. The numbers at each site i represent the number of walkers $w_i$. Periodic boundary conditions are used. The initial number of walkers in this example is $w_0 = 10$. The mean survival probability at step N = 1 and N = 2 is found to be 0.6 and 0.475, respectively.

(d) One way to reduce the number of required averages is to determine exactly the probability that the walker is at site i after N steps for a given distribution of trap sites. The method is illustrated in Figure 7.4. The first line represents a given configuration of traps distributed randomly on a one-dimensional lattice. One walker is placed at each nontrap site; trap sites are assigned the value 0. Because each walker moves with probability 1/2 to each neighbor, the number of walkers $w_i(N+1)$ on site i at step N + 1 is given by

\[
w_i(N+1) = \tfrac{1}{2}\left[w_{i+1}(N) + w_{i-1}(N)\right]. \tag{7.25}
\]

(Compare the relation (7.25) to the relation that you found in Problem 7.5d.) The survival probability $S_N$ after N steps for a given configuration of traps is given exactly by

\[
S_N = \frac{1}{w_0} \sum_i w_i(N), \tag{7.26}
\]

where $w_0$ is the initial number of walkers and the sum is over all sites in the lattice. Explain the relation (7.26) and write a program that computes $S_N$ using (7.25) and (7.26); one possible implementation is sketched below. Then obtain $S_N$ by averaging over several configurations of traps. Choose ρ = 0.5 and determine $S_N$ for N = 32, 64, 128, 512, and 1024. Choose periodic boundary conditions and as large a lattice as possible. How well can you estimate the exponent α? For comparison, Havlin et al. consider a lattice of L = 50,000 and values of N up to $10^7$.
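A sketch of the exact enumeration for a single trap configuration (the class name and output schedule are our choices; the average over trap configurations is omitted for brevity):

    // Exact survival probability S_N for one random configuration of
    // traps, using the recursion (7.25) and the sum (7.26).
    public class TrapEnumeration {
      public static void main(String[] args) {
        int L = 1000, tmax = 1024;
        double rho = 0.5;
        boolean[] trap = new boolean[L];
        double[] w = new double[L];
        double w0 = 0;
        for(int i = 0; i<L; i++) {
          trap[i] = Math.random()<rho;  // site i is a trap with probability rho
          w[i] = trap[i] ? 0 : 1;       // one walker on every nontrap site
          w0 += w[i];
        }
        for(int N = 1; N<=tmax; N++) {
          double[] wNew = new double[L];
          double sum = 0;
          for(int i = 0; i<L; i++) {
            if(!trap[i]) {              // walkers arriving at traps are removed
              wNew[i] = 0.5*(w[(i+1)%L] + w[(i-1+L)%L]); // periodic boundaries
              sum += wNew[i];
            }
          }
          w = wNew;
          if((N & (N-1))==0) {          // print when N is a power of two
            System.out.println("N = "+N+"  S_N = "+sum/w0);
          }
        }
      }
    }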
One reason that random walks are very useful in simulating many physical processes is that they are closely related to solutions of the diffusion equation. The one-dimensional diffusion equation can be written as

\[
\frac{\partial P(x,t)}{\partial t} = D \frac{\partial^2 P(x,t)}{\partial x^2}, \tag{7.27}
\]

where D is the self-diffusion coefficient and $P(x,t)\,\Delta x$ is the probability of a particle being in the interval between x and x + ∆x at time t. In a typical application P(x,t) might represent the concentration of ink molecules diffusing in a fluid. In three dimensions the second derivative $\partial^2/\partial x^2$ is replaced by the Laplacian $\nabla^2$. In Appendix 7B we show that the solution to the diffusion equation with the boundary condition P(x = ±∞, t) = 0 yields

\[
\langle x(t) \rangle = 0 \tag{7.28}
\]

and

\[
\langle x^2(t) \rangle = 2Dt \quad \text{(one dimension)}. \tag{7.29}
\]

If we compare the form of (7.29) with (7.10), we see that the random walk on a one-dimensional lattice and the diffusion equation give the same time dependence if we identify t with N∆t and D with $a^2/2\Delta t$.

The relation of discrete random walks to the diffusion equation is an example of how we can approach many problems in several ways. The traditional way to treat diffusion is to formulate the problem as a partial differential equation as in (7.27). The usual method for solving (7.27) numerically is known as the Crank–Nicolson method (see Press et al.). One difficulty with this approach is the treatment of complicated boundary conditions. An alternative is to formulate the problem as a random walk on a lattice, for which it is straightforward to incorporate various boundary conditions. We will consider random walks in many contexts (see, for example, Section 10.5 and Chapter 16).

7.4 The Poisson Distribution and Nuclear Decay

As we have seen, we can often change variable names and consider a seemingly different physical problem. Our goal in this section is to discuss the decay of unstable nuclei, but we first discuss a conceptually easier problem related to throwing darts at random. Related physical problems are the distribution of stars in the sky and the distribution of photons on a photographic plate.

Suppose we randomly throw N = 100 darts at a board that has been divided into M = 1000 equal size regions. The probability that a dart hits a given region or cell in any one throw is p = 1/M. If we count the number of darts in the different regions, we would find that most cells are empty, some cells have one dart, and other cells have more than one dart. What is the probability P(n) that a given cell has n darts?

Problem 7.18. Throwing darts

Write a program that simulates the throwing of N darts at random into M cells in a dart board (one trial is sketched below). Throwing a dart at random at the board is equivalent to choosing an integer at random between 1 and M. Determine H(n), the number of cells with n darts. Average H(n) over many trials and then compute the probability distribution

\[
P(n) = \frac{H(n)}{M}. \tag{7.30}
\]

As an example, choose N = 50 and M = 500. Choose the number of trials to be sufficiently large so that you can determine the qualitative form of P(n). What is $\langle n \rangle$?
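A single trial of the dart-throwing experiment might look like the following sketch (the class name is ours); P(n) requires averaging H(n) over many such trials:

    // One trial of throwing N darts into M cells (Problem 7.18 sketch).
    // H[n] counts the cells containing exactly n darts.
    public class Darts {
      public static void main(String[] args) {
        int N = 50, M = 500;
        int[] cell = new int[M];
        for(int dart = 0; dart<N; dart++) {
          cell[(int) (M*Math.random())]++;  // random cell index 0 to M-1
        }
        int[] H = new int[N+1];
        for(int i = 0; i<M; i++) {
          H[cell[i]]++;
        }
        for(int n = 0; n<=5; n++) {
          System.out.println("H("+n+") = "+H[n]
              +"  P("+n+") = "+(double) H[n]/M);
        }
      }
    }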
In this case the probability p that a dart lands in a given cell is much less than unity. The conditions $N \gg 1$ and $p \ll 1$ with $\langle n \rangle = Np$ fixed and the independence of the events (the presence of a dart in a particular cell) satisfy the requirements for a Poisson distribution. The form of the Poisson distribution is

\[
P(n) = \frac{\langle n \rangle^n}{n!}\, e^{-\langle n \rangle}, \tag{7.31}
\]

where n is the number of darts in a given cell and $\langle n \rangle$ is the mean number, $\langle n \rangle = \sum_{n=0}^{N} n P(n)$. Because $N \gg 1$, we can take the upper limit of this sum to be ∞ when it is convenient.

Problem 7.19. Darts and the Poisson distribution

(a) Write a program to compute $\sum_{n=0}^{\infty} P(n)$, $\sum_{n=0}^{\infty} n P(n)$, and $\sum_{n=0}^{\infty} n^2 P(n)$ using the form (7.31) for P(n) and reasonable values of p and N. Verify that P(n) in (7.31) is normalized. What is the value of $\sigma_n^2 = \langle n^2 \rangle - \langle n \rangle^2$ for the Poisson distribution?

(b) Modify the program that you developed in Problem 7.18 to compute $\langle n \rangle$ as well as P(n). Choose N = 50 and M = 1000. How do your computed values of P(n) compare to the Poisson distribution in (7.31) using your measured value of $\langle n \rangle$ as input? If time permits, use larger values of N and M.

(c) Choose N = 50 and M = 100 and redo part (b). Are your results consistent with a Poisson distribution? What happens if M = N = 50?

Now that we are more familiar with the Poisson distribution, we consider the decay of radioactive nuclei. We know that a collection of radioactive nuclei will decay; however, we cannot know a priori which nucleus will decay next. If all nuclei of a particular type are identical, why do they not all decay at the same time? The answer is based on the uncertainty inherent in the quantum description of matter at the microscopic level. In the following, we will see that a simple model of the decay process leads to exponential decay. This approach complements the continuum approach discussed in Section 3.9.

Because each nucleus is identical, we assume that during any time interval ∆t, each nucleus has the same probability per unit time p of decaying. The basic algorithm is simple – choose an unstable nucleus and generate a random number r uniformly distributed in the unit interval 0 ≤ r < 1. If r ≤ p, the unstable nucleus decays; otherwise, it does not. Each unstable nucleus is tested once during each time interval. Note that for a system of unstable nuclei, there are many events that can happen during each time interval; for example, 0, 1, 2, ..., n nuclei can decay. Once a nucleus decays, it is no longer in the group of unstable nuclei that is tested at each time interval. Class Nuclei in Listing 7.5 implements the nuclear decay algorithm.

Listing 7.5: The Nuclei class.

    package org.opensourcephysics.sip.ch07;

    public class Nuclei {
      int n[];    // accumulated data on number of unstable nuclei, index is time
      int tmax;   // maximum time to record data
      int n0;     // initial number of unstable nuclei
      double p;   // decay probability

      public void initialize() {
        n = new int[tmax+1];
      }

      public void step() {
        n[0] += n0;
        int nUnstable = n0;
        for(int t = 0; t<tmax; t++) {
          // the loop body was incomplete in our copy of the text; the
          // following is reconstructed from the algorithm described above
          int nDecay = 0;               // nuclei that decay during this interval
          for(int i = 0; i<nUnstable; i++) {
            if(Math.random()<p) {       // test each unstable nucleus once
              nDecay++;
            }
          }
          nUnstable -= nDecay;
          n[t+1] += nUnstable;          // accumulate data at time t+1
        }
      }
    }

Figure 7.5: Plot of $\ln \langle \Delta x^2 \rangle$ versus $\ln N$ for the data listed in Table 7.2. The straight line y = 1.02x + 0.83 through the points is found by minimizing the sum (7.34).

where

\[
\Delta^2 = \frac{1}{n-2} \sum_{i=1}^{n} d_i^2, \tag{7.43}
\]

and $d_i$ is given by (7.33). Because there are n data points, we might have guessed that n rather than n − 2 would be present in the denominator of (7.43). The reason for the factor of n − 2 is related to the fact that to determine ∆, we first need to calculate the two quantities m and b, leaving only n − 2 independent degrees of freedom. To see that the n − 2 factor is reasonable, consider the special case of n = 2. In this case we can find a line that passes exactly through the two data points, but we cannot deduce anything about the reliability of the set of measurements because the fit is exact.
If we use (7.43), we see that both the numerator and the denominator would be zero, and hence ∆ would be undetermined. If a factor of n rather than n − 2 appeared in (7.43), we would conclude that $\Delta^2 = 0/2 = 0$, an absurd conclusion. Usually $n \gg 1$, and the difference between n and n − 2 is negligible.

For our example, ∆ = 0.03, $\sigma_b = 0.07$, and $\sigma_m = 0.02$. The uncertainties δm and δν are related by 2δν = δm. Because δm = $\sigma_m$, we conclude that our best estimate for ν is ν = 0.51 ± 0.01.

If the values of $y_i$ have different uncertainties $\sigma_i$, then the data points are weighted by the quantity $w_i = 1/\sigma_i^2$. In this case it is reasonable to minimize the quantity

\[
\chi^2 = \sum_{i=1}^{n} w_i (y_i - m x_i - b)^2. \tag{7.44}
\]

The resulting expressions in (7.39) for m and b are unchanged if we generalize the definition of the averages to be

\[
\langle f \rangle = \frac{1}{n \langle w \rangle} \sum_{i=1}^{n} w_i f_i, \tag{7.45}
\]

where

\[
\langle w \rangle = \frac{1}{n} \sum_{i=1}^{n} w_i. \tag{7.46}
\]

Problem 7.27. Example of least squares fit

(a) Write a program to find the least squares fit for a set of data (see the sketch following this problem). As a check on your program, compute the most probable values of m and b for the data shown in Table 7.2.

(b) Modify the random walk program so that steps of length 1 and 2 are taken with equal probability. Use at least 10,000 trials and do a least squares fit to $\langle \Delta x^2 \rangle$ as done in the text. Is your most probable estimate for ν closer to ν = 1/2?

For simple random walk problems the relation $\langle \Delta x^2 \rangle = a N^{2\nu}$ holds for all N. However, in many random walk problems a power law relation between $\langle \Delta x^2 \rangle$ and N holds only asymptotically for large N, and hence we should use only the larger values of N to estimate the slope. Also, because we are finding the best fit for the logarithm of the independent variable N, we need to give equal weight to all intervals of ln N. In the above example, we used N = 8, 16, 32, and 64, so that the values of ln N are equally spaced.
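A minimal sketch for part (a), using the standard unweighted least squares formulas $m = (\langle xy \rangle - \langle x \rangle \langle y \rangle)/(\langle x^2 \rangle - \langle x \rangle^2)$ and $b = \langle y \rangle - m \langle x \rangle$; the data arrays below are placeholders to be replaced by the Table 7.2 values:

    // Least squares fit of y = m*x + b (Problem 7.27a sketch).
    public class LeastSquares {
      public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};          // placeholder: use the Table 7.2 data
        double[] y = {1.1, 1.9, 3.1, 3.9};  // placeholder data for illustration
        int n = x.length;
        double xBar = 0, yBar = 0, xyBar = 0, x2Bar = 0;
        for(int i = 0; i<n; i++) {
          xBar += x[i]/n;                   // <x>
          yBar += y[i]/n;                   // <y>
          xyBar += x[i]*y[i]/n;             // <xy>
          x2Bar += x[i]*x[i]/n;             // <x^2>
        }
        double m = (xyBar - xBar*yBar)/(x2Bar - xBar*xBar);
        double b = yBar - m*xBar;
        System.out.println("m = "+m+"  b = "+b);
      }
    }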
7.7 Applications to Polymers

Random walk models play an important role in polymer physics (cf. de Gennes). A polymer consists of N repeat units (monomers) with $N \gg 1$ ($N \sim 10^3$–$10^5$). For example, polyethylene can be represented as ···–CH₂–CH₂–CH₂–···. The detailed structure of the polymer is important for many practical applications. For example, if we wish to improve the fabrication of rubber, a good understanding of the local motions of the monomers in the rubber chain is essential. However, if we are interested in the global properties of the polymer, the details of the chain structure can be ignored.

Let us consider a familiar example of a polymer chain in a good solvent: a noodle in warm water. A short time after we place a noodle in warm water, the noodle becomes flexible, and it neither collapses into a little ball nor becomes fully stretched. Instead, it adopts a random structure as shown schematically in Figure 7.6. If we do not add too many noodles, we can say that the noodles behave as a dilute solution of polymer chains in a good solvent. The dilute nature of the solution implies that we can ignore entanglement effects of the noodles and consider each noodle individually. The presence of a good solvent implies that the polymers can move freely and adopt many different configurations.

Figure 7.6: (a) Schematic illustration of a linear polymer in a good solvent. (b) Example of the corresponding self-avoiding walk on a square lattice.

A fundamental geometrical property that characterizes a polymer in a good solvent is the mean square end-to-end distance $\langle R_N^2 \rangle$, where N is the number of monomers. (For simplicity, we will frequently write $\langle R^2 \rangle$ in the following.) For a dilute solution of polymer chains in a good solvent, it is known that the asymptotic dependence of $\langle R^2 \rangle$ is given by (7.13) with ν ≈ 0.5874 in three dimensions. If we were to ignore the interactions of the monomers, the simple random walk model would yield ν = 1/2, independent of the dimension and symmetry of the lattice. Because this result for ν does not agree with experiment, we know that we are overlooking an important physical feature of polymers.

We now discuss a random walk that incorporates the global features of dilute linear polymers in solution. We have already introduced a model of a polymer chain consisting of straight line segments of the same size joined together at random angles (see Problem 7.13). A further idealization is to place the polymer chain on a lattice (see Figure 7.6). A more realistic model of linear polymers accounts for their most important physical feature; that is, two monomers cannot occupy the same spatial position. This constraint is known as the excluded volume condition, which is ignored in a simple random walk. A well-known lattice model for a linear polymer chain that incorporates this constraint is the self-avoiding walk (SAW). This model consists of the set of all N-step walks starting from the origin subject to the global constraint that no lattice site can be visited more than once in each walk; this constraint accounts for the excluded volume condition.

Self-avoiding walks have many applications, such as the physics of magnetic materials and the study of phase transitions, and they are of interest as purely mathematical objects. Many of the obvious questions have resisted rigorous analysis, and exact enumeration and Monte Carlo simulation have played an important role in our current understanding. The result for ν in two dimensions for the self-avoiding walk is known to be exactly ν = 3/4. The proportionality constant in (7.13) depends on the structure of the monomers and on the solvent. In contrast, the exponent ν is independent of these details and depends only on the spatial dimension.

Figure 7.7: Examples of self-avoiding walks on a square lattice. The origin is denoted by a filled circle. (a) An N = 3 walk. The fourth step shown is forbidden. (b) An N = 7 walk that leads to a self-intersection at the next step; the weight of the N = 8 walk is zero. (c) Two examples of the weights of walks in the enrichment method.

We consider Monte Carlo simulations of the self-avoiding walk in two dimensions in Problems 7.28–7.30. Another algorithm for the self-avoiding walk is considered in Project 7.41.

Problem 7.28. The two-dimensional self-avoiding walk

Consider the self-avoiding walk on the square lattice. Choose an arbitrary site as the origin and assume that the first step is “up.” The walks generated by the three other possible initial directions differ only by a rotation of the whole lattice and do not have to be considered explicitly. The second step can be in three rather than four possible directions because of the constraint that the walk cannot return to the origin. To obtain unbiased results, we generate a random number to choose one of the three directions. Successive steps are generated in the same way. Unfortunately, the walk will very likely not continue indefinitely.
To obtain unbiased results, we must choose at random one of the three steps, even though one or more of these steps might lead to a self-intersection. If the next step does lead to a self-intersection, the walk must be terminated to keep the statistics unbiased. An example of a three-step walk is shown in Figure 7.7a. The next step leads to a self-intersection and violates the constraint. In this case we must start a new walk at the origin.

(a) Write a program that implements this algorithm and record the fraction f(N) of successful attempts at constructing polymer chains with N total monomers. Represent the lattice as an array so that you can record the sites that have already been visited. What is the qualitative dependence of f(N) on N? What is the maximum value of N that you can reasonably consider?

(b) Determine the mean square end-to-end distance $\langle R_N^2 \rangle$ for values of N that you can reasonably consider with this sampling method.

The disadvantage of the straightforward sampling method in Problem 7.28 is that it becomes very inefficient for long chains; that is, the fraction of successful attempts decreases exponentially. To overcome this attrition, several “enrichment” techniques have been developed. We first discuss a relatively simple algorithm proposed by Rosenbluth and Rosenbluth in which each walk of N steps is associated with a weighting function w(N). Because the first step to the north is always possible, we have w(1) = 1. In order that all allowed configurations of a given N are counted equally, the weights w(N) for N > 1 are determined according to the following possibilities:

1. All three possible steps violate the self-intersection constraint (see Figure 7.7b). The walk is terminated with a weight w(N) = 0, and a new walk is generated at the origin.

2. All three steps are possible, and w(N) = w(N − 1).

3. Only m steps are possible with 1 ≤ m < 3 (see Figure 7.7c). In this case w(N) = (m/3)w(N − 1), and one of the m possible steps is chosen at random.

The desired unbiased value of $\langle R^2 \rangle$ is obtained by weighting $R_i^2$, the value of $R^2$ obtained in the ith trial, by the value of $w_i(N)$, the weight found for this trial. Hence, we write

\[
\langle R^2 \rangle = \frac{\sum_i w_i(N) R_i^2}{\sum_i w_i(N)}, \tag{7.47}
\]

where the sum is over all trials.

Problem 7.29. Rosenbluth and Rosenbluth enrichment method

Incorporate the Rosenbluth method into your Monte Carlo program and compute $\langle R^2 \rangle$ for N = 4, 8, 16, and 32 (a sketch of one implementation follows). Estimate the exponent ν from a log-log plot of $\langle R^2 \rangle$ versus N. Can you distinguish your estimate for ν from its random walk value ν = 1/2?
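One possible implementation of the Rosenbluth method is sketched below (the class name and lattice bookkeeping are our choices). The first step is always taken “up” with w(1) = 1, and each later step multiplies the weight by m/3 as described in the text:

    // One Rosenbluth-weighted self-avoiding walk on a square lattice.
    // Averaging w_i*R_i^2 and w_i over trials gives (7.47).
    import java.util.Random;

    public class RosenbluthWalk {
      static final int[] DX = {1, -1, 0, 0};
      static final int[] DY = {0, 0, 1, -1};
      static Random rnd = new Random();

      // returns {w(N), R^2} for one attempted walk of N steps
      static double[] oneWalk(int N) {
        int size = 2*N + 3;                 // lattice large enough for any N-step walk
        boolean[][] visited = new boolean[size][size];
        int x = N + 1, y = N + 1;           // start at the center
        visited[x][y] = true;
        y++;                                // the first step is always "up", w(1) = 1
        visited[x][y] = true;
        double w = 1;
        for(int step = 1; step<N; step++) {
          int m = 0;                        // number of allowed directions (at most 3)
          int[] allowed = new int[3];
          for(int dir = 0; dir<4; dir++) {
            if(!visited[x + DX[dir]][y + DY[dir]]) {
              allowed[m++] = dir;
            }
          }
          if(m==0) {
            return new double[] {0, 0};     // trapped: weight zero
          }
          w *= m/3.0;                       // the (m/3) factor of the text
          int dir = allowed[rnd.nextInt(m)];
          x += DX[dir];
          y += DY[dir];
          visited[x][y] = true;
        }
        double rx = x - (N + 1), ry = y - (N + 1);
        return new double[] {w, rx*rx + ry*ry};
      }

      public static void main(String[] args) {
        int N = 16, trials = 10000;
        double sumW = 0, sumWR2 = 0;
        for(int t = 0; t<trials; t++) {
          double[] result = oneWalk(N);
          sumW += result[0];
          sumWR2 += result[0]*result[1];
        }
        System.out.println("<R^2> = "+sumWR2/sumW+" for N = "+N);
      }
    }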
The Rosenbluth and Rosenbluth procedure is not particularly efficient because many walks still terminate, and thus we do not obtain many walkers for large N. Grassberger improved this algorithm by increasing the population of walkers with high weights and reducing the population of walkers with low weights. The idea is that if w(N) for a given trial is above a certain threshold, we add a new walker and give the new and old walkers half of the original weight. If w(N) is below a certain threshold, then we eliminate half of the walkers with weights below this threshold (for example, every second walker) and double the weights of the remaining half. It is a good idea to adjust the thresholds as the simulation runs in order to maintain a relatively constant number of walkers.

More recently, Prellberg and Krawczyk further improved the Rosenbluth and Rosenbluth enrichment method so that there is no need to provide a threshold value. After each step the average weight $\langle w(N) \rangle$ of the walkers is computed for a given trial, and the ratio $r = w(N)/\langle w(N) \rangle$ is used to determine whether to add walkers (enrichment) or eliminate walkers (pruning). If r > 1, then c = min(r, m) copies of the walker are made, each with weight w(N)/c. If r < 1, then this walker is removed with probability 1 − r. This algorithm leads to an approximately constant number of walkers and is related to the Wang–Landau method, which we will discuss in Problem 15.30.

Another enrichment algorithm is the “reptation” method (see Wall and Mandel). For simplicity, consider a model polymer chain in which all bond angles are ±90°. As an example of this model, the five independent N = 5 polymer chains are shown in Figure 7.8. (Other chains differ only by a rotation or a reflection.) The reptation method can be stated as follows:

1. Choose a chain at random and remove the tail link.

2. Attempt to add a link to the head of the chain. There is a maximum of two directions in which the new head link can be added.

3. If the attempt violates the self-intersection constraint, return to the original chain and interchange the head and tail. Include the chain in the statistical sample.

Figure 7.8: The five independent possible walks of N = 5 steps on a square lattice with ±90° bond angles. The tail and head of each walk are denoted by a circle and arrow, respectively.

The above steps are repeated many times to obtain an estimate of $\langle R^2 \rangle$. As an example of the reptation method, consider chain a of Figure 7.8. A new link can be added in two directions (see Figure 7.9a), so that on the average we find $a \to \frac{1}{2}c + \frac{1}{2}d$. In contrast, a link can be added to chain b in only one direction, and we obtain $b \to \frac{1}{2}e + \frac{1}{2}b$, where the tail and head of chain b have been interchanged (see Figure 7.9b). Confirm that $c \to \frac{1}{2}e + \frac{1}{2}a$, $d \to \frac{1}{2}c + \frac{1}{2}d$, and $e \to \frac{1}{2}a + \frac{1}{2}b$, and that all five chains are equally probable. That is, the transformations in the reptation method preserve the proper statistical weights of the chains without attrition. There is just one problem: unless we begin with a double-ended “cul-de-sac” configuration, such as shown in Figure 7.10, we will never obtain such a configuration using the above transformations. Hence, the reptation method introduces a small statistical bias, and the calculated mean end-to-end distance will be slightly larger than if all configurations were considered. However, the probability of such trapped configurations is very small, and the bias can be neglected for most purposes.

∗Problem 7.30. The reptation method

(a) Adopt the ±90° bond angle restriction and calculate by hand the exact value of $\langle R^2 \rangle$ for N = 5. Then write a Monte Carlo program that implements the reptation method. Generate one walk of N = 5 and use the reptation method to generate a statistical sample of chains. As a check on your program, compute $\langle R^2 \rangle$ for N = 5 and compare your result with the exact result. Then extend your Monte Carlo computations of $\langle R^2 \rangle$ to larger N.

(b) Modify the reptation model so that the bond angle can also be 180°. This modification leads to a maximum of three directions for a new bond. Compare your results with those from part (a).
Figure 7.9: The possible transformations of chains a and b. One of the two possible transformations of chain b violates the self-intersection restriction, and the head and tail are interchanged.

Figure 7.10: Example of a double-cul-de-sac configuration for the self-avoiding walk that cannot be obtained by the reptation method.

In principle, the dynamics of a polymer chain undergoing collisions with solvent molecules can be simulated by using a molecular dynamics method. However, in practice, only relatively small chains can be simulated in this way. An alternative approach is to use a Monte Carlo model that simplifies the effect of the random collisions of the solvent molecules with the atoms of the chain. Most of these models (cf. Verdier and Stockmayer) consider the chain to be composed of beads connected by bonds and restrict the positions of the beads to the sites of a lattice. For simplicity, we assume that the bond angles can be either ±90° or 180°. The idea is to begin with an allowed configuration of N beads (N − 1 bonds). A possible starting configuration can be generated by taking successive steps in the positive y and positive x directions. The dynamics of the Verdier–Stockmayer algorithm is summarized by the following steps:

1. Select at random a bead (occupied site) on the polymer chain. If the bead is not an end site, then the bead can move to a nearest neighbor site of another bead if this site is empty and if the new angle between adjacent bonds is either ±90° or 180°. For example, bead 4 in Figure 7.11 can move to position 4′, while bead 3 cannot move if selected. That is, a selected bead can move to a diagonally opposite unoccupied site only if the two bonds to which it is attached are mutually perpendicular.

2. If the selected bead is an end site, move it to one of two (maximum) possible unoccupied sites, so that the bond to which it is connected changes its orientation by ±90° (see Figure 7.11).

3. If the selected bead cannot move, retain the previous configuration.

Figure 7.11: Examples of possible moves of the simple polymer dynamics model considered in Problem 7.31. For this configuration, beads 2, 3, 5, and 6 cannot move, while beads 1, 4, 7, and 8 can move to the positions shown if they are selected. Only one bead can move at a time. This figure is adapted from the article by Verdier and Stockmayer.

The physical quantities of interest include $\langle R^2 \rangle$ and the mean square displacement of the center of mass of the chain, $\langle r^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 + \langle y^2 \rangle - \langle y \rangle^2$, where x and y are the coordinates of the center of mass. The unit of time is the number of Monte Carlo steps per bead; in one Monte Carlo step per bead, each bead has one chance on the average to move to a different site. Another efficient method for simulating the dynamics of a polymer chain is the bond fluctuation model (see Carmesin and Kremer).

Problem 7.31. The dynamics of polymers in a dilute solution

(a) Consider a two-dimensional lattice and compute $\langle R^2 \rangle$ and $\langle r^2 \rangle$ for various values of N. How do these quantities depend on N? (The first published results for three dimensions were limited to 32 Monte Carlo steps per bead for N = 8, 16, and 32 and only 8 Monte Carlo steps per bead for N = 64.) Also compute the probability density P(R) that the end-to-end distance is R. How does this probability compare to a Gaussian distribution?

(b)∗ Two configurations are strongly correlated if they differ by the position of only one bead. Hence, it would be a waste of computer time to measure the end-to-end distance and the position of the center of mass after every single move.
Ideally, we wish to compute these quantities for configurations that are approximately statistically independent. Because we do not know a priori the mean number of Monte Carlo steps per bead needed to obtain configurations that are statistically independent, it is a good idea to estimate this time in our preliminary calculations. The correlation time τ is the time needed to obtain statistically independent configurations and can be obtained by computing the equilibrium averaged time-autocorrelation function for a chain of fixed N:

\[
C(t) = \frac{\langle R^2(t' + t)\, R^2(t') \rangle - \langle R^2 \rangle^2}{\langle R^4 \rangle - \langle R^2 \rangle^2}. \tag{7.48}
\]

C(t) is defined so that C(t = 0) = 1 and C(t) = 0 if the configurations are not correlated. Because the configurations will become uncorrelated if the time t between the configurations is sufficiently long, we expect that C(t) → 0 for $t \gg 1$. We expect that $C(t) \sim e^{-t/\tau}$; that is, C(t) decays exponentially with a decay or correlation time τ. Estimate τ from a plot of ln C(t) versus t. Another way of estimating τ is from the integral $\int_0^\infty C(t)\,dt$, where C(t) is normalized so that C(0) = 1. (Because we determine C(t) at discrete values of t, this integral is actually a sum.) How do your two estimates of τ compare? A more detailed discussion of the estimation of correlation times can be found in Section 15.7.

Another type of random walk that is less constrained than the self-avoiding random walk is the “true” self-avoiding walk. This walk describes the path of a random walker that avoids visiting a lattice site with a probability that is a function of the number of times the site has been visited already. This constraint leads to a reduced excluded volume interaction in comparison to the usual self-avoiding walk.

Problem 7.32. The true self-avoiding walk in one dimension

In one dimension the true self-avoiding walk corresponds to a walker that can jump to one of its two nearest neighbors with a probability that depends on the number of times these neighbors have already been visited. Suppose that the walker is at site i at step t. The probability that the walker will jump to site i + 1 at time t + 1 is given by

\[
p_{i+1} = \frac{e^{-g n_{i+1}}}{e^{-g n_{i+1}} + e^{-g n_{i-1}}}, \tag{7.49}
\]

where $n_{i\pm1}$ is the number of times that the walker has already visited site i ± 1. The probability of a jump to site i − 1 is $p_{i-1} = 1 - p_{i+1}$. The parameter g (g > 0) is a measure of the “desire” of the path to avoid itself. The first few steps of a typical true self-avoiding walk are shown in Figure 7.12.

Figure 7.12: Example of the evolution of the true self-avoiding walk with g = 1 (see (7.49)). The shaded site represents the location of the walker at time t. The number of visits to each site is given within each site, and the probability of a step to a nearest neighbor site is given below it. Note the use of periodic boundary conditions.

The main quantity of interest is the exponent ν. We know that g = 0 corresponds to the usual random walk with ν = 1/2 and that the limit g → ∞ corresponds to the self-avoiding walk. What is the value of ν for a self-avoiding walk in one dimension? Is the value of ν for any finite value of g different from these two limiting cases? Write a program to do a Monte Carlo simulation of the true self-avoiding walk in one dimension (a sketch follows). Use an array to record the number of visits to every site. At each step calculate the probability p of a jump to the right. Generate a random number r and compare it to p. If r ≤ p, move the walker to the right; otherwise, move the walker to the left. Compute $\langle \Delta x^2 \rangle$ as a function of the number of steps N, where x is the distance of the walker from the origin. Make a log-log plot of $\langle \Delta x^2 \rangle$ versus N and estimate ν. Can you distinguish ν from its random walk and self-avoiding walk values? Reasonable choices of parameters are g = 0.1 and $N \sim 10^3$. Averages over $10^3$ trials yield qualitative results. For comparison, published results are for $N \sim 10^4$ and $10^3$ trials; extended results for g = 2 are given for $N = 2 \times 10^5$ and $10^4$ trials (see Bernasconi and Pietronero).
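A minimal sketch of the one-dimensional true self-avoiding walk (the class name is ours); the array visits[] stores $n_i$, and the jump probability is computed from (7.49):

    // True self-avoiding walk in one dimension (Problem 7.32 sketch).
    public class TrueSAW {
      public static void main(String[] args) {
        int N = 1000, trials = 1000;
        double g = 0.1;
        double sumX = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          int[] visits = new int[2*N + 1];  // site i is stored at index i + N
          int x = 0;
          visits[N]++;                      // count the origin
          for(int step = 0; step<N; step++) {
            double right = Math.exp(-g*visits[x + 1 + N]);
            double left = Math.exp(-g*visits[x - 1 + N]);
            double p = right/(right + left); // equation (7.49)
            x += (Math.random()<p) ? 1 : -1;
            visits[x + N]++;
          }
          sumX += x;
          sumX2 += (double) x*x;
        }
        double mean = sumX/trials;
        System.out.println("<dx^2> = "+(sumX2/trials - mean*mean)
            +" for g = "+g);
      }
    }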
7.8 Diffusion-Controlled Chemical Reactions

Imagine a system containing particles of a single species A. The particles diffuse, and when two particles “collide,” a reaction occurs such that the two combine to form an inert species that is no longer involved in the reaction. We can represent this chemical reaction as

\[
A + A \to 0. \tag{7.50}
\]

If we ignore the spatial distribution of the particles, we can describe the kinetics by a simple rate equation:

\[
\frac{dA(t)}{dt} = -k A^2(t), \tag{7.51}
\]

where A is the concentration of A particles at time t and k is the rate constant. (In the chemical kinetics literature, it is traditional to use the term concentration rather than number density.) For simplicity, we assume that all reactants are entered into the system at t = 0 and that no reactants are added later (the system is closed). It is easy to show that the solution of the first-order differential equation (7.51) is

\[
A(t) = \frac{A(0)}{1 + ktA(0)}. \tag{7.52}
\]

Hence, $A(t) \sim t^{-1}$ in the limit of long times.
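The solution (7.52) follows by separating variables in (7.51) and integrating:

\[
\frac{dA}{A^2} = -k\,dt \quad \Longrightarrow \quad \int_{A(0)}^{A(t)} \frac{dA}{A^2} = -k \int_0^t dt' \quad \Longrightarrow \quad \frac{1}{A(t)} - \frac{1}{A(0)} = kt,
\]

which rearranges to (7.52). This form also explains why the quantity $A(t)^{-1} - A(0)^{-1}$ is plotted in Problem 7.33b: the mean-field prediction is that it grows linearly in t with slope k.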
Another interesting case is the bimolecular reaction

\[
A + B \to 0. \tag{7.53}
\]

If we neglect spatial fluctuations in the concentrations as before (this neglect yields what is known as a mean-field approximation), we can write the corresponding rate equation as

\[
\frac{dA(t)}{dt} = \frac{dB(t)}{dt} = -k A(t) B(t). \tag{7.54}
\]

We also have

\[
A(t) - B(t) = \text{constant}, \tag{7.55}
\]

because each reaction leaves the difference between the concentrations of A and B particles unchanged. For the special case of equal initial concentrations, the solution of (7.54) with (7.55) is the same as (7.52). What is the solution for the case $A(0) \neq B(0)$?

This derivation of the time dependence of A for the kinetics of the one- and two-species annihilation processes is straightforward, but it is based on the assumption that the particles are distributed uniformly. In the following two problems, we simulate the kinetics of these processes and test this assumption.

Problem 7.33. Diffusion-controlled chemical reactions in one dimension

(a) Assume that N particles do a random walk on a one-dimensional lattice of length L with periodic boundary conditions. Every particle moves once in one unit of time. Use the array site[j] to record the label of the particle, if any, at site j. Because we are interested in the long time behavior of the system when the concentration A = N/L of particles is small, it is efficient to also maintain an array of particle positions x[i] such that site[x[i]] = i. For example, if particle 5 is located at site 12, then x[5] = 12 and site[12] = 5. We also need an array, newSite, to maintain the new positions of the walkers as they are moved one at a time. After each walker is moved, we check to see if two walkers have landed on the same position k. If they have, we set newSite[k] = −1 and the values of x[i] for these two walkers to −1. The value −1 indicates that no particle exists at the site. After all the walkers have moved, we let site = newSite for all sites and remove all the reacting particles in x that have values equal to −1. This operation can be accomplished by replacing any reacting particle in x by the last particle in the array. Begin with all sites occupied, A(t = 0) = 1.

(b) Make a log-log plot of the quantity $A(t)^{-1} - A(0)^{-1}$ versus the time t. The times should be separated by exponential intervals so that your data is equally spaced on a logarithmic plot. For example, you might include data at times equal to $2^p$ with p = 1, 2, 3, .... Does your log-log plot yield a straight line for long times? If so, calculate its slope. Is the mean-field approximation for A(t) valid in one dimension? You can obtain crude results for small lattices of order L = 100 and times of order $t = 10^2$. To obtain results to within 10%, you will need lattices of order $L = 10^4$ and times of order $t = 2^{13}$.

(c) More insight into the origin of the time dependence of A(t) can be gained from the behavior of the quantity P(r,t), the probability that the nearest neighbor distance is r at time t. The nearest neighbor distance of a given particle is defined as the minimum distance between it and all other particles. The distribution of these distances changes dramatically as the reaction proceeds, and this change can give information about the reaction mechanism. Place the particles at random on a one-dimensional lattice and verify that the most probable nearest neighbor distance is r = 1 (one lattice constant) for all concentrations. (This result is true in any dimension.) Then verify that the distribution of nearest neighbor distances on a one-dimensional lattice is given by

\[
P(r, t = 0) = 2A e^{-2A(r-1)} \qquad \text{(random distribution)}. \tag{7.56}
\]

Is the form (7.56) properly normalized? Start with A(t = 0) = 0.1 and find P(r,t) for t = 10, 100, and 1000. Average over all particles. How does P(r,t) change as the reaction proceeds? Does it retain the same form as the concentration decreases?

(d)∗ Compute the quantity D(t), the number of distinct sites visited by an individual walker. How does the time dependence of D(t) compare to the computed time dependence of $A(t)^{-1} - 1$?

(e)∗ Write a program to simulate the reaction A + B → 0. For simplicity, assume that multiple occupancy of the same site is not allowed; for example, an A particle cannot jump to a site already occupied by an A particle. The easiest procedure is to allow a walker to choose one of its nearest neighbor sites at random, but not to move the walker if the chosen site is already occupied by a particle of the same type. If the site is occupied by a walker of the other type, then the pair of reacting particles is annihilated. Keep separate arrays for the A and B particles, with the value of the array denoting the label of the particle as before.
One way to distinguish A and B walkers is to make the array element site[k] positive if the site is occupied by an A particle and negative if the site is occupied by a B particle. Start with equal concentrations of A and B particles and occupy the sites at random. Some of the interesting questions are similar to those that we posed in parts (b)–(d). Color code the particles and observe what happens to the relative positions of the particles.

∗Problem 7.34. Reaction diffusion in two dimensions

(a) Do a simulation similar to that in Problem 7.33 on a two-dimensional lattice for the reaction A + A → 0. In this case it is convenient to have one array for each dimension, for example, siteX and siteY, or to store the lattice as a one-dimensional array (see Section 12.2). Set A(t = 0) = 1 and choose L = 50. Show the walkers after each Monte Carlo step per walker and describe their distribution as they diffuse. Are the particles uniformly distributed throughout the lattice for all times? Calculate A(t) and compare your results for $A(t)^{-1} - A(0)^{-1}$ to the t dependence of D(t), the number of distinct lattice sites that are visited in time t. (In two dimensions, $D(t) \sim t/\log t$.) How well do the slopes compare? Do a similar simulation with A(t = 0) = 0.01. What slope do you obtain in this case? What can you conclude about the initial density dependence? Is the mean-field approximation valid in this case?

(b) Begin with A and B type random walkers initially segregated on the left and right halves (in the x direction) of a square lattice. The process A + B → C exhibits a reaction front where the production of particles of type C is nonzero. Some of the quantities of interest are the time dependence of the mean position $\langle x(t) \rangle$ and the width w(t) of the reaction front. The rules of this process are the same as in part (a) except that a particle of type C is added to a site when a reaction occurs. A particular site can be occupied by one particle of type A or type B as well as by any number of particles of type C. If n(x,t) is the number of particles of type C at a distance x from the initial boundary of the reactants, then $\langle x(t) \rangle$ and w(t) can be written as

\[
\langle x(t) \rangle = \frac{\sum_x x\, n(x,t)}{\sum_x n(x,t)}, \tag{7.57}
\]

\[
w(t)^2 = \frac{\sum_x \left[x - \langle x(t) \rangle\right]^2 n(x,t)}{\sum_x n(x,t)}. \tag{7.58}
\]

Choose lattice sizes of order 100 × 100 and average over at least 10 trials. The fluctuations in $\langle x(t) \rangle$ and w(t) can be reduced by averaging n(x,t) over the order of 100 time units centered about t. More details can be found in Jiang and Ebner. A sketch of the computation of (7.57) and (7.58) follows.
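The moments (7.57) and (7.58) are simple sums over the histogram n(x,t). A minimal sketch (the class name is ours; the histogram shown is a placeholder for data accumulated by your simulation):

    // Computing the front position and width from the histogram n(x,t)
    // of C particles, following equations (7.57) and (7.58).
    public class FrontMoments {
      // returns {<x>, w} for the histogram n
      static double[] moments(double[] n) {
        double norm = 0, mean = 0;
        for(int x = 0; x<n.length; x++) {
          norm += n[x];
          mean += x*n[x];
        }
        mean /= norm;
        double w2 = 0;
        for(int x = 0; x<n.length; x++) {
          w2 += (x - mean)*(x - mean)*n[x];
        }
        return new double[] {mean, Math.sqrt(w2/norm)};
      }

      public static void main(String[] args) {
        double[] n = {0, 1, 4, 6, 4, 1, 0};  // placeholder histogram
        double[] result = moments(n);
        System.out.println("<x> = "+result[0]+"  w = "+result[1]);
      }
    }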
One advantage of the linear congruential method is that it is very fast. For a given seed x_0, each number in the sequence is determined by the one-dimensional map

	x_n = (a x_{n-1} + c) mod m  (7.59)

where a, c, and m, as well as x_n, are integers. The notation y = z mod m means that m is subtracted from z until 0 ≤ y < m. The map (7.59) is characterized by three parameters: the multiplier a, the increment c, and the modulus m. Because m is the largest integer generated by (7.59), the maximum possible period is m. In general, the period depends on all three parameters. For example, if a = 3, c = 4, m = 32, and x_0 = 1, the sequence generated by (7.59) is 1, 7, 25, 15, 17, 23, 9, 31, 1, 7, 25, ..., and the period is 8, rather than the maximum possible value of m = 32. If we choose a, c, and m carefully such that the maximum period is obtained, then all possible integers between 0 and m − 1 occur in the sequence. Because we usually wish to have random numbers r in the unit interval 0 ≤ r < 1 rather than random integers, random number generators usually return the ratio x_n/m, which is always less than unity. Several rules have been developed (see Knuth) to obtain the longest period. Some of the properties of the linear congruential method are explored in Problem 7.35.

Another popular random number generator is the generalized feedback shift register method, which uses bit manipulation (see Sections 14.1 and 14.6). Every integer is represented as a series of 1s and 0s called bits. These bits can be shuffled by using the bitwise exclusive or (xor) operator ⊕, defined by a ⊕ b = 1 if a ≠ b, and a ⊕ b = 0 if a = b. The nth member of the sequence is given by

	x_n = x_{n-p} ⊕ x_{n-q}  (7.60)

where p > q, and p, q, and x_n are integers. The first p random integers must be supplied by another random number generator. As an example of how the operator ⊕ works, suppose that n = 6, p = 5, q = 3, x_3 = 11, and x_1 = 6. Then x_6 = x_1 ⊕ x_3 = 0110 ⊕ 1011 = 1101 = 2^3 + 2^2 + 2^0 = 8 + 4 + 1 = 13. Not all values of p and q lead to good results. Some common pairs are (p, q) = (31, 3), (250, 103), and (521, 168). In Java and C the exclusive or operation on the integers m and n is written as m^n.

The algorithm for producing the random numbers after the first p integers have been produced is as follows. Initially the index k can be set to 0.

1. If k < p − q, set j = k + q; otherwise, set j = k − p + q.
2. Set x_k = x_k ⊕ x_j; x_k is the desired random number for this iteration. If a random number between 0 and 1 is desired, divide x_k by the maximum possible integer that the computer can hold.
3. Increment k to (k + 1) mod p.

Because the exclusive or operator and bit manipulation are very fast, this random number generator is also very fast. However, the period may not be long enough for some applications, and the correlations between numbers might not be as good as needed. The shuffling algorithm discussed in Problem 7.36 should be used to improve this generator.

These two examples of random number generators illustrate their general nature: numbers in the sequence are used to find the succeeding ones according to a well-defined algorithm. The sequence is determined by the seed, that is, the first number of the sequence, or the first p members of the sequence for the generalized feedback shift register and related generators. Usually, the maximum possible period is related to the size of the computer word, for example, 32 or 64 bits.
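Because the map (7.59) is so simple, a linear congruential generator takes only a few lines of code. The following minimal sketch is our own (the class name is arbitrary); it uses the parameters a = 16807, c = 0, and m = 2^31 − 1 that appear in Problem 7.35 and returns the ratio x_n/m:

public class LCG {
  private long x; // current value x_n of the sequence
  private final long a = 16807, c = 0, m = 2147483647L; // m = 2^31 - 1

  public LCG(long seed) {
    x = seed; // the seed must not be 0 when c = 0
  }

  public double nextDouble() {
    x = (a*x+c)%m;        // the map (7.59); long arithmetic avoids overflow
    return x/(double) m;  // random number in the unit interval [0,1)
  }

  public static void main(String[] args) {
    LCG rng = new LCG(12);
    for(int i = 0; i<5; i++) {
      System.out.println(rng.nextDouble());
    }
  }
}

Note that the product a*x can be as large as roughly 3.6 × 10^13, which fits comfortably in a 64-bit long; with 32-bit int arithmetic the multiplication would overflow.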
The choice of the constants and the proper initialization of the sequence are very important, and thus these algorithms must be implemented with care. There is no necessary and sufficient test for the randomness of a finite sequence of numbers; the most that can be said about any finite sequence is that it is apparently random. Because no single statistical test is a reliable indicator, we need to consider several tests. Some of the best known tests are discussed in Problem 7.35. Many of these tests can be stated in terms of random walks.

Problem 7.35. Statistical tests of randomness

(a) Period. An obvious requirement for a random number generator is that its period be much greater than the number of random numbers needed in a specific calculation. One way to visualize the period of the random number generator is to use it to generate a plot of the displacement x of a random walker as a function of the number of steps N. When the period of the random number generator is reached, the plot will begin to repeat itself. Generate such a plot using (7.59) for a = 899, c = 0, and m = 32768, and for a = 16807, c = 0, and m = 2^31 − 1, with x_0 = 12. What are the periods of the corresponding random number generators? Obtain similar plots using different values for the parameters a, c, and m. Why is the seed value x_0 = 0 forbidden for the choice c = 0? Do some combinations of a, c, and m give longer periods than others?

(b) Uniformity. A random number sequence should contain numbers distributed in the unit interval with equal probability. The simplest test of uniformity is to divide this interval into M equal size subintervals or bins. For example, consider the first N = 10^4 numbers generated by (7.59) with a = 106, c = 1283, and m = 6075 (see Press et al.). Place each number into one of M = 100 bins. Is the number of entries in each bin approximately equal? What happens if you increase N?

(c) Chi-square test. Is the distribution of numbers in the bins of part (b) consistent with the laws of statistics? The most common test of this consistency is the chi-square or χ² test. Let y_i be the observed number in bin i and E_i be the expected value. The chi-square statistic is

	χ² = Σ_{i=1}^{M} (y_i − E_i)² / E_i.  (7.61)

(A sketch for computing χ² and the autocorrelation function of part (g) follows this problem.) For the example in part (b) with N = 10^4 and M = 100, we have E_i = 100. The magnitude of χ² is a measure of the agreement between the observed and expected distributions; χ² should be neither too big nor too small. In general, the individual terms in the sum (7.61) are expected to be of order one, and because there are M terms in the sum, we expect χ² ≈ M. As an example, we did five independent runs of a random number generator with N = 10^4 and M = 100 and found χ² ≈ 92, 124, 85, 91, and 99. These values of χ² are consistent with this expectation. Although we usually want χ² to be as small as possible, we would be suspicious if χ² ≈ 0, because such a small value suggests that N is a multiple of the period of the generator and that each value in the sequence appears an equal number of times.

(d) Filling sites. Although a random number sequence might be distributed in the unit interval with equal probability, consecutive numbers might be correlated in some way. One test of this correlation is to fill a square lattice of L² sites at random. Consider an array n(x, y) that is initially empty, where 1 ≤ x_i, y_i ≤ L. A site is selected randomly by choosing its two coordinates x_i and y_i from two consecutive numbers in the sequence.
If the site is empty, it is filled and n(x_i, y_i) = 1; otherwise it is not changed. This procedure is repeated t times, where t is the number of Monte Carlo steps per site. That is, the time is increased by 1/L² each time a pair of random numbers is generated. Because this process is analogous to the decay of radioactive nuclei, we expect that the fraction of empty lattice sites should decay as e^{-t}. Determine the fraction of unfilled sites using the random number generator that you have been using for L = 10, 15, and 20. Are your results consistent with the expected fraction? Repeat the same test using (7.59) with a = 231, c = 0, and m = 65,549. The existence of triplet correlations can be determined by a similar test on a simple cubic lattice by choosing the three coordinates x_i, y_i, and z_i from three consecutive random numbers.

(e) Parking lot test. Fill sites as in part (d) and draw the sites that have been filled. Do the filled sites look random, or are there stripes of filled sites? Try a = 65,549, c = 0, and m = 2^31.

(f) Hidden correlations. Another way of checking for correlations is to plot x_{i+k} versus x_i. If there are any obvious patterns in the plot, then there is something wrong with the generator. Use the generator (7.59) with a = 16,807, c = 0, and m = 2^31 − 1. Can you detect any structure in the plotted points for k = 1 to k = 5? Test the random number generator that you have been using. Do you see any evidence of lattice structure, for example, equidistant parallel lines? Is the logistic map x_{n+1} = 4x_n(1 − x_n) a suitable random number generator?

(g) Short-term correlations. Another measure of short-term correlations is the autocorrelation function

	C(k) = [⟨x_{i+k} x_i⟩ − ⟨x_i⟩²] / [⟨x_i²⟩ − ⟨x_i⟩²]  (7.62)

where x_i is the ith term in the sequence. We have used the fact that ⟨x_{i+k}⟩ = ⟨x_i⟩; that is, the choice of the origin of the sequence is irrelevant. The quantity ⟨x_{i+k} x_i⟩ is found for a particular choice of k by forming all the possible products x_{i+k} x_i and dividing by the number of products. If x_{i+k} and x_i are not correlated, then ⟨x_{i+k} x_i⟩ = ⟨x_{i+k}⟩⟨x_i⟩ and C(k) = 0. Is C(k) identically zero for any finite sequence? Compute C(k) for a = 106, c = 1283, and m = 6075.

(h) Random walk. A test based on the properties of random walks has been proposed by Vattulainen et al. Assume that a walker begins at the origin of the x-y plane and walks for N steps. Average over M walkers and count the number of walks that end in each quadrant q_i. Use the χ² test (7.61) with y_i → q_i, M = 4, and E_i = M/4. If χ² > 7.815 (a 5% probability if the random number generator is perfect), we say that the run fails. The random number generator fails if two out of three independent runs fail. The probability of a perfect generator failing two out of three runs is approximately 3 × 0.95 × (0.05)² ≈ 0.007. Test several random number generators.
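Both the chi-square statistic (7.61) of part (c) and the autocorrelation function (7.62) of part (g) are easy to compute from a stored sequence. The following minimal sketch is our own (the class and method names are arbitrary); any generator that returns numbers in the unit interval can be substituted for Math.random in the test driver:

public class RandomTests {
  // chi-square statistic (7.61) for the numbers x placed into M bins
  public static double chiSquare(double[] x, int M) {
    int[] y = new int[M];
    for(int i = 0; i<x.length; i++) {
      y[(int) (M*x[i])]++;            // bin index runs from 0 to M-1
    }
    double E = (double) x.length/M;   // expected number per bin
    double chi2 = 0;
    for(int i = 0; i<M; i++) {
      chi2 += (y[i]-E)*(y[i]-E)/E;
    }
    return chi2;
  }

  // estimate of the autocorrelation function C(k) defined in (7.62)
  public static double autocorrelation(double[] x, int k) {
    double sum = 0, sumSq = 0, sumProd = 0;
    for(int i = 0; i<x.length; i++) {
      sum += x[i];
      sumSq += x[i]*x[i];
    }
    for(int i = 0; i<x.length-k; i++) {
      sumProd += x[i+k]*x[i];         // all possible products x_{i+k} x_i
    }
    double mean = sum/x.length;
    double prodMean = sumProd/(x.length-k);
    return (prodMean-mean*mean)/(sumSq/x.length-mean*mean);
  }

  public static void main(String[] args) {
    double[] x = new double[10000];
    for(int i = 0; i<x.length; i++) {
      x[i] = Math.random();
    }
    System.out.println("chi^2 = "+chiSquare(x, 100));
    System.out.println("C(1) = "+autocorrelation(x, 1));
  }
}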
Problem 7.36. Improving random number generators

One way to reduce sequential correlation and to lengthen the period is to mix or shuffle the random numbers produced by a random number generator. A standard procedure is to begin with a list of N random numbers (between 0 and 1) using a given generator rng. The number N is arbitrary but should be less than the period of rng. Also generate one more random number, rextra. Then for each desired random number use the following procedure:

(i) Calculate the integer k given by (int)(N*rextra). Use the kth random number r_k from your list as the desired random number.

(ii) Set rextra equal to the random number r_k chosen in step (i).

(iii) Generate a new random number r from rng and use it to replace the number chosen in step (i); that is, r_k = r.

Consider a random number generator with a relatively short period and strong sequential correlation and show that this shuffling scheme improves the quality of the random number sequence.
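The shuffling procedure requires only an auxiliary array. The following minimal sketch is our own (the class name is arbitrary); it wraps any generator with a nextDouble method, such as the LCG sketch given earlier in this section:

public class ShuffledGenerator {
  private double[] r;    // list of N stored random numbers
  private double rExtra; // the extra random number
  private LCG rng;       // the underlying generator to be improved

  public ShuffledGenerator(LCG rng, int N) {
    this.rng = rng;
    r = new double[N];
    for(int i = 0; i<N; i++) {
      r[i] = rng.nextDouble();  // fill the initial list
    }
    rExtra = rng.nextDouble();
  }

  public double nextDouble() {
    int k = (int) (r.length*rExtra); // step (i): choose an entry at random
    double rk = r[k];                // the desired random number
    rExtra = rk;                     // step (ii)
    r[k] = rng.nextDouble();         // step (iii): refill the chosen slot
    return rk;
  }
}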
At least some of the statistical tests given in Problem 7.35 should be done whenever serious calculations are contemplated. However, even if a random number generator passes all these tests, there can still be problems in rare cases. Typically, these problems arise when a small number of events have a large weight. In these cases a very small bias in the random number generator might lead to systematic errors, and two generators that appear equally good as determined by various statistical tests might give statistically different results in a specific application (see Project 15.34). For this reason it is important that the particular random number generator used be reported along with the actual results. Confidence in the results can also be increased by repeating the calculation with another random number generator. Because all random number generators are based on a deterministic algorithm, it is always possible to construct a test generator for which a particular algorithm will fail. The success of a random number generator in passing various statistical tests is a necessary, but not a sufficient, condition for its use in all applications. In Project 15.34 we discuss an application of Monte Carlo methods to the Ising model for which some popular random number generators give incorrect results.

7.10 Variational Methods

Many problems in physics can be formulated in terms of a variational principle. In the following, we consider examples of variational principles in geometrical optics and classical mechanics. We then discuss how Monte Carlo methods can be applied to these problems. A more sophisticated application of Monte Carlo methods to a variational problem in quantum mechanics is discussed in Chapter 16.

Our everyday experience of light leads naturally to the concept of light rays. This description of light propagation, called geometrical or ray optics, is applicable when the wavelength of light is small compared to the linear dimensions of any obstacles or openings. The path of a light ray can be formulated in terms of Fermat's principle of least time: a ray of light follows the path between two points (consistent with any constraints) that requires the least amount of time. Fermat's principle can be adopted as the basis of geometrical optics. For example, Fermat's principle implies that light travels from a point A to a point B in a straight line in a homogeneous medium. Because the speed of light is constant along any path within the medium, the path of shortest time is the path of shortest distance, that is, a straight line from A to B. What happens if we impose the constraint that the light must strike a mirror before reaching B?

The speed of light in a medium can be expressed in terms of c, the speed of light in a vacuum, and the index of refraction n of the medium:

	v = c/n.  (7.63)

Suppose that a light ray in a medium with index of refraction n_1 passes into a second medium with index of refraction n_2, and that the two media are separated by a plane surface. We now show how we can use Fermat's principle and a simple Monte Carlo method to find the path of the light. The analytic solution to this problem using Fermat's principle is found in many texts (cf. Feynman et al.). Our strategy, as implemented in class Fermat, is to begin with a straight path and to make changes in the path at random. These changes are accepted only if they reduce the travel time of the light. Some of the features of Fermat and FermatApp include:

1. Light propagates from left to right through N regions. The index of refraction n[i] is uniform in each region i. The index i increases from left to right. We have chosen units such that the speed of light in vacuum equals unity.

2. Because the light propagates in a straight line in each medium, the path of the light is given by the coordinates y[i] at each boundary.

3. The coordinates of the light source and the detector are at (0, y[0]) and (N, y[N]), respectively, where y[0] and y[N] are fixed.

4. The path is the connection of the set of points at the boundary of each region.

5. The path of the light is found by choosing the boundary i at random and generating a trial value of y[i] that differs from its previous value by a random number between −dy and dy. If the trial value of y[i] yields a shorter travel time, this value becomes the new value for y[i].

6. The path is redrawn whenever it is changed.

Listing 7.6: Fermat class.

package org.opensourcephysics.sip.ch07;

public class Fermat {
  double y[];        // y coordinate of light ray, index is x coordinate
  double v[];        // light speed of ray for medium starting at index value
  int N;             // number of media
  double dn;         // change in index of refraction from one region to the next
  double dy = 0.1;   // maximum change in y position
  int steps;

  public void initialize() {
    y = new double[N+1];
    v = new double[N];
    double indexOfRefraction = 1.0;
    for(int i = 0; i<=N; i++) {
      y[i] = i;      // initial path is a straight line
    }
    // the remainder of the listing, which sets the speed v[i] in each
    // medium, is not reproduced here
  }
}

The change of length of the two objects can be accomplished with statements such as:

if(length[i]>1) {
  int partA = 1+(int) (Math.random()*(length[i]-1));
  int partB = length[i]-partA;
  length[i] = partA;
  length[numberOfObjects] = partB; // new object
  numberOfObjects++;
}

The main quantity of interest is the distribution of lengths P(L). Explore a variety of initial length distributions with a total mass of 5000 for which the distribution is peaked at about 20 mass units. Is the long time behavior of P(L) similar in shape for any initial distribution? Compute the total mass (the sum of the lengths) and output this value periodically. Although the total mass will fluctuate, it should remain approximately constant. Why?

(b) Collect data for three different initial distributions with the same number of objects N, and scale P(L) and L so that the three distributions roughly fall on the same curve. For example, you can scale P(L) so that the maximum of the three distributions has the same value. Then multiply each value of L by a factor so that the distributions overlap.

(c) The analytic results suggest that the universal behavior can be obtained by scaling L by the total mass raised to the 1/3 power. Is this prediction consistent with your results? Test this hypothesis by adjusting the initial distributions so that they all have the same total mass. Your results for the long time behavior of P(L) should fall on a universal curve. Why is this universality interesting? How can this result be used to analyze different systems? Would you need to do a new simulation for each value of L?
(d) What happens if step (iii) is done more or less often than each random change of length? Does the scaling change?

Project 7.41. Application of the pivot algorithm to self-avoiding walks

The algorithms that we have discussed for generating self-avoiding random walks are all based on making local deformations of the walk (polymer chain) for a given value of N, the number of bonds. As discussed in Problem 7.31, the time τ between statistically independent configurations is nonzero. The problem is that τ increases with N as some power, for example, τ ∼ N³. This power law dependence of τ on N is called critical slowing down and implies that it becomes increasingly more time consuming to generate long walks. We now discuss an example of a global algorithm that reduces the dependence of τ on N. Another example of a global algorithm that reduces critical slowing down is discussed in Project 15.32.

(a) Consider the walk shown in Figure 7.14a. Select a site at random and one of the four possible directions. The shorter portion of the walk is rotated (pivoted) to this new direction by treating the walk as a rigid structure. The new walk is accepted only if it is self-avoiding; otherwise, the old walk is retained. (The shorter portion of the walk is chosen to save computer time.) Some typical moves are shown in Figure 7.14. Note that if an end point is chosen, the previous walk is retained. Write a program to implement this algorithm (a sketch of a single pivot move follows this project) and compute the dependence of the mean square end-to-end distance ⟨R²⟩ on N. Consider values of N in the range 10 ≤ N ≤ 80. A discussion of the results and the implementation of the algorithm can be found in MacDonald et al. and Madras and Sokal, respectively.

Figure 7.14: Examples of the first several changes generated by the pivot algorithm for a self-avoiding walk of N = 10 bonds (11 sites). The open circle denotes the pivot point. This figure is adapted from the article by MacDonald et al.

(b) Compute the correlation time τ for different values of N using the approach discussed in Problem 7.31b.
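One possible structure for a single pivot move on a square lattice is sketched below. This is our own illustration of the logic, not the optimized implementation of Madras and Sokal; for simplicity it always rotates the portion of the walk beyond the pivot site rather than the shorter portion, and it uses a java.util.HashSet of encoded positions for the self-avoidance check:

import java.util.HashSet;

public class PivotMove {
  // attempts one pivot move on the walk (x[i], y[i]), i = 0 ... N;
  // returns true if the rotated walk was self-avoiding and accepted
  public static boolean pivot(int[] x, int[] y) {
    int N = x.length-1;                  // number of bonds
    int p = (int) (Math.random()*(N+1)); // random pivot site
    int rot = 1+(int) (Math.random()*3); // rotate by 90, 180, or 270 degrees
    int[] xt = x.clone(), yt = y.clone();
    for(int i = p+1; i<=N; i++) {        // rotate one arm about site p
      int dx = x[i]-x[p], dy = y[i]-y[p];
      for(int r = 0; r<rot; r++) {       // repeated 90-degree rotation
        int tmp = dx;
        dx = -dy;
        dy = tmp;
      }
      xt[i] = x[p]+dx;
      yt[i] = y[p]+dy;
    }
    HashSet<Long> occupied = new HashSet<Long>();
    for(int i = 0; i<=N; i++) {          // self-avoidance check
      long key = 100000L*xt[i]+yt[i];    // unique encoding for |x|, |y| < 50000
      if(!occupied.add(key)) {
        return false;                    // overlap: keep the old walk
      }
    }
    System.arraycopy(xt, 0, x, 0, N+1);  // accept the rotated walk
    System.arraycopy(yt, 0, y, 0, N+1);
    return true;
  }
}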
Project 7.42. Pattern formation

In Problem 7.34 we saw that simple patterns can develop as a result of random behavior. The phenomenon of pattern formation is of much interest in a variety of contexts ranging from the large scale structure of the universe to the roll patterns seen in convection (for example, smoke rings). In the following, we explore the patterns that can develop in a simple reaction diffusion model based on the reactions A + 2B → 3B and B → C, where C is inert. Such a reaction is called autocatalytic. In Problem 7.34 we considered chemical reactions in a closed system where the reactions can proceed to equilibrium. In contrast, open systems allow a continuous supply of fresh reactants and a removal of products. These two processes allow steady states to be realized and oscillatory conditions to be maintained indefinitely. In this problem we assume that A is added at a constant rate and that both A and B are removed by the feed process. Pearson (see the references) modeled these processes by two coupled reaction diffusion equations:

	∂A/∂t = D_A ∇²A − AB² + f(1 − A)  (7.67a)
	∂B/∂t = D_B ∇²B + AB² − (f + k)B.  (7.67b)

The AB² term represents the reaction A + 2B → 3B. This term is negative in (7.67a) because the reactant A decreases and is positive in (7.67b) because B increases. The term +f represents the constant addition of A, and the terms −fA and −fB represent the removal process; the term −kB represents the reaction B → C. All the quantities in (7.67) are dimensionless. We assume that the diffusion coefficients are D_A = 2 × 10^{-5} and D_B = 10^{-5}; the behavior of the system is determined by the values of the rate constant k and the feed rate f.

(a) We first consider the behavior of the reaction kinetics that results when the diffusion terms in (7.67) are neglected. It is clear from (7.67) that there is a trivial steady state solution with A = 1, B = 0. Are there other solutions, and if so, are they stable? The steady state solutions can be found by solving (7.67) with ∂A/∂t = ∂B/∂t = 0. To determine the stability, we can add a perturbation and determine whether the perturbation grows or not. However, without the diffusion terms, it is more straightforward to solve (7.67) numerically using a simple Euler algorithm (see the sketch at the end of this project). Choose a time step equal to unity and let A = 0.1 and B = 0.5 at t = 0. Determine the steady state values for 0 < f ≤ 0.3 and 0 < k ≤ 0.07 in increments of ∆f = 0.02 and ∆k = 0.005. Record the steady state values of A and B. Then repeat this exercise for the initial values A = 0.5 and B = 0.1. You should find that for some values of f and k, only one steady state solution is obtained for the two initial conditions, and for other values of f and k there are two steady state solutions. Try other initial conditions. If you obtain a new solution, change the initial A or B slightly to see if your new solution is stable. On an f versus k plot, indicate where there are two solutions and where there is one. In this way you can determine the approximate phase diagram for this process.

(b) There is a small region in f-k space where one of the steady state solutions becomes unstable and periodic solutions occur (the mechanism is known as a Hopf bifurcation). Try f = 0.009, k = 0.03, and set A = 0.1 and B = 0.5 at t = 0. Plot the values of A and B versus the time t. Are they periodic? Try other values of f and k and estimate where the periodic solutions occur.

Figure 7.15: Evolution of the pattern starting from the initial conditions suggested in Project 7.42c.

(c) Numerical solutions of the full equation with diffusion (7.67) can be found by making a finite difference approximation to the spatial derivatives as in (3.16) and using a simple Euler algorithm for the time integration. Adopt periodic boundary conditions. Although it is straightforward to write a program to do the numerical integration, an exploration of the dynamics of this system requires considerable computer resources. However, we can find some preliminary results with a small system and a coarse grid. Consider a 0.5 × 0.5 system with a spatial mesh of 128 × 128 grid points on a square lattice. Choose f = 0.18, k = 0.057, and ∆t = 0.1. Let the entire system be in the initial trivial state (A = 1, B = 0) except for a 20 × 20 grid located at the center of the system where the sites are A = 1/2, B = 1/4 with a ±1% random noise. The effect of the noise is to break the square symmetry. Let the system evolve for approximately 80,000 time steps and look at the patterns that develop. Color code the grid according to the concentration of A, with red representing A = 1 and blue representing A ≈ 0.2, and with several intermediate colors. Very interesting patterns have been found by Pearson (see Figure 7.15).
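For part (a) the Euler iteration of (7.67) without the diffusion terms is only a few lines. The following minimal sketch is our own (the class name and output format are arbitrary); it uses the time step of unity suggested in the text:

public class ReactionKinetics {
  public static void main(String[] args) {
    double f = 0.009, k = 0.03; // feed rate and rate constant
    double A = 0.1, B = 0.5;    // initial concentrations
    double dt = 1.0;            // time step suggested in part (a)
    for(int t = 0; t<10000; t++) {
      double dA = -A*B*B+f*(1-A);  // reaction terms of (7.67a)
      double dB = A*B*B-(f+k)*B;   // reaction terms of (7.67b)
      A += dA*dt;                  // Euler step
      B += dB*dt;
      if(t%100==0) {
        System.out.println(t+" "+A+" "+B);
      }
    }
  }
}

The same update, applied at every grid point together with a finite difference Laplacian, is the basis of the full simulation in part (c).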
Appendix 7A: Random Walks and the Diffusion Equation

To gain some insight into the relation between random walks and the diffusion equation, we first show that the latter implies that ⟨x(t)⟩ is zero and ⟨x²(t)⟩ is proportional to t. We rewrite the diffusion equation (7.27) here for convenience:

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x².  (7.68)

To derive the t dependence of ⟨x(t)⟩ and ⟨x²(t)⟩ from (7.68), we write the average of any function of x as

	⟨f(x,t)⟩ = ∫_{-∞}^{∞} f(x) P(x,t) dx.  (7.69)

The average displacement is given by

	⟨x(t)⟩ = ∫_{-∞}^{∞} x P(x,t) dx.  (7.70)

To do the integral on the right-hand side of (7.70), we multiply both sides of (7.68) by x and formally integrate over x:

	∫_{-∞}^{∞} x [∂P(x,t)/∂t] dx = D ∫_{-∞}^{∞} x [∂²P(x,t)/∂x²] dx.  (7.71)

The left-hand side can be expressed as

	∫_{-∞}^{∞} x [∂P(x,t)/∂t] dx = (∂/∂t) ∫_{-∞}^{∞} x P(x,t) dx = d⟨x⟩/dt.  (7.72)

The right-hand side of (7.71) can be written in the desired form by an integration by parts:

	D ∫_{-∞}^{∞} x [∂²P(x,t)/∂x²] dx = D [x ∂P(x,t)/∂x]_{x=-∞}^{x=∞} − D ∫_{-∞}^{∞} [∂P(x,t)/∂x] dx.  (7.73)

The first term on the right-hand side of (7.73) is zero because P(x = ±∞, t) = 0 and all the spatial derivatives of P at x = ±∞ are zero. The second term is also zero because it integrates to D[P(x = ∞, t) − P(x = −∞, t)]. Hence, we find that

	d⟨x⟩/dt = 0,  (7.74)

that is, ⟨x⟩ is a constant, independent of time. Because x = 0 at t = 0, we conclude that ⟨x⟩ = 0 for all t.

To calculate ⟨x²(t)⟩, we can use a similar procedure and perform two integrations by parts. The result is

	d⟨x²(t)⟩/dt = 2D  (7.75)

or

	⟨x²(t)⟩ = 2Dt.  (7.76)

We see that the random walk and the diffusion equation have the same time dependence. In d-dimensional space, 2D is replaced by 2dD. The solution of the diffusion equation shows that the time dependence of ⟨x²(t)⟩ is equivalent to the long time behavior of a simple random walk on a lattice.

In the following, we show directly that the continuum limit of the one-dimensional random walk model is a diffusion equation. If there is an equal probability of taking a step to the right or left, the random walk can be written in terms of the simple master equation

	P(i, N) = ½[P(i + 1, N − 1) + P(i − 1, N − 1)]  (7.77)

where P(i, N) is the probability that the walker is at site i after N steps. To obtain a differential equation for the probability density P(x,t), we identify t = Nτ, x = ia, and P(i, N) = aP(x,t), where τ is the time between steps and a is the lattice spacing. This association allows us to rewrite (7.77) in the equivalent form

	P(x,t) = ½[P(x + a, t − τ) + P(x − a, t − τ)].  (7.78)

We rewrite (7.78) by subtracting P(x, t − τ) from both sides of (7.78) and dividing by τ:

	(1/τ)[P(x,t) − P(x,t − τ)] = (a²/2τ)[P(x + a, t − τ) − 2P(x, t − τ) + P(x − a, t − τ)]/a².  (7.79)

If we expand P(x, t − τ) and P(x ± a, t − τ) in a Taylor series and take the limit a → 0 and τ → 0 with the ratio D ≡ a²/2τ finite, we obtain the diffusion equation

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x².  (7.80a)

The generalization of (7.80a) to three dimensions is

	∂P(x,y,z,t)/∂t = D ∇² P(x,y,z,t)  (7.80b)

where ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² is the Laplacian operator. Equation (7.80) is known as the diffusion equation and is frequently used to describe the dynamics of fluid molecules. The direct numerical solution of the prototypical parabolic partial differential equation (7.80) is a nontrivial problem in numerical analysis (cf. Press et al. or Koonin and Meredith).
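The equivalence of the two descriptions is easy to check numerically: generate many one-dimensional random walks and compare the measured ⟨x²(t)⟩ with 2Dt, where D = a²/2τ = 1/2 in units where the step length and the time between steps are unity. A minimal sketch (our own):

public class WalkDiffusionCheck {
  public static void main(String[] args) {
    int walkers = 10000, steps = 100;
    double x2Sum = 0;
    for(int w = 0; w<walkers; w++) {
      int x = 0;
      for(int n = 0; n<steps; n++) {
        x += (Math.random()<0.5) ? 1 : -1; // unbiased step
      }
      x2Sum += x*x;
    }
    // the diffusion equation predicts <x^2> = 2Dt with D = 1/2
    System.out.println("<x^2> = "+x2Sum/walkers+", 2Dt = "+steps);
  }
}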
An indirect method of solving (7.80) numerically is to use a Monte Carlo method; that is, replace the partial differential equation (7.80) by a corresponding random walk on a lattice with discrete time steps. Because the asymptotic behavior of the partial differential equation and the random walk model are equivalent, this approach uses the Monte Carlo technique as a method of numerical analysis. In contrast, if our goal is to understand a random walk lattice model directly, the Monte Carlo technique is a simulation method. The difference between simulation and numerical analysis is sometimes in the eyes of the beholder.

Problem 7.43. Biased random walk

Show that the form of the differential equation satisfied by P(x,t) corresponding to a random walk with a drift, that is, a walk with p ≠ q, is

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x² − v ∂P(x,t)/∂x.  (7.81)

How is v related to p and q?

References and Suggestions for Further Reading

Daniel J. Amit, G. Parisi, and L. Peliti, "Asymptotic behavior of the 'true' self-avoiding walk," Phys. Rev. B 27, 1635–1645 (1983).

Panos Argyrakis, "Simulation of diffusion-controlled chemical reactions," Computers in Physics 6, 525–579 (1992).

G. T. Barkema, Parthapratim Biswas, and Henk van Beijeren, "Diffusion with random distribution of static traps," Phys. Rev. Lett. 87, 170601 (2001).

J. M. Bernardo and A. F. M. Smith, Bayesian Theory (John Wiley & Sons, 1994). Bayes' theorem is stated concisely on page 2.

J. Bernasconi and L. Pietronero, "True self-avoiding walk in one dimension," Phys. Rev. B 29, 5196–5198 (1984). The authors present results for the exponent ν accurate to 1%.

Philip R. Bevington and D. Keith Robinson, Data Reduction and Error Analysis for the Physical Sciences, 3rd ed. (McGraw-Hill, 2003).

I. Carmesin and Kurt Kremer, "The bond fluctuation model: A new effective algorithm for the dynamics of polymers in all spatial dimensions," Macromolecules 21, 2819–2823 (1988). The bond fluctuation model is an efficient method for simulating the dynamics of polymer chains and would be the basis of an excellent project.

S. Chandrasekhar, "Stochastic problems in physics and astronomy," Rev. Mod. Phys. 15, 1–89 (1943). This article is reprinted in N. Wax, Selected Papers on Noise and Stochastic Processes (Dover, 1954).

William S. Cleveland and Robert McGill, "Graphical perception and graphical methods for analyzing scientific data," Science 229, 828–833 (1985). There is more to analyzing data than least squares fits.

Mohamed Daoud, "Polymers," Chapter 6 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

Roan Dawkins and Daniel ben-Avraham, "Computer simulations of diffusion-limited reactions," Comput. Sci. Eng. 3 (1), 72–76 (2001).

R. Everaers, I. S. Graham, and M. J. Zuckermann, "End-to-end distance and asymptotic behavior of self-avoiding walks in two and three dimensions," J. Phys. A 28, 1271–1293 (1995).

Jesper Ferkinghoff-Borg, Mogens H. Jensen, Joachim Mathiesen, Poul Olesen, and Kim Sneppen, "Competition between diffusion and fragmentation: An important evolutionary process of nature," Phys. Rev. Lett. 91, 266103 (2003). The results of the model were compared with experimental data on ice crystal sizes and the length distribution of α helices in proteins.

Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics (Addison-Wesley, 1963). See Vol. 1, Chapter 26 for a discussion of the principle of least time and Vol.
2, Chapter 19 for a discussion of the principle of least action.

Pierre-Gilles de Gennes, Scaling Concepts in Polymer Physics (Cornell University Press, 1979). A difficult but important text.

Peter Grassberger, "Pruned-enriched Rosenbluth method: Simulations of θ polymers of chain length up to 1 000 000," Phys. Rev. E 56, 3682–3693 (1997).

Shlomo Havlin and Daniel ben-Avraham, "Diffusion in disordered media," Adv. Phys. 36, 695 (1987). Section 7 of this review article discusses trapping and diffusion-limited reactions. Also see Daniel ben-Avraham and Shlomo Havlin, Diffusion and Reactions in Fractals and Disordered Systems (Cambridge University Press, 2001).

Shlomo Havlin, George H. Weiss, James E. Kiefer, and Menachem Dishon, "Exact enumeration of random walks with traps," J. Phys. A: Math. Gen. 17, L347 (1984). The authors discuss a method based on exact enumeration for calculating the survival probability of random walkers on a lattice with randomly distributed traps.

Brian Hayes, "How to avoid yourself," Am. Scientist 86 (4), 314–319 (1998).

Z. Jiang and C. Ebner, "Simulation study of reaction fronts," Phys. Rev. A 42, 7483–7486 (1990).

Peter R. Keller and Mary M. Keller, Visual Cues (IEEE Press, 1993). A well-illustrated book on data visualization techniques.

Donald E. Knuth, Seminumerical Algorithms, 2nd ed., Vol. 2 of The Art of Computer Programming (Addison-Wesley, 1981). The standard reference on random number generators.

Bruce MacDonald, Naeem Jan, D. L. Hunter, and M. O. Steinitz, "Polymer conformations through 'wiggling'," J. Phys. A 18, 2627–2631 (1985). A discussion of the pivot algorithm summarized in Project 7.41. Also see Tom Kennedy, "A faster implementation of the pivot algorithm for self-avoiding walks," J. Stat. Phys. 106, 407–429 (2002).

Vishal Mehra and Peter Grassberger, "Trapping reaction with mobile traps," Phys. Rev. E 65, 050101(R) (2002). This paper discusses the model of a single walker moving on a lattice of traps.

Elliott W. Montroll and Michael F. Shlesinger, "On the wonderful world of random walks," in Nonequilibrium Phenomena II: From Stochastics to Hydrodynamics, J. L. Lebowitz and E. W. Montroll, editors (North-Holland Press, 1984). The first part of this delightful review article chronicles the history of the random walk.

M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999). This book has a good section on random number generators.

Daniele Passerone and Michele Parrinello, "Action-derived molecular dynamics in the study of rare events," Phys. Rev. Lett. 87, 108302 (2001). This paper describes a deterministic algorithm for finding extrema of the action. Also see D. Passerone, M. Ceccarelli, and M. Parrinello, J. Chem. Phys. 118, 2025–2032 (2003).

John E. Pearson, "Complex patterns in a simple system," Science 261, 189–192 (1993), or arXiv:patt-sol/9304003. See also P. Gray and S. K. Scott, "Sustained oscillations and other exotic patterns of behavior in isothermal reactions," J. Phys. Chem. 89, 22–32 (1985).

Thomas Prellberg, "Scaling of self-avoiding walks and self-avoiding trails in three dimensions," J. Phys. A 34, L599–L602 (2001). The author estimates that ν ≈ 0.5874(2) for the self-avoiding walk in three dimensions.

Thomas Prellberg and Jaroslaw Krawczyk, "Flat histogram version of the pruned and enriched Rosenbluth method," Phys. Rev. Lett. 92, 120602 (2004). The authors discuss an improved algorithm for simulating self-avoiding walks.

William H. Press, Saul A.
Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). This classic book is available online. See Chapter 15 for a general discussion of the modeling of data, including general linear least squares and nonlinear fits, and Chapter 19 for a discussion of the Crank-Nicolson method for solving diffusion-type partial differential equations.

Sidney Redner, A Guide to First-Passage Processes (Cambridge University Press, 2001).

Sidney Redner and Francois Leyvraz, "Kinetics and spatial organization of competitive reactions," Chapter 7 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1965). This popular text on statistical physics has a good discussion on random walks (Chapter 1) and diffusion (Chapter 12).

F. Reif, Statistical Physics, Berkeley Physics Course, Vol. 5 (McGraw-Hill, 1965). Chapter 2 introduces random walks.

Marshall N. Rosenbluth and Arianna W. Rosenbluth, "Monte Carlo calculation of the average extension of molecular chains," J. Chem. Phys. 23, 356–359 (1955). One of the first Monte Carlo calculations of the self-avoiding walk.

Joseph Rudnick and George Gaspari, Elements of the Random Walk (Cambridge University Press, 2004). A graduate level text, but parts are accessible to undergraduates.

David Ruelle, Chance and Chaos (Princeton University Press, 1991). A nontechnical introduction to chaos theory that discusses the relation of chaos to randomness.

Charles Ruhla, The Physics of Chance (Oxford University Press, 1992). A delightful book on probability in many contexts.

Andreas Ruttor, Georg Reents, and Wolfgang Kinzel, "Synchronization of random walks with reflecting boundaries," J. Phys. A: Math. Gen. 37, 8609–8618 (2004).

G. L. Squires, Practical Physics, 4th ed. (Cambridge University Press, 2001). An excellent text on the design of experiments and the analysis of data.

John R. Taylor, An Introduction to Error Analysis, 2nd ed. (University Science Books, 1997).

Edward R. Tufte, The Visual Display of Quantitative Information (Graphics Press, 1983) and Envisioning Information (Graphics Press, 1990). Also see Tufte's Web site.

I. Vattulainen, T. Ala-Nissila, and K. Kankaala, "Physical tests for random numbers in simulations," Phys. Rev. Lett. 73, 2513 (1994) and "Physical models as tests of randomness," Phys. Rev. E 52, 3205–3214 (1995). Also see Vattulainen's Web site, which has some useful programs.

Peter H. Verdier and W. H. Stockmayer, "Monte Carlo calculations on the dynamics of polymers in dilute solution," J. Chem. Phys. 36, 227–235 (1962).

Frederick T. Wall and Frederic Mandel, "Macromolecular dimensions obtained by an efficient Monte Carlo method without sample attrition," J. Chem. Phys. 63, 4592–4595 (1975). An exposition of the reptation method.

George H. Weiss, "A primer of random walkology," Chapter 5 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

George H. Weiss and Shlomo Havlin, "Trapping of random walks on the line," J. Stat. Phys. 37, 17–25 (1984). The authors discuss an analytic approach to the asymptotic behavior of one-dimensional random walkers with randomly placed traps.

George H. Weiss and Robert J. Rubin, "Random walks: Theory and selected applications," Adv. Chem. Phys. 52, 363–503 (1983).
In spite of its research orientation, much of this review article can be understood by well-motivated students.

Charles A. Whitney, Random Processes in Physical Systems (John Wiley and Sons, 1990). An excellent introduction to random processes with many applications to astronomy.

Robert S. Wolff and Larry Yaeger, Visualization of Natural Phenomena (Springer-Verlag, 1993).

Chapter 8
The Dynamics of Many-Particle Systems

We simulate the dynamical behavior of many-particle systems, such as dense gases, liquids, and solids, and observe their qualitative features. Some of the basic ideas of equilibrium statistical mechanics and kinetic theory are introduced.

8.1 Introduction

Given our knowledge of the laws of physics at the microscopic level, how can we understand the behavior of gases, liquids, and solids and more complex systems such as polymers and proteins? For example, consider two cups of water prepared under similar conditions. Each cup contains approximately 10^{25} molecules which mutually interact and, to a good approximation, move according to the laws of classical physics. Although the intermolecular forces produce a complicated trajectory for each molecule, the observable properties of the water in each cup are indistinguishable and are easy to describe. For example, the temperature of the water in each cup is independent of time even though the positions and velocities of the individual molecules are changing continually.

One way to understand the behavior of a classical many-particle system is to simulate the trajectory of each particle. This approach, known as molecular dynamics, has been applied to systems of up to 10^9 particles and has given us much insight into a variety of systems in which the particles obey the laws of classical dynamics.

A calculation of the trajectories of many particles would not be very useful unless we knew the right questions to ask. Saving these trajectories would quickly fill up any storage medium, and we do not usually care about the trajectory of any particular particle. What are the useful quantities needed to describe these many-particle systems? What are the essential characteristics and regularities they exhibit? Questions such as these are addressed by statistical mechanics, and some of the ideas of statistical mechanics are discussed in this chapter. However, the only background needed for this chapter is a knowledge of Newton's laws of motion.

8.2 The Intermolecular Potential

The first step is to specify the model system we wish to simulate. We assume that the dynamics can be treated classically, that the molecules are spherical and chemically inert and their internal structure can be ignored, and that the interaction between any pair of particles depends only on the distance between them. In this case the total potential energy U is a sum of two-particle interactions:

	U = u(r_{12}) + u(r_{13}) + ··· + u(r_{23}) + ··· = Σ_{i=1}^{N-1} Σ_{j=i+1}^{N} u(r_{ij})  (8.1)

where u(r_{ij}) depends only on the magnitude of the distance r_{ij} between particles i and j. The pairwise interaction form (8.1) is appropriate for simple liquids such as liquid argon. The form of u(r) for electrically neutral molecules can be constructed by a first principles quantum mechanical calculation. Such a calculation is very difficult, and it is usually sufficient to choose a simple phenomenological form for u(r). The most important features of u(r) are a strong repulsion for small r and a weak attraction at large r.
The repulsion for small r is a consequence of the Pauli exclusion principle. That is, the electron wave functions of two molecules must distort to avoid overlap, causing some of the electrons to be in different quantum states. The net effect is an increase in kinetic energy and an effective repulsive interaction between the electrons, known as core repulsion. The weak attraction that dominates at larger r is due to the mutual polarization of each molecule; the resultant attractive potential is called the van der Waals potential. One of the most common phenomenological forms of u(r) is the Lennard-Jones potential:

	u(r) = 4ε[(σ/r)^{12} − (σ/r)^6].  (8.2)

A plot of the Lennard-Jones potential is shown in Figure 8.1. The r^{-12} form of the repulsive part of the interaction was chosen for convenience only and has no fundamental significance. The attractive 1/r^6 behavior at large r corresponds to the van der Waals interaction. The Lennard-Jones potential is parameterized by a length σ and an energy ε. Note that u(r) = 0 at r = σ and that u(r) is close to zero for r > 2.5σ. The parameter ε is the depth of the potential at the minimum of u(r); the minimum occurs at a separation r = 2^{1/6}σ.

Problem 8.1. Qualitative properties of the Lennard-Jones interaction

Write a short program to plot the Lennard-Jones potential (8.2) and the magnitude of the corresponding force:

	f(r) = −∇u(r) = (24ε/r)[2(σ/r)^{12} − (σ/r)^6] r̂.  (8.3)

(A sketch of such a computation in reduced units is given at the end of Section 8.3.) At what value of r is the force equal to zero? For what values of r is the force repulsive? What is the value of u(r) for r = 0.8σ? How much does u increase if r is decreased to r = 0.72σ, a 10% change in r? What is the value of u at r = 2.5σ?

8.3 Units

As usual, it is convenient to choose units so that the computed quantities are neither too small nor too large. Because the values of the distance and the energy associated with typical liquids are very small in SI units, we choose the Lennard-Jones parameters σ and ε as the units of distance and energy, respectively. We also choose the unit of mass to be the mass of one atom, m. We can express all other quantities in terms of σ, ε, and m. For example, we measure velocities in units of (ε/m)^{1/2} and time in units of σ(m/ε)^{1/2}. The values of σ, ε, and m for argon are given in Table 8.1. If we use these values, we find that the unit of time is 2.17 × 10^{-12} s. The units of some of the other physical quantities of interest are also shown in Table 8.1.

All program variables are in reduced units; for example, the time in our molecular dynamics program is expressed in units of σ(m/ε)^{1/2}. Suppose that we run our molecular dynamics program for 2000 time steps with a time step ∆t = 0.01. The total time of our run is 2000 × 0.01 = 20 in reduced units, or 4.34 × 10^{-11} s for argon (see Table 8.1). The duration of a typical molecular dynamics simulation is in the range of 10–10^4 in reduced units, corresponding to a duration of approximately 10^{-11}–10^{-8} s. The longest practical runs are of the order of 10^{-6} s.
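The evaluation of (8.2) and (8.3) in reduced units (σ = ε = 1) is shown in the following minimal sketch, which can be used for Problem 8.1. The class and method names are our own choices:

public class LennardJones {
  // potential energy u(r) of (8.2) in reduced units (sigma = epsilon = 1)
  public static double potential(double r) {
    double r6i = 1.0/(r*r*r*r*r*r); // (sigma/r)^6
    return 4*(r6i*r6i-r6i);
  }

  // magnitude of the force (8.3) in reduced units
  public static double force(double r) {
    double r6i = 1.0/(r*r*r*r*r*r);
    return 24*(2*r6i*r6i-r6i)/r;
  }

  public static void main(String[] args) {
    for(double r = 0.8; r<=2.5; r += 0.1) {
      System.out.println(r+" "+potential(r)+" "+force(r));
    }
  }
}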
8.4 The Numerical Algorithm

Now that we have specified the interaction between the particles, we need to introduce a numerical method for computing the trajectory of each particle. As we have learned, the criteria for a good numerical integration method include that it conserve the phase-space volume and be consistent with the known conservation laws, that it be time reversible, and that it be accurate for relatively large time steps so as to reduce the CPU time needed for the total time of the simulation. These requirements mean that we should use a symplectic algorithm for the relatively long times of interest in molecular dynamics simulations. We adopt the commonly used second-order algorithm:

	x_{n+1} = x_n + v_n ∆t + ½ a_n (∆t)²  (8.4a)
	v_{n+1} = v_n + ½ (a_{n+1} + a_n) ∆t.  (8.4b)

	Quantity      Unit            Value for Argon
	length        σ               3.4 × 10^{-10} m
	energy        ε               1.65 × 10^{-21} J
	mass          m               6.69 × 10^{-26} kg
	time          σ(m/ε)^{1/2}    2.17 × 10^{-12} s
	velocity      (ε/m)^{1/2}     1.57 × 10^2 m/s
	force         ε/σ             4.85 × 10^{-12} N
	pressure      ε/σ²            1.43 × 10^{-2} N·m^{-1}
	temperature   ε/k             120 K

Table 8.1: The system of units used in the molecular dynamics simulations of particles interacting via the Lennard-Jones potential. The numerical values of σ, ε, and m are for argon. The quantity k is Boltzmann's constant and has the value k = 1.38 × 10^{-23} J/K. The unit of pressure is for a two-dimensional system.

To simplify the notation, we have written the algorithm for only one component of the particle's motion. The new position is used to find the new acceleration a_{n+1}, which is used together with a_n to obtain the new velocity v_{n+1}. The algorithm represented by (8.4) is known as the Verlet (or sometimes the velocity Verlet) algorithm (see Appendix 3A). We will use the Verlet implementation of the ODESolver interface to implement the algorithm. Thus, the x, vx, y, and vy values for the ith particle will be stored in the state array at state[4*i], state[4*i+1], state[4*i+2], and state[4*i+3], respectively.
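Written out for a single degree of freedom, one step of (8.4) looks as follows. This sketch is our own; it uses a harmonic oscillator acceleration a(x) = −x as a stand-in, whereas in the molecular dynamics program the accelerations of all particles are computed together and the OSP Verlet class handles these details:

public class VerletStep {
  static double x = 1, v = 0;   // position and velocity

  static double a(double x) {   // example: harmonic oscillator
    return -x;
  }

  public static void main(String[] args) {
    double dt = 0.01;
    for(int n = 0; n<1000; n++) {
      double aOld = a(x);
      x += v*dt+0.5*aOld*dt*dt; // (8.4a): new position
      double aNew = a(x);       // acceleration at the new position
      v += 0.5*(aOld+aNew)*dt;  // (8.4b): new velocity
    }
    System.out.println("x = "+x+", v = "+v);
  }
}

Note that the acceleration is evaluated only once per step: the value aNew computed here becomes aOld on the next iteration.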
8.5 Periodic Boundary Conditions

A useful simulation must incorporate as many of the relevant features of the physical system of interest as possible. Usually we want to simulate a gas, liquid, or solid in the bulk, that is, systems of at least N ∼ 10^{23} particles. In such systems the fraction of particles near the walls of the container is negligibly small. The number of particles that can be studied in a molecular dynamics simulation is typically 10^3–10^5, although we can simulate on the order of 10^9 particles using clusters of computers. For these relatively small systems, the fraction of particles near the walls of the container is significant, and hence the behavior of such a system would be dominated by surface effects.

The most common way of minimizing surface effects and of simulating more closely the properties of a bulk system is to use what are known as periodic boundary conditions, although the minimum image approximation would be a more accurate name. This boundary condition is familiar to anyone who has played the Pac-Man computer game. Consider N particles that are constrained to move on a line of length L. The application of periodic boundary conditions is equivalent to considering the line to be a circle, and hence the maximum separation between any two particles is L/2 (see Figure 8.2). The generalization of periodic boundary conditions to two dimensions is equivalent to imagining a box with opposite edges joined so that the box becomes the surface of a torus (the shape of a doughnut or a bagel). The three-dimensional version of periodic boundary conditions cannot be visualized easily, but the same methods can be used.

Figure 8.2: (a) Two particles at x = 0 and x = 3 on a line of length L = 4; the distance between the particles is 3. (b) The application of periodic boundary conditions for short range interactions is equivalent to thinking of the line as forming a circle of circumference L. In this case the minimum distance between the two particles is 1.

The implementation of periodic boundary conditions is straightforward. If a particle leaves the box by crossing a boundary in a particular direction, we add or subtract the length L of the box in that direction to the position. One simple way is to use an if-else statement as shown:

Listing 8.1: Calculation of the position of a particle in the central cell.

private double pbcPosition(double s, double L) {
  if(s>L) {
    s -= L;
  } else if(s<0) {
    s += L;
  }
  return s;
}

To compute the minimum distance ds in a particular direction between two particles, we can use the method pbcSeparation (see Figure 8.2):

Listing 8.2: Calculation of the minimum separation.

private double pbcSeparation(double ds, double L) {
  if(ds>0.5*L) {
    ds -= L;
  } else if(ds<-0.5*L) {
    ds += L;
  }
  return ds;
}

The equivalent static methods, PBC.position and PBC.separation, in the Open Source Physics numerics package can also be used.

Exercise 8.2. Use of the % operator

(a) Another way to compute the position of a particle in the central cell is to use the % (modulus) operator. For example, 17 % 5 equals 2 because 17 divided by 5 leaves a remainder of 2. The % operator can also be used with floating point numbers. For example, 10.2 % 5 = 0.2. Write a little test program to see how the % operator works and determine the result of 10.2 % 3.3, -10.2 % 3.3, 10.2 % -3.3, and -10.2 % -3.3. In what way does % act like a remainder operator?

(b) From the results of part (a) we might consider writing x = x % L as an alternative to Listing 8.1. What about negative values of x? In this case -17 % 5 = -2. Because we want the resultant position to be positive, we write

	return x<0 ? x%L+L : x%L;

Explain this syntax and write a program to test whether this statement works as claimed.

(c) Write a simple program to determine whether the % operator is faster than the if-else construction in Listing 8.1. Write another program that compares the speed of calling the PBC.position method to that of inlining the PBC code; in other words, replace the method call by the above statement.

We now discuss the nature of periodic boundary conditions. Imagine a set of N particles in a two-dimensional box or cell. The use of periodic boundary conditions implies that the central cell is duplicated an infinite number of times to fill the space. Figure 8.3 shows the first several image cells for N = 2. The shape of the central cell must be such that the cell fills space under successive translations. Each image cell contains the original particles in the same relative positions as in the central cell. That is, periodic boundary conditions yield an infinite system, although the positions of the particles in the image cells are identical to the positions of the particles in the central cell. These boundary conditions also imply that every point in the cell is equivalent and that there is no surface. As a particle moves in the original cell, its periodic images move in the image cells. Hence, only the motion of the particles in the central cell needs to be followed.
When a particle enters or leaves the central cell, the move is accompanied by an image of that particle leaving or entering a neighboring cell through the opposite face. The total force on a given particle i is due to the force from every other particle j within the central cell and from the periodic images of particle j. That is, if particle i interacts with particle j in the central cell, then particle i interacts with all the periodic replicas of particle j. Hence, in general, there are an infinite number of contributions to the force on any given particle. For long-range interactions such as the Coulomb potential, these contributions have to be included using special methods. For short-range interactions, we can reduce the number of contributions by adopting the minimum image approximation, which assumes that particle i in the central cell interacts only with the nearest image of particle j; the interaction is set equal to zero if the distance of the image from particle i is greater than L/2. An example of the minimum image approximation is shown in Figure 8.3.

Figure 8.3: Example of the minimum image approximation in two dimensions. The minimum image distance convention implies that the separation between particles 1 and 2 is given by the lesser of the two distances shown.

8.6 A Molecular Dynamics Program

In this section we develop a molecular dynamics program to simulate a two-dimensional system of particles interacting via the Lennard-Jones potential. We choose two rather than three dimensions because it is easier to visualize the results and the calculations are not as time consuming. In principle, we could define a class for a particle and instantiate an object for each particle. However, this use would be very inefficient and would take up more memory and CPU time than using one class to represent all N particles. Instead we will store the x- and y-components of the positions and velocities in the state array and store the accelerations of the particles in a separate array. As usual, we will develop two classes, LJParticles and LJParticlesApp.

Because the system is deterministic, the nature of the motion is determined by the initial conditions. An appropriate choice of the initial conditions is more difficult than might first appear. For example, how can we choose the initial configuration (a set of positions and velocities) to correspond to a liquid at a desired temperature? According to the equipartition theorem, the mean kinetic energy of a particle per degree of freedom is kT/2, where k is Boltzmann's constant and T is the temperature. We can generalize this relation to define the temperature at time t by

	kT(t) = (2/d) K(t)/N = (1/Nd) Σ_{i=1}^{N} m_i v_i(t) · v_i(t)  (8.5)

where K is the total kinetic energy of the system, v_i is the velocity of particle i with mass m_i, and d is the spatial dimension of the system. We can use (8.5) to choose an initial set of velocities. The following method gives the particles a random set of velocities, sets the total velocity (momentum) to zero, and then rescales the velocities so that the desired initial kinetic energy is achieved.

Listing 8.3: Method for choosing the initial velocities.

public void setVelocities() {
  double vxSum = 0.0;
  double vySum = 0.0;
  // the remainder of the listing is not reproduced here
}
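A minimal sketch of the missing body of this method, following the description just given (assign random velocities, subtract the center of mass velocity so the total momentum is zero, then rescale to the desired kinetic energy per particle), is shown below. This is our own sketch and is not necessarily identical to the class used in the text; it assumes unit masses and separate velocity arrays:

public class VelocityInitializer {
  // random velocities, zero total momentum, then rescaling to the desired
  // kinetic energy per particle (m = 1 in reduced units)
  public static void setVelocities(double[] vx, double[] vy, double kePerParticle) {
    int N = vx.length;
    double vxSum = 0, vySum = 0;
    for(int i = 0; i<N; i++) {  // assign random initial velocities
      vx[i] = Math.random()-0.5;
      vy[i] = Math.random()-0.5;
      vxSum += vx[i];
      vySum += vy[i];
    }
    double vxCM = vxSum/N;      // center of mass velocity components
    double vyCM = vySum/N;
    double v2Sum = 0;
    for(int i = 0; i<N; i++) {  // zero the total momentum
      vx[i] -= vxCM;
      vy[i] -= vyCM;
      v2Sum += vx[i]*vx[i]+vy[i]*vy[i];
    }
    // kinetic energy per particle is v2Sum/(2N) before rescaling
    double rescale = Math.sqrt(2*N*kePerParticle/v2Sum);
    for(int i = 0; i<N; i++) {
      vx[i] *= rescale;
      vy[i] *= rescale;
    }
  }
}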
println ( "Heat capacity = " +decimalFormat . format (md. getHeatCapacity ( ) ) ) ; CHAPTER 8. THE DYNAMICS OF MANY-PARTICLE SYSTEMS 268 control . println ( " = " +decimalFormat . format (md. getMeanPressure ( ) ) ) ; } public void startRunning ( ) { md. dt = control . getDouble ( "dt" ) ; double Lx = control . getDouble ( "Lx" ) ; double Ly = control . getDouble ( "Ly" ) ; i f ( ( Lx!=md. Lx ) | | ( Ly!=md. Ly ) ) { md. Lx = Lx ; md. Ly = Ly ; md. computeAcceleration ( ) ; display . setPreferredMinMax (0 , Lx , 0 , Ly ) ; resetData ( ) ; } } public void reset ( ) { control . setValue ( "nx" , 8 ) ; control . setValue ( "ny" , 8 ) ; control . setAdjustableValue ( "Lx" , 2 0 . 0 ) ; control . setAdjustableValue ( "Ly" , 1 5 . 0 ) ; control . setValue ( "initial kinetic energy per particle" , 1 . 0 ) ; control . setAdjustableValue ( "dt" , 0 . 0 1 ) ; control . setValue ( "initial configuration" , "rectangular" ) ; enableStepsPerDisplay ( true ) ; / / draw c o n f i g u r a t i o n s every 10 s t e p s super . setStepsPerDisplay ( 1 0 ) ; / / so p a r t i c l e s w i l l appear as c i r c u l a r d i s k s display . setSquareAspect ( true ) ; } public void resetData ( ) { md. resetAverages ( ) ; / / c l e a r s old data from the p l o t frames GUIUtils . clearDrawingFrameData ( false ) ; } public s t a t i c XML. ObjectLoader getLoader ( ) { return new LJParticlesLoader ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl control = SimulationControl . createApp ( new LJParticlesApp ( ) ) ; control . addButton ( "resetData" , "Reset Data" ) ; } } Listing 8.12: The LJParticlesLoader class for saving configurations. package org . opensourcephysics . sip . ch08 .md; import org . opensourcephysics . controls . ; import org . opensourcephysics . display . GUIUtils ; CHAPTER 8. THE DYNAMICS OF MANY-PARTICLE SYSTEMS 269 public class LJParticlesLoader implements XML. ObjectLoader { public Object createObject ( XMLControl element ) { return new LJParticlesApp ( ) ; } public void saveObject ( XMLControl control , Object obj ) { LJParticlesApp model = ( LJParticlesApp ) obj ; control . setValue ( "initial_configuration" , model .md. initialConfiguration ) ; control . setValue ( "state" , model .md. s t a t e ) ; } public Object loadObject ( XMLControl control , Object obj ) { / / GUI has been loaded with the saved values ; now r e s t o r e the LJ s t a t e LJParticlesApp model = ( LJParticlesApp ) obj ; / / reads values from the GUI into the LJ model model . i n i t i a l i z e ( ) ; model .md. initialConfiguration = control . getString ( "initial_configuration" ) ; model .md. s t a t e = ( double [ ] ) control . getObject ( "state" ) ; int N = ( model .md. s t a t e . length −1)/4; model .md. ax = new double [N] ; model .md. ay = new double [N] ; model .md. computeAcceleration ( ) ; model .md. resetAverages ( ) ; / / c l e a r s old data from the p l o t frames GUIUtils . clearDrawingFrameData ( false ) ; return obj ; } } Problem 8.3. Approach to equilibrium (a) Consider N = 64 particles interacting via the Lennard–Jones potential in a square central cell of linear dimension L = 10. Start the system on a square lattice with an initial temperature corresponding to T = 1.0. Let ∆t = 0.01 and run the application to make sure that it is working properly. The total energy should be approximately conserved and the trajectories of all 64 particles should be seen on the screen. (b) The kinetic temperature of the system is given by (8.5). 
View the evolution of the temperature of the system starting from the initial temperature. Does the temperature reach an equilibrium value? That is, does it eventually fluctuate about some mean value? What is the mean value of the temperature for the given total energy of the system?

(c) Modify method setRectangularLattice so that all the particles are initially on the left side of a box of linear dimensions Lx = 20 and Ly = 10. Does the system become more or less random as time increases?

(d) Modify the program so that it computes n(t), the number of particles in the left half of the cell, and plot n(t) as a function of t (a counting routine is sketched after Problem 8.4). What is the qualitative behavior of n(t)? What is the mean number of particles in the left half after the system has reached equilibrium? Compare your qualitative results with the results you found in Problem 7.2.

Figure 8.4: Example of a special initial condition; the arrows represent the magnitude and the direction of each particle's velocity.

Problem 8.4. Sensitivity to initial conditions

(a) Modify your program to consider the following initial condition corresponding to N = 11 particles moving in the same direction with the same velocity (see Figure 8.4). Choose Lx = Ly = 10 and Δt = 0.01.

for(int i = 0; i<N; i++) {
  x[i] = Lx/2;
  y[i] = (i-0.5)*Ly/N;
  vx[i] = 1;
  vy[i] = 0;
}

Does the system eventually reach equilibrium? Why or why not?

(b) Change the velocity of particle 6 so that vx(6) = 0.99999 and vy(6) = 0.00001. Is the behavior of the system qualitatively different than in part (a)? Does the system eventually reach equilibrium? Are the trajectories of the particles sensitive to the initial conditions? Explain why this behavior implies that almost all initial states lead to the same qualitative behavior (for a given total energy).

(c) Modify LJParticlesApp so that the application runs for a predetermined time interval, such as 100 time steps, and then continues with the time-reversed process, that is, the motion that would occur if the direction of time were reversed. This reversal is equivalent to letting v → −v for all particles or letting Δt → −Δt. Do the particles return to their original positions? What happens if you reverse the velocities at a later time? What happens if you choose a smaller value of Δt?

(d) Explain why you can conclude that the system is chaotic. Are the computed trajectories the same as the "true" trajectories?
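For Problem 8.3(d), the following is a minimal sketch of a method that counts the particles in the left half of the cell. It assumes the state-array layout of LJParticles used in this chapter (the x coordinate of particle i is state[4i]); the method name is illustrative and is not part of the book's code.

public int numberInLeftHalf() {
  // counts particles whose x coordinate is in the left half of the cell
  int n = 0;
  for(int i = 0; i<N; i++) {
    if(state[4*i]<0.5*Lx) { // x coordinate of particle i
      n++;
    }
  }
  return n;
}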
From Problems 8.3 and 8.4, we see that from the microscopic point of view, the trajectories appear rather complex. In contrast, from the macroscopic point of view, the system can be described more simply. For example, in Problem 8.3 we described the approach of the system to equilibrium by specifying n(t), the number of particles in the left half of the cell at time t. Your observations of the macroscopic variable n(t) should be consistent with the following two general properties of systems of many particles:

1. After the removal of an internal constraint, an isolated system changes in time from a "less random" to a "more random" state.

2. A system whose macroscopic state is independent of time is said to be in equilibrium. The equilibrium macroscopic state is characterized by relatively small fluctuations about a mean that is independent of time. The relative fluctuations become smaller as the number of particles becomes larger.

In Problems 8.3b and 8.3c we found that the particles filled the box and did not return to their initial configuration. Hence, we were able to define a direction of time. This direction becomes better defined if we consider more particles. Note that Newton's laws of motion are time reversible, and there is no a priori reason that gives the time a preferred direction.

Before we consider other macroscopic quantities, we need to monitor the total energy and verify our claim that the Verlet algorithm maintains conservation of energy with a reasonable choice of Δt. We also introduce a check for momentum conservation.

Problem 8.5. Tests of the Verlet algorithm

(a) One essential check of a molecular dynamics program is that the total energy be conserved to the desired accuracy. Determine the value of Δt necessary for the total energy to be conserved to a given accuracy over a time interval of t = 2. One way is to compute ΔE_max(t), the maximum value of the difference |E(t) − E(0)| over the time interval t, where E(0) is the initial total energy and E(t) is the total energy at time t. Verify that ΔE_max(t) decreases when Δt is made smaller for fixed t. If your application is working properly, ΔE_max(t) should decrease as approximately (Δt)² because the Verlet algorithm is a second-order algorithm.

(b) A simple way of monitoring how well the program is conserving the total energy is to use a least squares fit of the time series of E(t) to a straight line. The slope of the line can be interpreted as the drift, and the root mean square deviation from the straight line can be interpreted as the noise (σ_y in the notation of Section 7.6). How do the drift and the noise depend on Δt for a fixed time interval t? Most research applications conserve the energy to 1 part in 10⁴ or better over the duration of the run.

(c) Because of the use of periodic boundary conditions, all points in the central cell are equivalent and the system is translationally invariant. As you might have learned, translational invariance implies that the total linear momentum is conserved. However, floating point error and the truncation error associated with a finite difference algorithm can cause the total linear momentum to drift. Programming errors might also be detected by checking for conservation of momentum. Hence, it is a good idea to monitor the total linear momentum at regular intervals and reset the total momentum equal to zero if necessary. The method setVelocities in Listing 8.3 chooses the velocities so that the total momentum is initially zero. Add a method that resets the total momentum to zero and call it at regular intervals, for example, every 1000–10,000 time steps. How well does class LJParticles conserve the total linear momentum for Δt = 0.01?

8.7 Thermodynamic Quantities

In the following, we discuss how some of the macroscopic quantities of interest, such as the temperature and the pressure, can be related to time averages over the phase space trajectories of the particles.

We have already introduced the definition of the kinetic temperature in (8.5). The temperature that we measure in a laboratory experiment is the mean temperature, which corresponds to the time average of T(t) over many configurations of the particles. For two dimensions (d = 2), we write the mean temperature $\overline{T}$ as

\[ k\overline{T} = \frac{1}{2N}\sum_{i=1}^{N} \overline{m_i\,\mathbf{v}_i(t)\cdot\mathbf{v}_i(t)} \qquad \text{(two dimensions)} \tag{8.6} \]

where $\overline{X}$ denotes the time average of X(t).
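In a program, such a time average is typically accumulated as a running sum over the configurations generated during the run. A minimal sketch, with illustrative accumulator names:

// accumulate the time (step) average of the kinetic temperature in Eq. (8.6);
// temperatureSum and steps are hypothetical accumulator fields, and
// getInstantaneousTemperature() implements Eq. (8.5)
temperatureSum += getInstantaneousTemperature();
steps++;
double meanTemperature = temperatureSum/steps; // estimate of the mean temperature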
The relation (8.6) is an example of the relation of a macroscopic quantity (the mean temperature) to a time average over the trajectories of the particles. (This definition of temperature is not adequate for particles moving relativistically or if quantum mechanics is important.)

The relation (8.5) holds only if the momentum of the center of mass of the system is zero; we do not want the motion of the center of mass to change the temperature. In a laboratory system, the walls of the container ensure that the center of mass motion is zero (if the mean momentum of the walls is zero). In our simulation, we impose the constraint that the center of mass momentum (in each of the d directions) be zero. Consequently, the system has dN − d independent velocity components rather than dN components, and we should replace (8.6) by

\[ k\overline{T} = \frac{1}{(N-1)d}\sum_{i=1}^{N} \overline{m_i\,\mathbf{v}_i(t)\cdot\mathbf{v}_i(t)} \qquad \text{(correction for fixed center of mass)}. \tag{8.7} \]

The presence of the factor (N − 1)d rather than Nd in (8.7) is an example of a finite size correction that becomes unimportant for large N. We shall ignore this correction in the following.

Another macroscopic quantity of interest is the mean pressure. The pressure is related to the force per unit area normal to an imaginary surface in the system. By Newton's second law, this force is related to the momentum that crosses the surface per unit time. We could use this relation to determine the pressure, but this relation uses information only from the fraction of particles that are crossing an arbitrary surface at a given time. Instead, we will use the relation of the pressure to the virial, which involves all the particles in the system. In general, the momentum flux across a surface has two contributions. The first contribution, NkT/V, where V is the volume (area) of the system, is due to the motion of the particles and is derived in many texts using simple kinetic theory arguments. The other contribution to the momentum flux arises from the momentum transferred across the surface due to the forces between particles on different sides of the surface. It can be shown that the instantaneous pressure at time t, including both contributions to the momentum flux, is given by

\[ P(t)V = NkT(t) + \frac{1}{d}\sum_{i<j} \mathbf{r}_{ij}\cdot\mathbf{F}_{ij} \tag{8.8} \]

where F_ij is the force on particle i due to particle j.

Consider the behavior of the radial distribution function g(r) for a dense liquid (ρ > 0.6, T ≈ 1.0) with N ≥ 64. How many peaks in g(r) can you observe? In what ways do they change as the density is increased? How does the behavior of g(r) for a dense liquid compare to that of a dilute gas and a solid?

8.9 Hard Disks

How can we understand the temperature and density dependence of the equation of state and the structure of a dense liquid? One way to gain more insight into this dependence is to modify the interaction and see how the properties of the system change. In particular, we would like to understand the relative role of the repulsive and attractive parts of the interaction. For this reason, we consider an idealized system of hard disks for which the interaction u(r) is purely repulsive:

\[ u(r) = \begin{cases} +\infty, & r < \sigma \\ 0, & r \ge \sigma. \end{cases} \tag{8.19} \]

The length σ is the diameter of the hard disks (see Figure 8.7). In three dimensions the interaction (8.19) describes the interaction of hard spheres (billiard balls); in one dimension (8.19) describes the interaction of hard rods. Because the interaction u(r) between hard disks is a discontinuous function of r, the dynamics of hard disks is qualitatively different than it is for a continuous interaction such as the Lennard–Jones potential.
For hard disks the particles move in straight lines at constant speed between collisions and change their velocities instantaneously when a collision occurs. Hence the problem becomes finding the next collision and computing the change in the velocities of the colliding pair. The dynamics is event driven and can be computed exactly in principle; in practice, it is limited only by roundoff errors.

The dynamics of a system of hard disks can be treated as a sequence of two-body elastic collisions. The idea is to consider all pairs of particles i and j and to find the collision time t_ij for their next collision, ignoring the presence of all other particles. In many cases the particles will be moving away from each other and the collision time is infinite. From the collection of collision times for all pairs of particles, we find the minimum collision time. We then move all the particles forward in time until the collision occurs and calculate the postcollision velocities of the colliding pair. The main problem is dealing with the large number of possible collision events.

We first determine the particle velocities of the colliding pair. Consider a collision between particles i and j. Let $\mathbf{v}_i$ and $\mathbf{v}_j$ be their velocities before the collision and $\mathbf{v}_i'$ and $\mathbf{v}_j'$ be their velocities after the collision. Because the particles have equal mass, it follows from conservation of energy and linear momentum that

\[ v_i'^2 + v_j'^2 = v_i^2 + v_j^2 \tag{8.20} \]
\[ \mathbf{v}_i' + \mathbf{v}_j' = \mathbf{v}_i + \mathbf{v}_j. \tag{8.21} \]

From (8.21) we have

\[ \Delta\mathbf{v}_i = \mathbf{v}_i' - \mathbf{v}_i = -(\mathbf{v}_j' - \mathbf{v}_j) = -\Delta\mathbf{v}_j. \tag{8.22} \]

When two hard disks collide, the force is exerted along the line connecting their centers, $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$. Hence, the components of the velocities parallel to $\mathbf{r}_{ij}$ are exchanged, and the perpendicular components of the velocities are unchanged. It is convenient to write the velocity of particles i and j as a vector sum of their components parallel and perpendicular to the unit vector $\hat{\mathbf{r}}_{ij} = \mathbf{r}_{ij}/|\mathbf{r}_{ij}|$. We write the velocity of particle i as

\[ \mathbf{v}_i = \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\perp} \tag{8.23} \]

where $\mathbf{v}_{i,\parallel} = (\mathbf{v}_i\cdot\hat{\mathbf{r}}_{ij})\hat{\mathbf{r}}_{ij}$, and

\[ \mathbf{v}_{i,\parallel}' = \mathbf{v}_{j,\parallel} \qquad \mathbf{v}_{j,\parallel}' = \mathbf{v}_{i,\parallel} \tag{8.24a} \]
\[ \mathbf{v}_{i,\perp}' = \mathbf{v}_{i,\perp} \qquad \mathbf{v}_{j,\perp}' = \mathbf{v}_{j,\perp}. \tag{8.24b} \]

Hence, we can write $\mathbf{v}_i'$ as

\[ \mathbf{v}_i' = \mathbf{v}_{i,\parallel}' + \mathbf{v}_{i,\perp}' = \mathbf{v}_{j,\parallel} + \mathbf{v}_{i,\perp} = \mathbf{v}_{j,\parallel} - \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\perp} = \big[(\mathbf{v}_j - \mathbf{v}_i)\cdot\hat{\mathbf{r}}_{ij}\big]\hat{\mathbf{r}}_{ij} + \mathbf{v}_i. \tag{8.25} \]

The change in the velocity of particle i at a collision is given by

\[ \Delta\mathbf{v}_i = \mathbf{v}_i' - \mathbf{v}_i = -\big[(\mathbf{v}_i - \mathbf{v}_j)\cdot\hat{\mathbf{r}}_{ij}\big]\hat{\mathbf{r}}_{ij} \tag{8.26} \]

or

\[ \Delta\mathbf{v}_i = -\Delta\mathbf{v}_j = -\frac{b_{ij}}{\sigma^2}\,\mathbf{r}_{ij}\Big|_{\text{contact}} \tag{8.27} \]

where $b_{ij} = \mathbf{v}_{ij}\cdot\mathbf{r}_{ij}$, $\mathbf{v}_{ij} = \mathbf{v}_i - \mathbf{v}_j$, and we have used the fact that $|\mathbf{r}_{ij}| = \sigma$ at contact.

Exercise 8.14. Velocity distribution of hard rods

Use (8.20) and (8.21) to show that $\mathbf{v}_i' = \mathbf{v}_j$ and $\mathbf{v}_j' = \mathbf{v}_i$ in one dimension; that is, two colliding hard rods of equal mass exchange velocities. If you start a system of hard rods with velocities chosen from a uniform random distribution, will the velocity distribution approach the equilibrium Maxwell–Boltzmann distribution?
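The collision rule (8.27) is simple to implement. The following is a minimal sketch of the velocity update for a colliding pair; the array names and the explicit arguments are illustrative, units with σ = 1 are assumed, and periodic image corrections to r_ij are omitted.

public void contact(int i, int j) {
  // post-collision velocities for two equal-mass hard disks, Eq. (8.27);
  // assumes x, y, vx, vy arrays and sigma = 1, and that the pair is at contact
  double rx = x[i]-x[j], ry = y[i]-y[j];       // r_ij at contact
  double dvx = vx[i]-vx[j], dvy = vy[i]-vy[j]; // v_ij
  double factor = (dvx*rx+dvy*ry);             // b_ij = v_ij · r_ij, divided by sigma^2 = 1
  vx[i] -= factor*rx; vy[i] -= factor*ry;      // Δv_i = -(b_ij/σ²) r_ij
  vx[j] += factor*rx; vy[j] += factor*ry;      // Δv_j = -Δv_i
}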
We now consider the criteria for a collision to occur. Consider disks i and j at positions $\mathbf{r}_i$ and $\mathbf{r}_j$ at t = 0. If they collide at a time $t_{ij}$ later, their centers will be separated by a distance σ:

\[ |\mathbf{r}_i(t_{ij}) - \mathbf{r}_j(t_{ij})| = \sigma. \tag{8.28} \]

During the time $t_{ij}$, the disks move with constant velocities. Hence, we have

\[ \mathbf{r}_i(t_{ij}) = \mathbf{r}_i(0) + \mathbf{v}_i(0)\,t_{ij} \tag{8.29} \]

and

\[ \mathbf{r}_j(t_{ij}) = \mathbf{r}_j(0) + \mathbf{v}_j(0)\,t_{ij}. \tag{8.30} \]

If we substitute (8.29) and (8.30) into (8.28), we find

\[ [\mathbf{r}_{ij} + \mathbf{v}_{ij}t_{ij}]^2 = \sigma^2 \tag{8.31} \]

where $\mathbf{r}_{ij} = \mathbf{r}_i(0) - \mathbf{r}_j(0)$, $\mathbf{v}_{ij} = \mathbf{v}_i(0) - \mathbf{v}_j(0)$, and

\[ t_{ij} = \frac{-\mathbf{v}_{ij}\cdot\mathbf{r}_{ij} \pm \big[(\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2)\big]^{1/2}}{v_{ij}^2}. \tag{8.32} \]

Because $t_{ij} > 0$ for a collision to occur, we see from (8.32) that the condition

\[ \mathbf{v}_{ij}\cdot\mathbf{r}_{ij} < 0 \tag{8.33} \]

must be satisfied. That is, if $\mathbf{v}_{ij}\cdot\mathbf{r}_{ij} > 0$, the particles are moving away from each other and there is no possibility of a collision. If the condition (8.33) is satisfied, then the discriminant in (8.32) must satisfy the condition

\[ (\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2) \ge 0. \tag{8.34} \]

If the condition (8.34) is satisfied, then the quadratic in (8.32) has two roots. The smaller root corresponds to the physically significant collision because the disks are impenetrable. Hence, the physically significant solution for the time of a collision $t_{ij}$ for particles i and j is given by

\[ t_{ij} = \frac{-b_{ij} - \big[b_{ij}^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2)\big]^{1/2}}{v_{ij}^2}. \tag{8.35} \]

Exercise 8.15. Calculation of collision times

Write a short program that determines the collision times (if any) of the following pairs of particles. It would be a good idea to draw the trajectories to confirm your results. Consider the cases: r1 = (2,1), v1 = (−1,−2), r2 = (1,3), v2 = (1,1); r1 = (4,3), v1 = (2,−3), r2 = (3,1), v2 = (−1,−1); and r1 = (4,2), v1 = (−2, 1/2), r2 = (3,1), v2 = (−1,1). As usual, choose units so that σ = 1.

Our hard disk program implements the following steps. We first find the collision times and the collision partners for all pairs of particles i and j. We then do the following:

1. locate the minimum collision time t_min;

2. advance all particles using a straight line trajectory until the collision occurs; that is, displace particle i by v_i t_min and update its next collision time;

3. compute the postcollision velocities of the colliding pair nextCollider and nextPartner;

4. calculate the physical quantities of interest and accumulate data;

5. update the collision partners of the colliding pair, nextCollider and nextPartner, and all other particles that were to collide with either nextCollider or nextPartner if nextCollider and nextPartner had not collided first;

6. repeat steps 1–5 indefinitely.

Methods for carrying out these steps are listed in the following:

Listing 8.16: Methods for each step of the hard disk system.

public void step() {
  minimumCollisionTime();     // finds minimum collision time from list of collision times
  move();                     // moves particles for time equal to minimum collision time
  t += timeToCollision;
  contact();                  // changes velocities of the two colliding particles
  setDefaultCollisionTimes(); // sets collision times to bigTime for those particles set
                              // to collide with the two colliding particles
  newCollisionTimes();        // finds new collision times between all particles and the
                              // two colliding particles
  numberOfCollisions++;
}

public void minimumCollisionTime() {
  timeToCollision = bigTime; // set very large so that the minimum can be found
  for(int k = 0; k<N; k++) {
    if(collisionTime[k]<timeToCollision) {
      timeToCollision = collisionTime[k];
      nextCollider = k;
    }
  }
  nextPartner = partner[nextCollider];
}

public void checkCollision(int i, int j) {
  // computes the pair collision time t_ij from Eq. (8.35) with sigma = 1 and
  // records it if it is the earliest collision found so far for particle i;
  // the complete version also checks the periodic images of particle j
  double dx = x[i]-x[j];
  double dy = y[i]-y[j];
  double dvx = vx[i]-vx[j];
  double dvy = vy[i]-vy[j];
  double bij = dx*dvx+dy*dvy; // v_ij · r_ij
  if(bij<0) {                 // particles are approaching, Eq. (8.33)
    double r2 = dx*dx+dy*dy;
    double v2 = dvx*dvx+dvy*dvy;
    double discriminant = bij*bij-v2*(r2-1.0); // Eq. (8.34)
    if(discriminant>0) {
      double tij = (-bij-Math.sqrt(discriminant))/v2; // Eq. (8.35)
      if(tij<collisionTime[i]) {
        collisionTime[i] = tij;
        partner[i] = j;
      }
    }
  }
}

We now return to the simulation of systems with continuous potentials. Because the Lennard–Jones force is negligible at large separations, it is customary to cut off the potential at a distance r = r_c and to take u(r) = 0 for r > r_c. However, this use of a cutoff implies that u(r) has a discontinuity at r = r_c, which means that whenever a particle pair "crosses" the cutoff distance, the energy jumps, thus affecting the apparent energy conservation. To avoid this problem, it is a good idea to modify the potential so as to eliminate the discontinuity in both u(r) and the force −du/dr.
Hence, we write

\[ \tilde{u}(r) = u(r) - u(r_c) - \frac{du(r)}{dr}\Big|_{r=r_c}(r - r_c) \tag{8.42} \]

where u(r) is the usual Lennard–Jones potential. The use of the interparticle potential (8.42) to calculate the force and the energy requires considering only those pairs of particles whose separation is less than r_c. Because testing whether each pair satisfies this criterion is an order N² calculation, we have to limit the number of pairs tested. One way is to divide the box into small cells and to compute the distance only between particles that are in the same cell or in nearby cells. Another method is to maintain a list for each particle of its neighbors whose separation is less than a distance r_n, where r_n is chosen to be slightly greater than r_c. The idea is to use the same list of neighbors for several time steps (usually 10–20) so that the time consuming job of updating the list of neighbors does not have to be done too often. The cell method and the neighbor list method do not become efficient until N is approximately a few hundred particles. Usually, the neighbor list leads to the consideration of fewer particle pairs in the force calculation than the cell list. We provide a method to compute the neighbor list below. A more efficient approach is to use cells to construct the neighbor list.

public void computeNeighborList() {
  for(int i = 0; i<N-1; i++) {
    numberInList[i] = 0;
    for(int j = i+1; j<N; j++) {
      double dx = separation(x[i]-x[j], Lx);
      double dy = separation(y[i]-y[j], Ly);
      double r2 = dx*dx+dy*dy;
      if(r2<r2ListCutoff) {
        list[i][numberInList[i]] = j;
        numberInList[i]++;
      }
    }
  }
}

To use this list in method computeAcceleration, we replace the for loops by

for(int i = 0; i<N-1; i++) {
  for(int k = 0; k<numberInList[i]; k++) {
    int j = list[i][k];
    // compute the force between particles i and j as before
  }
}

The method computeNeighborList should be called again before any particle could have moved a distance equal to the difference r_n − r_c since the last update. This time depends on the density and the temperature. For dense systems a reasonable value for r_n is 2.7σ. Simulations of small systems can be used to determine the time between calls of computeNeighborList.

Note that in method computeNeighborList, only particles j > i are included in list[i]. In Section 15.10 we will consider Monte Carlo simulations where a particle is chosen at random, and its potential energy of interaction must be computed. In this case we cannot take advantage of Newton's third law, and a neighbor list must be created for all particles that are within a distance r_n of particle i.
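Problem 8.21(b) below uses the modified potential (8.42). A minimal sketch of the corresponding shifted force in reduced units (σ = ε = 1); the method names are illustrative.

double forceLJ(double r) {
  // -du/dr for the unmodified Lennard-Jones potential: 24(2/r^13 - 1/r^7)
  double ri = 1.0/r;
  double ri3 = ri*ri*ri;
  double ri6 = ri3*ri3;
  return 24.0*ri*ri6*(2.0*ri6-1.0);
}

double shiftedForce(double r, double rc) {
  // -d(u~)/dr from Eq. (8.42); the constant shift makes the force
  // continuous (zero) at the cutoff r = rc
  if(r>=rc) {
    return 0;
  }
  return forceLJ(r)-forceLJ(rc);
}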
∗Problem 8.21. Neighbor lists

(a) Simulate a system of N = 64 Lennard–Jones particles in a square cell with L = 10 at a temperature T = 2.0. After the system has reached equilibrium, determine the shortest time for any particle to move a distance equal to 0.2. Use half this time in the rest of the program as the time between updates of the neighbor list.

(b) Run your simulation with and without the neighbor list starting from identical initial configurations. Choose r_c = 2.3 and use the modified potential given in (8.42). Calculate g(r), the pressure, the heat capacity (see Problem 8.8), and the temperature. Make sure your results are identical. Compare the amount of CPU time with and without the use of the neighbor list.

(c) Repeat part (b) with N = 256 but with the same density and total energy. You can adjust the total energy by scaling the initial velocities. Increase N until the CPU time for the neighbor list version is faster.

(d) Continue increasing the number of particles by a factor of four, but only use the program with the neighbor list. Determine the CPU time required for one time step as a function of N.

So far we have discussed molecular dynamics simulations at fixed energy, volume, and number of particles. Molecular dynamics simulations at fixed temperature are discussed in Project 8.24. It is also possible to modify the dynamics so as to do molecular dynamics simulations at constant pressure and to do simulations in which the shape of the cell is determined by the dynamics, rather than imposed by the program. Such a simulation is essential for the study of solid-to-solid transitions where the major change is the shape of the crystal. In addition to these algorithmic advances, there is much more to learn about the properties of the system. For example, how are transport properties such as the viscosity and the thermal conductivity related to the trajectories? We have also not discussed one of the most fundamental properties of a many-body system, namely, its entropy. In brief, not all macroscopic properties of a many-body system, including the entropy, can be defined as a time average over some function of the phase space coordinates of the particles (but see Ma). However, changes in the entropy can be computed by using thermodynamic integration.

The fundamental limitation of molecular dynamics is the existence of multiple time scales. We must choose the time step Δt to be smaller than any physical time scale in the system. For a solid, the smallest time scale is the period of the oscillatory motion of individual particles about their equilibrium positions. If we want to know how the solid responds to the addition of an interstitial particle or a vacancy, we would have to run for millions of small time steps for the vacancy to move several interparticle distances. Although this particular problem can be overcome by using a faster computer, there are many problems for which no imaginable supercomputer would be sufficient. One of the biggest current challenges is the protein folding problem. The biological function of a protein is determined by its three-dimensional structure, which is encoded by the sequence of amino acids in the protein. At present, we know little about how the protein forms its three-dimensional structure. Such formidable computational challenges remind us that we cannot simply put a problem on a computer and let the computer tell us the answer. In particular, for many problems, molecular dynamics methods need to be complemented by other simulation methods, especially Monte Carlo methods (see Chapter 15).

The emphasis in current applications of molecular dynamics is shifting from studies of simple equilibrium fluids to studies of more complex fluids and nonequilibrium systems. For example, how does a solid form when the temperature of a liquid is lowered quickly? How does a crack propagate in a brittle solid? What is the nature of the glass transition? Molecular dynamics and related methods will play an important role in aiding our understanding of these and many other problems.

8.12 Projects

Many of the pioneering applications of molecular dynamics were done on relatively small systems. It is interesting to peruse the research literature of the past three decades to see how much physical insight was obtained from these simulations.
Many research-level problems can be generated by first reproducing previously published work and then extending the work to larger systems or longer run times to obtain better statistics. Many related projects are discussed in Chapter 15.

Project 8.22. The classical Heisenberg model of magnetism

Magnetism is intrinsically a quantum phenomenon. One common model of magnetism is the Heisenberg model, which is defined by the Hamiltonian or energy function:

\[ H = -J\sum_{\langle ij\rangle} \mathbf{S}_i\cdot\mathbf{S}_j \tag{8.43} \]

where $\mathbf{S}_i$ is the spin operator at the ith lattice site. The sum is over nearest neighbor sites of the lattice, and the (positive) coupling constant J is a measure of the strength of the interaction between spins. The negative sign indicates that the lowest energy state is ferromagnetic. The magnetic moment of a particle on a site is proportional to the particle's spin, and the proportionality constant is absorbed into the constant J. For many models of magnetism, such as the Ising model (see Section 15.5), there is no obvious dynamics. However, for the Heisenberg model we can motivate a dynamics using the standard rule for the time evolution of an operator given in quantum mechanics texts. For simplicity, we will consider a one-dimensional lattice. The equation for the time development becomes (see Slanič et al.)

\[ \frac{d\mathbf{S}_i}{dt} = J\,\mathbf{S}_i\times(\mathbf{S}_{i-1}+\mathbf{S}_{i+1}). \tag{8.44} \]

In general, S in (8.44) is an operator. However, if the magnitude of the spin is sufficiently large, the system can be treated classically, and S can be interpreted as a three-dimensional unit vector. The dynamics in (8.44) conserves the total energy given in (8.43) and the total magnetization, $\mathbf{M} = \sum_i \mathbf{S}_i$. We can simulate the classical Heisenberg magnet using an ODE solver to solve the first-order differential equation (8.44).

(a) Explain why there is no obvious way to determine the mean temperature of this system.

(b) Write a program to simulate the Heisenberg model on a one-dimensional lattice using periodic boundary conditions (a sketch of the rate computation for (8.44) is given after this project). Choose J = 1 and N ≥ 100. Use the RK4 ODE solver, and plot the energy and magnetization as a function of time. These two quantities should be constant within the accuracy of the ODE solver. Also, plot each component of the spin versus position or draw a three-dimensional representation of the spin at each site so that you can visualize the state of the system.

(c) Begin with all spins in the positive z direction, except for one spin pointing in the negative z direction. Use N ≥ 1000. Define the energy of spin i as $\epsilon_i = -\mathbf{S}_i\cdot(\mathbf{S}_{i-1}+\mathbf{S}_{i+1})/2$. Plot the local energy as a function of i. Describe how the local energy diffuses. What patterns do you observe? Do the locations of the peaks in the local energy move with a constant speed?

(d) One of the interesting dynamical phenomena we can explore is that of spin waves. Begin with all $S_{z,i} = 1$ except for a group of 20 spins, where $S_{x,i} = A\cos ki$, $S_{y,i} = A\sin ki$, and $S_{z,i} = [1-(S_{x,i}^2+S_{y,i}^2)]^{1/2}$. Choose A = 0.2 and k = 1. Describe the motion of the spins. Compute the mean position of this spin wave defined by $\bar{x} = \sum_i i\,(1-S_{z,i})$. Show that $\bar{x}$ changes linearly with time, indicating a constant spin wave velocity. Vary k and A to determine what effect their values have on the speed of the spin wave.

(e) Read about symplectic algorithms in the article by Tsai, Lee, and Landau and write your own ODE solver for one of them. Compare your results to the results you found for the RK4 algorithm. Is the total energy better conserved for the same value of the time step?
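A minimal sketch of the rate computation for (8.44), written for the Open Source Physics ODE interface. The state layout (three spin components per site plus the time in the last element), J = 1, and periodic boundary conditions are assumptions of this sketch.

public void getRate(double[] state, double[] rate) {
  // state = {Sx_0, Sy_0, Sz_0, Sx_1, ..., t}; implements Eq. (8.44) with J = 1
  int N = (state.length-1)/3;
  for(int i = 0; i<N; i++) {
    int left = 3*((i-1+N)%N), right = 3*((i+1)%N); // periodic neighbors
    double hx = state[left]+state[right];   // effective field S_{i-1} + S_{i+1}
    double hy = state[left+1]+state[right+1];
    double hz = state[left+2]+state[right+2];
    double sx = state[3*i], sy = state[3*i+1], sz = state[3*i+2];
    rate[3*i] = sy*hz-sz*hy;   // (S_i x h)_x
    rate[3*i+1] = sz*hx-sx*hz; // (S_i x h)_y
    rate[3*i+2] = sx*hy-sy*hx; // (S_i x h)_z
  }
  rate[state.length-1] = 1; // dt/dt
}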
Project 8.23. Single particle metrics and ergodic behavior

As mentioned in Section 8.7, the quasi-ergodic hypothesis assumes that time averages and ensemble averages are identical for a system in thermodynamic equilibrium. The assumption is that if we run a molecular dynamics simulation for a sufficiently long time, then the dynamical trajectory will fill the accessible phase space. One way to confirm the quasi-ergodic hypothesis is to compute an ensemble average by simulating many independent copies of the system of interest using different initial configurations. Another way is to simulate a very large system and compare the behavior of different parts. A more direct measure of ergodicity (see Thirumalai and Mountain) is based on a comparison of the time average $\overline{f_i}(t)$ of a quantity $f_i$ for particle i to its average for all the other particles. If the system is ergodic, then all particles see the same average environment, and the time average $\overline{f_i}(t)$ for each particle will be the same if t is sufficiently long. Note that $\overline{f_i}(t)$ is the average of the quantity $f_i$ over the time interval t and not the value of $f_i$ at time t. The time average of $f_i$ is defined as

\[ \overline{f_i}(t) = \frac{1}{t}\int_0^t f_i(t')\,dt' \tag{8.45} \]

and the average of $\overline{f_i}(t)$ over all particles is written as

\[ \overline{f}(t) = \frac{1}{N}\sum_{i=1}^{N} \overline{f_i}(t). \tag{8.46} \]

One of the physical quantities of interest is the energy of a particle $e_i$ defined as

\[ e_i = \frac{p_i^2}{2m_i} + \frac{1}{2}\sum_{j\ne i} u(r_{ij}). \tag{8.47} \]

The factor of 1/2 is included in the potential energy term in (8.47) because the interaction energy is shared between pairs of particles. The above considerations lead us to define the energy metric, $\Omega_e(t)$, as

\[ \Omega_e(t) = \frac{1}{N}\sum_{i=1}^{N} \big[\overline{e_i}(t) - \overline{e}(t)\big]^2. \tag{8.48} \]

(a) Compute $\Omega_e(t)$ for a system of Lennard–Jones particles at a relatively high temperature (a sketch of the metric computation is given after this project). Determine $\overline{e_i}(t)$ at time intervals of 0.5 or less and average $\Omega_e$ over as many time origins as possible. If the system is ergodic over the time interval t, then it can be shown that $\Omega_e(t)$ decreases as 1/t. Plot $1/\Omega_e(t)$ versus t. Do you find that $1/\Omega_e(t)$ eventually behaves linearly with t? Nonergodic behavior might be found by rapidly reducing the kinetic energy (a temperature quench) and obtaining an amorphous solid or glass rather than a crystalline solid. However, it would be necessary to consider three-dimensional rather than two-dimensional systems because the latter system forms a crystalline solid very quickly.

(b) Another quantity of interest is the velocity metric $\Omega_v$:

\[ \Omega_v(t) = \frac{1}{dN}\sum_{i=1}^{N} \big[\overline{\mathbf{v}_i}(t) - \overline{\mathbf{v}}(t)\big]^2. \tag{8.49} \]

The factor of 1/d in (8.49) is included because the velocity is a vector with d components. If we choose the total momentum of the system to be zero, then $\overline{\mathbf{v}}(t) = 0$, and we can write (8.49) as

\[ \Omega_v(t) = \frac{1}{dN}\sum_{i=1}^{N} \overline{\mathbf{v}_i}(t)\cdot\overline{\mathbf{v}_i}(t). \tag{8.50} \]

As we will see, the time dependence of $\Omega_v(t)$ is not a good indicator of ergodicity, but it can be used to determine the diffusion coefficient D. We write

\[ \overline{\mathbf{v}_i}(t) = \frac{1}{t}\int_0^t \mathbf{v}_i(t')\,dt' = \frac{1}{t}\big[\mathbf{r}_i(t) - \mathbf{r}_i(0)\big]. \tag{8.51} \]

If we substitute (8.51) into (8.50), we can express the velocity metric in terms of the mean square displacement:

\[ \Omega_v(t) = \frac{1}{dNt^2}\sum_{i=1}^{N} \big[\mathbf{r}_i(t) - \mathbf{r}_i(0)\big]^2 = \frac{\langle R^2(t)\rangle}{d\,t^2}. \tag{8.52} \]

The average in (8.52) is over all particles. If the particles are diffusing during the time interval t, then $\langle R^2(t)\rangle = 2dDt$ and

\[ \Omega_v(t) = 2D/t. \tag{8.53} \]

From (8.53) we see that $\Omega_v(t)$ goes to zero as 1/t as claimed in part (a). However, if the particles are localized (as in a crystalline solid and a glass), then $\langle R^2\rangle$ is bounded for all t, and $\Omega_v(t)\sim 1/t^2$. Because a crystalline solid is ergodic and a glass is not, the velocity metric is not a good measure of the lack of ergodicity. Use the t-dependence of $\Omega_v(t)$ in (8.53) to determine D for the same configurations as in Problem 8.19.
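A minimal sketch of the energy metric (8.48); eBar[i] is assumed to hold the running time average of the single-particle energy (8.47), and the names are illustrative.

public double energyMetric(double[] eBar) {
  // Omega_e(t) from Eq. (8.48); eBar[i] is the time average of e_i up to time t
  int N = eBar.length;
  double mean = 0;
  for(int i = 0; i<N; i++) {
    mean += eBar[i];
  }
  mean /= N; // average of the single-particle averages, Eq. (8.46)
  double omega = 0;
  for(int i = 0; i<N; i++) {
    double diff = eBar[i]-mean;
    omega += diff*diff;
  }
  return omega/N;
}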
Project 8.24. Constant temperature molecular dynamics

In the molecular dynamics simulations we have discussed so far, the energy is constant up to truncation, discretization, and floating point errors, and the temperature fluctuates. However, sometimes it is more convenient to do simulations at constant temperature. In Chapter 15 we will see how to simulate systems at constant T, V, and N (the canonical ensemble) by using Monte Carlo methods. However, we can also do constant temperature simulations by modifying the dynamics.

A crude way of maintaining a constant temperature is to rescale the velocities after every time step to keep the mean kinetic energy per particle constant. This approach is equivalent to a constant temperature simulation when N → ∞. However, the fluctuations of the kinetic energy can be non-negligible in small systems. For such systems keeping the total kinetic energy constant in this way is not equivalent to a constant temperature simulation.

A better way of maintaining a constant temperature is based on imagining that every particle in the system is connected to a much larger system called a heat bath. The heat bath is sufficiently large so that it has a constant temperature even if it loses or gains energy. The particles in the system of interest occasionally collide with particles in this heat bath. The effect of these collisions is to give the particles random velocities with the desired probability distribution (see Problem 8.6). We first list the algorithm and give its rationale later. Add the following statements to method step after all the particles have been moved.

Listing 8.22: Andersen thermostat.

for(int i = 0; i<N; i++) {
  if(Math.random()<collisionProbability) {
    double r1 = Math.random();
    double r2 = Math.random()*2.0*Math.PI;
    state[4*i+1] = Math.sqrt(-2.0*temperature*Math.log(r1))*Math.cos(r2); // vx
    state[4*i+3] = Math.sqrt(-2.0*temperature*Math.log(r1))*Math.sin(r2); // vy
  }
}

The parameter collisionProbability is much less than unity and determines how often there is a collision with the heat bath. This way of maintaining constant temperature is known as the Andersen thermostat.

(a) Do a constant energy simulation as before, using an initial configuration for which the desired temperature is equal to 1.0. Make sure the total momentum is zero. Choose N = 64 and place the particles initially on a triangular lattice with Lx = 10 and Ly = √3 Lx/2. Plot the instantaneous temperature defined as in (8.5) and compute the average temperature. Estimate the magnitude of the temperature fluctuations. Repeat your simulation for some other initial configurations.

(b) Modify your program to use the Andersen thermostat at a constant temperature set equal to 1.0. Set collisionProbability = 0.0001. Repeat the calculations of part (a) and compare them. Discuss the differences. Do the results change significantly?

(c) Modify your program to do a simple constant kinetic energy ensemble where the velocities are rescaled after every time step so that the total kinetic energy does not change. What is the final temperature now? How do your results compare with parts (a) and (b)?
Are the differences in the computed thermodynamic averages statistically significant?

(d) Compute the velocity probability distribution for each case. How do they compare? Consider collisionProbability = 0.001 and 0.00001.

(e) A deterministic algorithm for constant temperature molecular dynamics is the Nosé–Hoover thermostat. The idea is to introduce an additional degree of freedom s that plays the role of the heat bath. The derivation of the appropriate equations of motion is an excellent example of the Lagrangian formulation of mechanics. The equations of motion of Nosé–Hoover dynamics are

\[ \frac{d\mathbf{p}_i}{dt} = \mathbf{F}_i(t) - s\,\mathbf{p}_i \tag{8.54} \]
\[ \frac{ds}{dt} = \frac{1}{M}\Big[\sum_i \frac{p_i^2}{m_i} - dNkT\Big] \tag{8.55} \]

where T is the desired temperature, and M is a parameter that can be interpreted as the mass associated with the extra degree of freedom. Equation (8.54) is similar to Newton's equations of motion with an additional friction term. However, the coefficient s can be positive or negative. Equation (8.55) defines the way s is changed to control the temperature. Apply the Nosé–Hoover algorithm to simulate a simple harmonic oscillator at constant temperature (a sketch of the corresponding rate equations is given after this project). Plot the phase space trajectory. If the energy were constant, the trajectory would be an ellipse. How does the shape of the trajectory depend on M? Choose M so that the period of any oscillations due to the finite value of M is much longer than the period of the system.
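For part (e), a minimal sketch of the Nosé–Hoover equations (8.54) and (8.55) for a harmonic oscillator with m = k = 1, written for the Open Source Physics ODE interface; kT and M are assumed fields, and the state layout {x, p, s, t} is an illustrative choice.

public void getRate(double[] state, double[] rate) {
  double x = state[0], p = state[1], s = state[2];
  rate[0] = p;          // dx/dt = p/m with m = 1
  rate[1] = -x-s*p;     // dp/dt = F - s p, with F = -kx and k = 1, Eq. (8.54)
  rate[2] = (p*p-kT)/M; // ds/dt = (p²/m - dNkT)/M with dN = 1, Eq. (8.55)
  rate[3] = 1;          // time
}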
Project 8.25. Simulations on the surface of a sphere

Because of the long-range nature of the Coulomb potential, we have to sum over all the periodic images of the particles to compute the force on a given particle. Although there are special methods to do these sums so that they converge quickly (Ewald sums), the simulation of systems of charged particles is more difficult than systems with short-range potentials. An alternative approach that avoids periodic boundary conditions is to not have any boundaries at all. For example, if we wish to simulate a two-dimensional system, we can consider the motion of the particles on the surface of a sphere. If the radius of the sphere is sufficiently large, the curvature of the surface can be neglected. Of course, there is a price: the coordinate system is no longer Cartesian. Although this approach can also be applied to systems with short-range interactions, it is more interesting to apply it to charged particles.

The simplest system of interest is a model of charged particles moving in a uniform background of opposite charge to ensure overall charge neutrality, the one-component plasma (OCP). In two dimensions this system is a simplified model of electrons on the surface of liquid helium. The properties of the OCP are determined by the dimensionless parameter Γ given by the ratio of the potential energy between nearest neighbor particles to the mean kinetic energy of a particle, $\Gamma = (e^2/a)/kT$, where $\rho\pi a^2 = 1$ and ρ is the number density. Systems with Γ ≫ 1 are called strongly coupled. For Γ ∼ 100 in two dimensions, the system forms a solid. Strongly coupled one-component plasmas in three dimensions are models of dense astrophysical matter.

Assume that the origin of the coordinate system is at the center of the sphere and that $\mathbf{u}_i$ is a unit vector from the origin to the position of particle i on the sphere. Then $R\theta_{ij}$ is the length of the chord joining particles i and j, where $\cos\theta_{ij} = \mathbf{u}_i\cdot\mathbf{u}_j$. Newton's equation of motion for the ith electron has the form

\[ m\ddot{\mathbf{u}}_i = -\frac{e^2}{R^2}\sum_{j\ne i}\frac{1}{\theta_{ij}^2\sin\theta_{ij}}\big[\mathbf{u}_j - (\cos\theta_{ij})\mathbf{u}_i\big]. \tag{8.56} \]

Note that the unit vector $\mathbf{w}_{ij} = [\mathbf{u}_j - (\cos\theta_{ij})\mathbf{u}_i]/\sin\theta_{ij}$ is orthogonal to $\mathbf{u}_i$. In addition, we must take into account that the particles must stay on the surface of the sphere, so there is an additional force on particle i toward the center of magnitude $m|\dot{\mathbf{u}}_i|^2/R$.

(a) What are the appropriate units for length, time, and the self-diffusion constant?

(b) Write a program to compute the velocity correlation function given by

\[ C(t) = \frac{1}{v_0^2}\,\big\langle \dot{\mathbf{u}}(t)\cdot\dot{\mathbf{u}}(0)\big\rangle \tag{8.57} \]

where $v_0^2 = \langle\dot{\mathbf{u}}(0)\cdot\dot{\mathbf{u}}(0)\rangle$. To compute the self-diffusion constant D, we let $\cos\theta(t) = \mathbf{u}(t)\cdot\mathbf{u}(0)$, so that Rθ is the circular arc from the initial position of a particle to its position on the sphere at time t. We then define

\[ D(t) = \frac{1}{a^2}\,\frac{\langle\theta^2(t)\rangle}{4t} \tag{8.58} \]

where D and t are dimensionless variables. The self-diffusion constant D corresponds to the limit t → ∞. Choose N = 104 and a radius R corresponding to Γ ≈ 36 as in the original simulations by Hansen et al., and then consider bigger systems. Can you conclude that self-diffusion exists for the two-dimensional OCP?

(c)∗ Use a similar procedure to compute the velocity autocorrelation function and the self-diffusion constant D for a two-dimensional system of Lennard–Jones particles. Can you conclude that self-diffusion exists for this two-dimensional system?

Project 8.26. Granular matter

Recently, physicists have become very interested in granular matter such as sand. The key difference between molecular systems and granular systems is that the interparticle interactions in the latter are inelastic. The lost energy goes into the internal degrees of freedom of a grain and ultimately is dissipated. From the point of view of the motion of the granular particles, the energy is lost. Experimentalists have studied models of granular material composed of small steel balls or glass beads using sophisticated imaging techniques that can track the motion of individual particles. There have also been many complementary computer simulation studies.

What are some of the interesting properties of granular matter? Because the interactions are inelastic, granular particles will ultimately come to rest unless there is an external source of energy, usually a vibrating wall or gravity (for example, the fall of particles through a funnel). When granular particles come to rest, they can form a granular solid that is different than molecular solids. One difference is that there frequently exists a complex network of force lines within the solid. In addition, unlike ordinary liquids, the pressure does not increase with depth because the walls of the container help support the grains. As a consequence, sand flowing out of an aperture flows at a constant rate independent of the height of the sand above the aperture. For this reason sand is used in hourglasses. Another interesting property is that under some conditions, the large grains in a mixture of large and small grains can move to the top while the container is being vibrated, the "Brazil nut" effect. Under other conditions, the large grains might move to the bottom. What happens depends on the size and density of the large grains compared to the small grains (see Sanders et al.). It is also known that there is a critical angle for the slope of a sand pile, above which the sand pile is unstable. This slope is called the angle of repose. These and many other effects have been studied using theoretical, computational, and experimental techniques.
The first step in simulating granular matter is to determine the effective force law between particles. For granular gases the details of the force do not influence the qualitative results, as long as the force is purely repulsive and short range and there is some mechanism for dissipating energy. Common examples of force laws are spring-like forces with stiff spring constants and hard disks with inelastic collisions. For simplicity, we will consider the Lennard–Jones potential with a cutoff at $r_c = 2^{1/6}$ so that the force is always repulsive. To remove energy during a collision, we will introduce a viscous damping force given by

\[ \mathbf{f}_{ij} = -\gamma\,(\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})\,\frac{\mathbf{r}_{ij}}{r_{ij}^2} \tag{8.59} \]

where the viscous damping coefficient γ equals 100 in reduced units. A more realistic force model necessary for granular flow problems is given in Hirchfeld et al.

(a) Modify class LJParticles so that the cutoff is at $2^{1/6}$. Is the total energy conserved? Include a viscous damping force as in (8.59) (a sketch is given after this project), and plot the kinetic energy per particle versus time. We will define the kinetic temperature to be the mean kinetic energy per particle. Why does this definition of temperature not have the same significance as the temperature in molecular systems in which the energy is conserved? Choose N = 64, L = 20, and Δt = 0.001. Begin with a random configuration and an initial kinetic temperature equal to 10. How long does it take for the kinetic temperature to decrease to 10% of its initial value? Describe the spatial distribution of the particles at this time.

(b) Compute the mean kinetic temperature versus time averaged over three runs. What functional form describes your results for the mean kinetic temperature at long times?

(c) To prevent "granular collapse," where the particles ultimately come to rest, we need to add energy to the system. The simplest way of doing so is to give random kicks to randomly selected particles. You can use the same algorithm we used to set the initial velocities in LJParticles:

int i = (int) (N*Math.random()); // selects a random particle
// generate a Gaussian distribution using the Box-Muller method
double r = Math.random();
double a = -Math.log(r);
double theta = 2.0*Math.PI*Math.random();
// assign velocities according to the Maxwell-Boltzmann distribution
state[4*i+1] = Math.sqrt(2.0*desiredKE*a)*Math.cos(theta); // vx
state[4*i+3] = Math.sqrt(2.0*desiredKE*a)*Math.sin(theta); // vy

(The Box–Muller method is described in Section 11.5.) Assume that at each time step one particle is chosen at random and receives a random kick. Adjust desiredKE so that the mean kinetic energy per particle remains roughly constant at about 5.0. Compute the velocity distribution function for each component of the velocity. Compare this distribution on the same plot to the Gaussian distribution:

\[ p(v_x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(v_x-\langle v_x\rangle)^2/2\sigma^2} \tag{8.60} \]

where $\sigma^2 = \langle v_x^2\rangle - \langle v_x\rangle^2$. Is the velocity distribution function of the system a Gaussian? If not, give a physical explanation for the difference.
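For part (a), a minimal sketch of the damping force (8.59) to be added to the pair force computation; dx, dy and dvx, dvy are assumed to hold the components of r_ij and v_ij, gamma is the damping coefficient, and periodic boundary corrections are omitted.

// viscous damping force of Eq. (8.59) between particles i and j;
// add these components to the repulsive Lennard-Jones force on particle i
double r2 = dx*dx+dy*dy;
double vDotR = dvx*dx+dvy*dy;     // v_ij · r_ij
double fOverR2 = -gamma*vDotR/r2; // -γ (v_ij · r_ij)/r_ij²
double fxDamp = fOverR2*dx;       // x component of f_ij
double fyDamp = fOverR2*dy;       // y component of f_ij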
Appendix 8A: Reading and Saving Configurations

For most of the problems in this chapter, qualitative results can be obtained fairly quickly. However, in research applications, the time for running a simulation is likely to be much longer than a few minutes, and runs that require days or even months are not uncommon. In such cases it is important to be able to save the intermediate configurations to prevent the potential loss of data in the case of a computer crash or power failure. Also, in many cases it is easier to save the configurations periodically and then use a separate program to analyze the configurations and compute the quantities of interest. In addition, if we wish to compute averages as a function of a parameter, such as the temperature, it is convenient to make small changes in the temperature and use the last configuration from the previous run as the initial configuration for the simulation at the new temperature.

The standard Java API has methods for reading and writing files. The usual way of saving a configuration is to use these methods to simply write all the positions and velocities as numbers into a file. Additional simulation parameters and information about the configuration would be saved using a custom format. Although this approach is the traditional one for data storage, the use of a custom format means that you might not remember the format later, and sharing data between programs and other users becomes more difficult.

An alternative is to use a more structured and widely shared format for storing data. The Open Source Physics library has support for the Extensible Markup Language (XML). The XML format offers a number of advantages for computational physics: clear markup of input data and results, standardized data formats, and easier exchange and archival stability of data. In simple terms, the main advantage of XML is that it is a human readable format; just by looking at an XML file you can get an idea of the nature of the data.

The XML classes in the Open Source Physics library can be understood by reading the ExampleXMLApp example. The XML API is very similar to the control API. For example, we use setValue to add data to an XML control, and we use getInt, getDouble, and getString to read data. We start by importing the necessary definitions from the controls package and defining the main method for the ExampleXMLApp class. Note that XMLControl defines an interface and XMLControlElement defines an implementation of this interface.

import org.opensourcephysics.controls.XMLControl;
import org.opensourcephysics.controls.XMLControlElement;

public class ExampleXMLApp {
  public static void main(String[] args) {
    ...
  }
}

The following Java statements are placed in the body of the main method. An empty XML document is created using an XMLControl object by calling the XMLControlElement constructor without any parameters.

XMLControl xmlOut = new XMLControlElement();

Invoking the control's setValue method creates an XML element consisting of a tag and a data value. The tag is the first parameter, and the data to be stored is the second. Data that can be stored includes numbers, number arrays, and strings. Because the tag is unique, the data can later be retrieved from the control using the appropriate get method.

xmlOut.setValue("comment", "An XML description of an array.");
xmlOut.setValue("x positions", new double[] {1, 3, 4});
xmlOut.setValue("x velocities", new double[] {0, -1, 1});

Once the data has been stored in an XMLControl object, it can be exported to a file by calling the write method. In this example, the name of the file is MDconfiguration.xml.

xmlOut.write("MDconfiguration.xml");

An XMLControl can also be used to read XML data from a file.
In the next example, we will read from the file that we just saved. We start by instantiating a new XMLControl named xmlIn.

XMLControl xmlIn = new XMLControlElement("MDconfiguration.xml");

The new XMLControl object xmlIn contains the same data as the object we saved, xmlOut. Its data can be accessed using a tag name. Note that the getObject method returns a generic Object and must be cast to the appropriate data type.

System.out.println(xmlIn.getString("comment"));
double[] xPos = (double[]) xmlIn.getObject("x positions");
double[] xVel = (double[]) xmlIn.getObject("x velocities");
for(int i = 0; i<xPos.length; i++) {
  System.out.println("x[i] = "+xPos[i]+" vx[i] = "+xVel[i]);
}

Exercise 8.27. Saving XML data

(a) Combine the above statements to create a working ExampleXMLApp class. Examine the saved data using a text editor. Describe how the parameters are stored.

(b) Run HardDisksApp and save the control's configuration using the Save As item under the File menu in the toolbar. Examine the saved file using a text editor and describe how this file is different from the file you generated in part (a).

(c) What is the minimum amount of information that must be stored in a configuration file to specify the current HardDisks state?

(d) Add custom buttons to HardDisksApp to store and load the current HardDisks state. Test your code by showing that quantities, such as the temperature, remain the same if a configuration is stored and reloaded.

Open Source Physics user interfaces, such as a SimulationControl, store a program's configuration in two steps. During the first step, parameters from the graphical user interface are stored. During the second step, the model is given the opportunity to store runtime data using an ObjectLoader. Study the LJParticlesLoader class and note how storing and loading are done in the saveObject and loadObject methods, respectively. You will adapt this ObjectLoader to store HardDisks data in Problem 8.28. Additional information about how Open Source Physics applications store XML-based configurations is provided in the Open Source Physics Users Guide.

Problem 8.28. Hard disk configuration

(a) Create a HardDisksLoader class that stores the HardDisks runtime data.

(b) Add the getLoader method to HardDisksApp and test the loader.

public static XML.ObjectLoader getLoader() {
  return new HardDisksLoader();
}

The getLoader method allows the SimulationControl to obtain the HardDisksLoader, which will be used to store the runtime data. Data written by the loader's saveObject method will be included in the output file when the user saves a program configuration. Describe how the initialization parameters and the runtime data are separated in the XML file.

Because XML allows for the creation of custom tags, various companies and professional organizations have defined other XML grammars, such as MathML, and there are other examples of the use of XML in computational physics.

References and Suggestions for Further Reading

One of the best ways of testing your programs is by comparing your results with known results. A website maintained by the National Institute of Standards and Technology (U.S.) provides some useful benchmark results for the Lennard–Jones fluid.

Farid F. Abraham, "Computational statistical mechanics: Methodology, applications and supercomputing," Adv. Phys. 35, 1–111 (1986).
The author discusses both molecular dynamics and Monte Carlo techniques.

B. J. Alder and T. E. Wainwright, "Phase transition for a hard sphere system," J. Chem. Phys. 27, 1208–1209 (1957).

M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon Press, 1987). A classic text on molecular dynamics and Monte Carlo methods.

Jean–Louis Barrat and Jean–Pierre Hansen, Basic Concepts for Simple and Complex Liquids (Cambridge University Press, 2003). Also see Jean–Pierre Hansen and Ian R. McDonald, Theory of Simple Liquids, 2nd ed. (Academic Press, 1986). Excellent graduate level texts that derive most of the theoretical results mentioned in this chapter.

Kurt Binder, Jürgen Horbach, Walter Kob, Wolfgang Paul, and Fathollah Varnik, "Molecular dynamics simulations," J. Phys.: Condens. Matter 16, S429–S453 (2004).

R. P. Bonomo and F. Riggi, "The evolution of the speed distribution for a two-dimensional ideal gas: A computer simulation," Am. J. Phys. 52, 54–55 (1984). The authors consider a system of hard disks and show that the system evolves to the Maxwell–Boltzmann distribution.

J. P. Boon and S. Yip, Molecular Hydrodynamics (Dover Publications, 1991). Their discussion of transport properties is an excellent supplement to our brief discussion.

Giovanni Ciccotti and William G. Hoover, editors, Molecular-Dynamics Simulation of Statistical-Mechanics Systems (North–Holland, 1986).

Giovanni Ciccotti, Daan Frenkel, and Ian R. McDonald, editors, Simulation of Liquids and Solids (North–Holland, 1987). A collection of reprints on the simulation of many body systems. Of particular interest are B. J. Alder and T. E. Wainwright, "Phase transition in elastic disks," Phys. Rev. 127, 359–361 (1962) and earlier papers by the same authors; A. Rahman, "Correlations in the motion of atoms in liquid argon," Phys. Rev. 136, A405–A411 (1964), the first application of molecular dynamics to systems with continuous potentials; and Loup Verlet, "Computer 'experiments' on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules," Phys. Rev. 159, 98–103 (1967).

Daan Frenkel and Berend Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd ed. (Academic Press, 2002). This monograph is one of the best on molecular dynamics and Monte Carlo simulations. It is particularly strong on simulations in various ensembles and on methods for computing free energies.

J. M. Haile, Molecular Dynamics Simulation (John Wiley & Sons, 1992). A derivation of the mean pressure using periodic boundary conditions is given in Appendix B.

J. P. Hansen, D. Levesque, and J. J. Weis, "Self-diffusion in the two-dimensional, classical electron gas," Phys. Rev. Lett. 43, 979–982 (1979).

D. Hirchfeld, Y. Radzyner, and D. C. Rapaport, "Molecular dynamics studies of granular flow through an aperture," Phys. Rev. E 56, 4404–4415 (1997).

W. G. Hoover, Molecular Dynamics (Springer–Verlag, 1986) and W. G. Hoover, Computational Statistical Mechanics (Elsevier, 1991).

K. Kadau, T. C. Germann, and P. S. Lomdahl, "Large-scale molecular-dynamics simulation of 19 billion particles," Int. J. Mod. Phys. C 15, 193–201 (2004).

J. Krim, "Friction at macroscopic and microscopic length scales," Am. J. Phys. 70, 890–897 (2002).

J. Kushick and B. J. Berne, "Molecular dynamics methods: Continuous potentials" in Statistical Mechanics Part B: Time-Dependent Processes, Bruce J. Berne, editor (Plenum Press, 1977). Also see the article by Jerome J.
Erpenbeck and William Wood on "Molecular dynamics techniques for hard-core systems" in the same volume.

Shang–keng Ma, "Calculation of entropy from data of motion," J. Stat. Phys. 26, 221 (1981). Also see Chapter 25 of Ma's graduate level text, Statistical Mechanics (World Scientific, 1985). Ma discusses a novel approach for computing the entropy directly from the trajectories. Note that the coincidence rate in Ma's approach is related to the recurrence time for a finite system to return to an arbitrarily small neighborhood of almost any given initial state. The approach is intriguing, but is practical only for small systems.

A. McDonough, S. P. Russo, and I. K. Snook, "Long-time behavior of the velocity autocorrelation function for moderately dense, soft-repulsive, and Lennard–Jones fluids," Phys. Rev. E 63, 026109-1–9 (2001).

S. Ranganathan, G. S. Dubey, and K. N. Pathak, "Molecular-dynamics study of two-dimensional Lennard–Jones fluids," Phys. Rev. A 45, 5793–5797 (1992).

Dennis Rapaport, The Art of Molecular Dynamics Simulation, 2nd ed. (Cambridge University Press, 2004). The most complete text on molecular dynamics, written by one of its leading practitioners.

John R. Ray and H. W. Graben, "Direct calculation of fluctuation formulae in the microcanonical ensemble," Mol. Phys. 43, 1293 (1981).

F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw–Hill, 1965). An intermediate level text on statistical physics with a more thorough discussion of kinetic theory than found in most undergraduate texts. Statistical Physics, Vol. 5 of the Berkeley Physics Course (McGraw–Hill, 1965), by Reif was one of the first texts to use computer simulations to illustrate the approach of macroscopic systems to equilibrium.

Marco Ronchetti and Gianni Jacucci, editors, Simulation Approach to Solids (Kluwer Academic Publishers, 1990). Another excellent collection of classic reprints.

James Ringlein and Mark O. Robbins, "Understanding and illustrating the atomic origins of friction," Am. J. Phys. 72 (7), 884–891 (2004). A very readable paper on the microscopic origins of sliding friction.

Duncan A. Sanders, Michael R. Swift, R. M. Bowley, and P. J. King, "Are Brazil nuts attractive?," Phys. Rev. Lett. 93, 208002 (2004). An example of a simulation of granular matter.

Tamar Schlick, Molecular Modeling and Simulation (Springer–Verlag, 2002). Although the book is at the graduate level, it is an accessible introduction to computational molecular biology.

Leonardo E. Silbert, Deniz Ertas, Gary S. Grest, Thomas C. Halsey, Dov Levine, and Steven J. Plimpton, "Granular flow down an inclined plane: Bagnold scaling and rheology," Phys. Rev. E 64, 051302-1–14 (2001). This paper discusses the contact force model, which captures the major features of granular interactions.

R. M. Sperandeo Mineo and R. Madonia, "The equation of state of a hard-particle system: A model experiment on a microcomputer," Eur. J. Phys. 7, 124–129 (1986).

D. Thirumalai and Raymond D. Mountain, "Ergodic convergence properties of supercooled liquids and glasses," Phys. Rev. A 42, 4574–4587 (1990).

Shan-Ho Tsai, H. K. Lee, and D. P. Landau, "Molecular and spin dynamics simulations using modern integration methods," Am. J. Phys. 73, 615–624 (2005).

James H. Williams and Glenn Joyce, "Equilibrium properties of a one-dimensional kinetic system," J. Chem. Phys. 59, 741–750 (1973). Simulations in one dimension are even easier than in two.
Zoran Slanič, Harvey Gould, and Jan Tobochnik, "Dynamics of the classical Heisenberg chain," Computers in Physics 5 (6), 630–635 (1991).

Chapter 9

Normal Modes and Waves

We discuss the physics of wave phenomena and the motivation and use of Fourier transforms.

9.1 Coupled Oscillators and Normal Modes

Terms such as period, amplitude, and frequency are used to describe both waves and oscillatory motion. To understand the relation between waves and oscillatory motion, consider a flexible rope that is under tension with one end fixed. If we flip the free end, a pulse propagates along the rope with a speed that depends on the tension and on the inertial properties of the rope. At the macroscopic level, we observe a transverse wave that moves along the length of the rope. In contrast, at the microscopic level we see discrete particles undergoing oscillatory motion in a direction perpendicular to the motion of the wave. One goal of this chapter is to use simulations to understand the relation between the microscopic dynamics of a simple mechanical model and the macroscopic wave motion that the model can support.

For simplicity, we first consider a one-dimensional chain of N particles each of mass m. The particles are coupled by massless springs with force constant k. The equilibrium separation between the particles is a. We denote the displacement of particle j from its equilibrium position at time t by u_j(t) (see Figure 9.1). For many purposes the most realistic boundary conditions are to attach particles j = 1 and j = N to springs which are attached to fixed walls. We denote the walls by j = 0 and j = N + 1 and require that u_0(t) = u_{N+1}(t) = 0.

The force on an individual particle is determined by the compression or extension of its adjacent springs. The equation of motion of particle j is given by

$$m\frac{d^2u_j(t)}{dt^2} = -k\,[u_j(t)-u_{j+1}(t)] - k\,[u_j(t)-u_{j-1}(t)] = -k\,[2u_j(t)-u_{j+1}(t)-u_{j-1}(t)]. \tag{9.1}$$

Equation (9.1) couples the motion of particle j to its two nearest neighbors and describes longitudinal oscillations; that is, motion along the length of the system. It is straightforward to show that identical equations hold for the transverse oscillations of N identical mass points equally spaced on a stretched massless string (cf. French).

Because the equations of motion (9.1) are linear, that is, only terms proportional to the displacements appear, it is straightforward to obtain analytic solutions of (9.1).

Figure 9.1: A one-dimensional chain of N particles of mass m coupled by massless springs with force constant k. The first and last particles (0 and N + 1) are attached to fixed walls. The top chain shows the oscillators in equilibrium. The bottom chain shows the oscillators displaced from equilibrium.

We first discuss these solutions because they will help us interpret the nature of the numerical solutions. To find the normal modes, we look for oscillatory solutions for which the displacement of each particle is proportional to sin ωt or cos ωt. We write

$$u_j(t) = u_j\cos\omega t \tag{9.2}$$

where u_j is the amplitude of the displacement of the jth particle. If we substitute the form (9.2) into (9.1), we obtain

$$-\omega^2 u_j = -\frac{k}{m}\,[2u_j - u_{j+1} - u_{j-1}]. \tag{9.3}$$

We next assume that the amplitude u_j depends sinusoidally on the distance ja:

$$u_j = C\sin qja \tag{9.4}$$

where the constants q and C will be determined. If we substitute (9.4) into (9.3), we find the following condition for ω:

$$-\omega^2\sin qja = -\frac{k}{m}\,[2\sin qja - \sin q(j-1)a - \sin q(j+1)a]. \tag{9.5}$$
We write sin q(j ± 1)a = sin qja cos qa ± cos qja sin qa and find that (9.4) is a solution if

$$\omega^2 = \frac{2k}{m}\,(1-\cos qa). \tag{9.6}$$

We need to find the values of the wavenumber q that satisfy the boundary conditions u_0 = 0 and u_{N+1} = 0. The former condition is automatically satisfied by assuming a sine instead of a cosine solution in (9.4). The latter boundary condition implies that

$$q = q_n = \frac{\pi n}{a(N+1)} \qquad \text{(fixed boundary conditions)} \tag{9.7}$$

where n = 1, ..., N. The corresponding possible values of the wavelength λ are related to q by q = 2π/λ, and the corresponding values of the angular frequencies are given by

$$\omega_n^2 = \frac{2k}{m}\,[1-\cos q_na] = \frac{4k}{m}\,\sin^2\frac{q_na}{2}, \tag{9.8}$$

or

$$\omega_n = 2\sqrt{\frac{k}{m}}\,\sin\frac{q_na}{2}. \tag{9.9}$$

The relation (9.9) between ω_n and q_n is known as a dispersion relation. A particular value of the integer n corresponds to the nth normal mode. We write the (time-independent) normal mode solutions as

$$u_{j,n} = C\sin q_nja. \tag{9.10}$$

The linear nature of the equation of motion (9.1) implies that the time dependence of the displacement of the jth particle can be written as a superposition of normal modes:

$$u_j(t) = C\sum_{n=1}^{N}\,\bigl(A_n\cos\omega_nt + B_n\sin\omega_nt\bigr)\sin q_nja. \tag{9.11}$$

The coefficients A_n and B_n are determined by the initial conditions:

$$u_j(t=0) = C\sum_{n=1}^{N} A_n\sin q_nja \tag{9.12a}$$

$$v_j(t=0) = C\sum_{n=1}^{N} \omega_nB_n\sin q_nja. \tag{9.12b}$$

To solve (9.12) for A_n and B_n, we note that the normal mode solutions u_{j,n} are orthogonal; that is, they satisfy the condition

$$\sum_{j=1}^{N} u_{j,n}\,u_{j,m} \propto \delta_{n,m}. \tag{9.13}$$

The Kronecker δ symbol δ_{n,m} = 1 if n = m and is zero otherwise. It is convenient to normalize the u_{j,n} so that they are orthonormal; that is,

$$\sum_{j=1}^{N} u_{j,n}\,u_{j,m} = \delta_{n,m}. \tag{9.14}$$

It is easy to show that the choice C = 1/\sqrt{(N+1)/2} in (9.4) and (9.10) ensures that (9.14) is satisfied.

We now use the orthonormality condition (9.14) to determine the A_n and B_n coefficients. If we multiply both sides of (9.12) by C sin q_mja, sum over j, and use the orthogonality condition (9.14), we obtain

$$A_n = C\sum_{j=1}^{N} u_j(0)\sin q_nja \tag{9.15a}$$

$$B_n = C\sum_{j=1}^{N}\,\bigl(v_j(0)/\omega_n\bigr)\sin q_nja. \tag{9.15b}$$

For example, if the initial displacement of every particle is zero, and the initial velocity of every particle is zero except for v_1(0) = 1, we find A_n = 0 for all n, and

$$B_n = \frac{C}{\omega_n}\,\sin q_na. \tag{9.16}$$

The corresponding solution for u_j(t) is

$$u_j(t) = \frac{2}{N+1}\sum_{n=1}^{N}\frac{1}{\omega_n}\,\sin\omega_nt\,\sin q_na\,\sin q_nja. \tag{9.17}$$

What is the solution if the particles start in a normal mode; that is, u_j(t = 0) ∝ sin q_2ja?
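The sums in (9.15) are easy to evaluate numerically. The following is a minimal, self-contained sketch (ours, not part of the book's classes; the method and array names are our own) that computes the coefficients A_n and B_n from given initial displacements and velocities, assuming k/m = 1:

// Sketch (ours): normal mode coefficients from (9.15) for a chain with
// fixed walls. u0[j] and v0[j] hold the initial displacement and velocity
// of particle j for j = 1,...,N (index 0 is unused); a is the spacing.
public static double[][] modeCoefficients(double[] u0, double[] v0, double a) {
   int N = u0.length-1;
   double C = Math.sqrt(2.0/(N+1)); // normalization from (9.14)
   double[] A = new double[N+1], B = new double[N+1];
   for(int n = 1; n<=N; n++) {
      double qn = Math.PI*n/(a*(N+1));   // wavenumber (9.7)
      double omega = 2*Math.sin(qn*a/2); // dispersion relation (9.9), k/m = 1
      for(int j = 1; j<=N; j++) {
         A[n] += C*u0[j]*Math.sin(qn*j*a);            // (9.15a)
         B[n] += C*(v0[j]/omega)*Math.sin(qn*j*a);    // (9.15b)
      }
   }
   return new double[][] {A, B};
}

Substituting the returned coefficients into (9.11) reproduces the displacement of every particle at any later time.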
The Oscillators class in Listing 9.1 displays the analytic solution (9.11) of the oscillator displacements. The draw method uses a single circle that is repeatedly set equal to the appropriate world coordinates. The initial positions are calculated and stored in the x array in the Oscillators constructor. When an oscillator is drawn, the position array is multiplied by the given mode's sinusoidal (phase) factor to produce a time-dependent displacement.

Listing 9.1: The Oscillators class models the time evolution of a normal mode of a chain of coupled oscillators.

package org.opensourcephysics.sip.ch09;
import java.awt.Graphics;
import org.opensourcephysics.display.*;

public class Oscillators implements Drawable {
   OscillatorsMode normalMode;
   Circle circle = new Circle();
   double[] x; // drawing positions
   double[] u; // displacements
   double time = 0;

   public Oscillators(int mode, int N) {
      u = new double[N+2]; // includes the two ends of the chain
      x = new double[N+2]; // includes the two ends of the chain
      normalMode = new OscillatorsMode(mode, N);
      double xi = 0;
      for(int i = 0; i<x.length; i++) {
…

… or a stand-alone program such as Octave can be used to obtain the solutions.

9.2 Numerical Solutions

Because we are also interested in the effects of nonlinear forces between the particles, for which the matrix approach is inapplicable, we study the numerical solution of the equations of motion (9.1) directly.

To use the ODE interface, we need to remember that the ordering of the variables in the coupled oscillator state array is important because the implementations of some ODE solvers, such as Verlet and Euler–Richardson, make explicit assumptions about the ordering. Our standard ordering is to follow a variable by its derivative. For example, the state vector of an N oscillator chain is ordered as {u_0, v_0, u_1, v_1, ..., u_N, v_N, u_{N+1}, v_{N+1}, t}. Note that the state array includes variables for the chain's end points although the velocity rate corresponding to the end points is always zero. We include the time as the last variable because we will sometimes model time-dependent external forces. With this ordering, the getRate method is implemented as follows:

static final double OMEGA_SQUARED = 1; // equals k/m

public void getRate(double[] state, double[] rate) {
   for(int i = 1, N = x.length-1; i<N; i++) {
      rate[2*i] = state[2*i+1]; // du/dt = v
      rate[2*i+1] = OMEGA_SQUARED*(state[2*i+2]-2*state[2*i]+state[2*i-2]);
   }
   rate[0] = rate[1] = 0;                         // fixed end points
   rate[state.length-3] = rate[state.length-2] = 0;
   rate[state.length-1] = 1;                      // time rate
}

… frequencies with k > N/2 are greater than the Nyquist frequency ω_Q. We can interpret the frequencies for k > N/2 as negative frequencies equal to (k − N)ω_0 (see Problem 9.13). The occurrence of negative frequency components is a consequence of the use of the exponential functions rather than sines and cosines. Note that f(t) is real if g(−ω_k) = g(ω_k)* because the sin ω_k t terms in (9.37) cancel due to symmetry.

The calculation of a single Fourier coefficient using (9.30) requires approximately O(N) multiplications. Because the complete Fourier transform contains N complex coefficients, the calculation requires O(N²) multiplications and may require hours to complete if the sample contains just a few megabytes of data. Because many of the calculations are redundant, it is possible to organize the calculation so that the computational time is order N log N. Such an algorithm is called a fast Fourier transform (FFT) and is discussed in Appendix 9B. The improvement in speed is dramatic. A dataset containing 10^6 points requires ≈ 6 × 10^6 multiplications rather than ≈ 10^12.

Because we will use this algorithm to study diffraction and other phenomena, and because coding this algorithm is nontrivial, we have provided an implementation of the FFT in the Open Source Physics numerics package. We can use this FFT class to transform between time and frequency or position and wavenumber. The FFTApp program shows how the FFT class is used.

Listing 9.7: The FFTApp program computes the fast Fourier transform of a function and displays the coefficients.

package org.opensourcephysics.sip.ch09;
import java.text.DecimalFormat;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.numerics.FFT;
public class FFTApp extends AbstractCalculation {
   public void calculate() {
      DecimalFormat decimal = new DecimalFormat("0.0000"); // output format
      int N = 8;                         // number of Fourier coefficients
      double[] z = new double[2*N];      // array that will be transformed
      FFT fft = new FFT(N);              // FFT implementation for N points
      int mode = control.getInt("mode"); // mode or harmonic of e^(i x)
      double x = 0, delta = 2*Math.PI/N;
      for(int i = 0; i<N; i++) {
…

… ω_Qb > ω_Qa and the corresponding sampling times Δ_b < Δ_a. The calculation with Δ = Δ_b is more accurate because the sampling time is smaller. Suppose that this calculation of the spectrum yields the result that P(ω > ω_a) > 0. What happens if we compute the power spectrum using Δ = Δ_a? The power associated with ω > ω_a must be "folded" back into the ω < ω_a frequency components. For example, the frequency component at ω + ω_a is added to the true value at ω − ω_a to produce an incorrect value at ω − ω_a in the computed power spectrum. This phenomenon is called aliasing and leads to spurious results. Aliasing occurs in calculations of P(ω) if the latter does not vanish above the Nyquist frequency. To avoid aliasing, it is necessary to sample more frequently or to remove the high frequency components from the signal before sampling the data.

Although the power spectrum can be computed by a simple modification of AnalyzeApp, it is a good idea to use the FFT for many of the following problems. A sketch of a direct power spectrum computation follows Problem 9.19.

Problem 9.18. Aliasing

Sample the sinusoidal function sin 2πt and display the resulting power spectrum using sampling frequencies above and below the Nyquist frequency. Start with a sampling time of Δ = 0.1 and increase the time until Δ = 10.0.

(a) Is the power spectrum sharp? That is, is all the power located in a single frequency? Does your answer depend on the ratio of the period to the sampling time?

(b) Explain the appearance of the power spectrum for Δ = 1.25, Δ = 1.75, and Δ = 2.5.

(c) What is the power spectrum if you sample at the Nyquist frequency or twice the Nyquist frequency?

Problem 9.19. Examples of power spectra

(a) Create a data set with N points corresponding to f(t) = 0.3 cos(2πt/T) + r, where r is a uniform random number between 0 and 1 and T = 4. Plot f(t) versus t in time intervals of Δ = 4T/N for N = 128. Can you visually detect the periodicity? Compute the power spectrum using the same sampling interval Δ = 4T/N. Does the frequency dependence of the power spectrum indicate that there are any special frequencies? Repeat with T = 16. Are high or low frequency signals easier to pick out from the random background?

(b) Simulate a one-dimensional random walk and compute the time series x²(t), where x(t) is the distance from the origin of the walk after t steps. Average x²(t) over several trials. Compute the power spectrum for a walk of t ≤ 256. In this case Δ = 1, the time between steps. Do you observe any special frequencies?

(c) Let f_n be the nth member of a random number sequence. As in part (b), Δ = 1. Compute the power spectrum of the random number generator. Do you detect any periodicities? If so, is the random number generator acceptable?
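For Problems 9.18 and 9.19 it may help to see the power spectrum written out explicitly. The following self-contained sketch (ours) uses a direct O(N²) discrete Fourier sum rather than the FFT class, which is slower but has no dependencies; the method name and normalization are our own choices:

// Sketch (ours): power spectrum of sin(2 pi t) sampled at interval delta.
public static double[] powerSpectrum(double delta, int N) {
   double[] f = new double[N];
   for(int i = 0; i<N; i++) {
      f[i] = Math.sin(2*Math.PI*i*delta); // sample sin(2 pi t) at t = i*delta
   }
   double[] P = new double[N];
   for(int k = 0; k<N; k++) {             // frequency omega_k = 2 pi k/(N*delta)
      double re = 0, im = 0;
      for(int i = 0; i<N; i++) {
         double arg = 2*Math.PI*k*i/N;
         re += f[i]*Math.cos(arg);
         im -= f[i]*Math.sin(arg);
      }
      P[k] = (re*re+im*im)/N;             // components with k > N/2 are the
   }                                      // negative frequency components
   return P;
}

If Δ is larger than half the signal's period, the true frequency lies above ω_Q, and its peak appears folded back at a lower frequency, which is the aliasing explored in Problem 9.18.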
Problem 9.20. Power spectrum of coupled oscillators

(a) Modify your program developed in Problem 9.2 so that the power spectrum of the position of one of the N particles is computed at the end of the simulation. Set Δ = 0.1 so that the Nyquist frequency is ω_Q = π/Δ ≈ 31.4. Choose the time of the simulation equal to T = 25.6 and let k/m = 1. Plot the power spectrum P(ω) at frequency intervals equal to Δω = ω_0 = 2π/T. First choose N = 2 and choose the initial conditions so that the system is in a normal mode. What do you expect the power spectrum to look like? What do you find? Then choose N = 10 and choose initial conditions corresponding to various normal modes. Is the power spectrum the same for all particles?

(b) Repeat part (a) for N = 2 and N = 10 with random initial particle displacements between −0.5 and +0.5 and zero initial velocities. Can you detect all the normal modes in the power spectrum? Repeat for a different set of random initial displacements.

(c) Repeat part (a) for initial displacements corresponding to the equal sum of two normal modes. Does the power spectrum show two peaks? Are these peaks of equal height?

(d) Recompute the power spectrum for N = 10 with T = 6.4. Is this time long enough? How can you tell?

Problem 9.21. Quasiperiodic power spectra

(a) Write a program to compute the power spectrum of the circle map (6.62). Begin by exploring the power spectrum for K = 0. Plot ln P(ω) versus ω, where P(ω) is proportional to the modulus squared of the Fourier transform of x_n. Begin with 256 iterations. How do the power spectra differ for rational and irrational values of the parameter Ω? How are the locations of the peaks in the power spectra related to the value of Ω?

(b) Set K = 1/2 and compute the power spectra for 0 < Ω < 1. Do the power spectra differ from the spectra found in part (a)?

(c) Set K = 1 and compute the power spectra for 0 < Ω < 1. How do the power spectra compare to those found in parts (a) and (b)?

In Problem 9.20 we found that the peaks in the power spectrum yield information about the normal mode frequencies. In Problems 9.22 and 9.23 we compute the power spectra for a system of coupled oscillators with disorder. Disorder can be generated by having random masses or random spring constants (or both). We will see that one effect of disorder is that the normal modes are no longer simple sinusoidal functions. Instead, some of the modes are localized, meaning that only some of the particles move significantly while the others remain essentially at rest. This effect is known as Anderson localization. Typically, we find that modes above a certain frequency are localized, and those below this threshold frequency are extended. The threshold frequency is well defined for large systems. All states are localized in the limit of an infinite chain with any amount of disorder. The dependence of localization on disorder in systems of coupled oscillators in higher dimensions is more complicated.

Problem 9.22. Localization with a single defect

(a) Modify your program developed in Problem 9.2 so that the mass of one oscillator is equal to one fourth that of the others. Set N = 20 and use fixed boundary conditions. Compute the power spectrum over a time T = 51.2 using random initial displacements between −0.5 and +0.5 and zero initial velocities. Sample the data at intervals of Δ = 0.1. The normal mode frequencies correspond to the well-defined peaks in P(ω). Consider at least three different sets of random initial displacements to ensure that you find all the normal mode frequencies.

(b) Apply an external force F_e = 0.3 sin ωt to each particle; a sketch of the modified rate equation follows Problem 9.23. (The steady-state behavior occurs sooner if we apply an external force to each particle instead of just one particle.)
Because the external force pumps energy into the system, it is necessary to add a damping force to prevent the oscillator displacements from becoming too large. Add a damping force equal to −γv_i to all the oscillators with γ = 0.1. Choose random initial displacements and zero initial velocities and use the frequencies found in part (a) as the driving frequencies ω. Describe the motion of the particles. Is the system driven to a normal mode? Take a "snapshot" of the particle displacements after the system has run for a sufficiently long time, so that the patterns repeat themselves. Are the particle displacements simple sinusoidal functions? Sketch the approximate normal mode patterns for each normal mode frequency. Which of the modes appear localized and which modes appear to be extended? What is the approximate cutoff frequency that separates the localized from the extended modes?

Problem 9.23. Localization in a disordered chain of oscillators

(a) Modify your program so that the spring constants can be varied by the user. Set N = 10 and use fixed boundary conditions. Consider the following set of 11 spring constants: 0.704, 0.388, 0.707, 0.525, 0.754, 0.721, 0.006, 0.479, 0.470, 0.574, 0.904. To help you determine all the normal modes, we provide two of the normal mode frequencies: ω ≈ 0.28 and 1.15. Find the power spectrum using the procedure outlined in Problem 9.22a.

(b) Apply an external force F_e = 0.3 sin ωt to each particle and find the normal modes as outlined in Problem 9.22b.

(c) Repeat parts (a) and (b) for another set of random spring constants for N = 40. Discuss the nature of the localized modes in terms of the specific values of the spring constants. For example, is the edge of a localized mode at a spring that has a relatively large or small spring constant?

(d) Repeat parts (a) and (b) for uniform spring constants but random masses between 0.5 and 1.5. Is there a qualitative difference between the two types of disorder?
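The modification needed in Problems 9.22b and 9.23b amounts to adding the damping and driving terms to the acceleration in the rate equation. The following sketch is ours and assumes the state ordering {u_0, v_0, u_1, v_1, ..., u_{N+1}, v_{N+1}, t} introduced in Section 9.2, with uniform k/m = 1; for Problems 9.22 and 9.23 you would also divide the spring force by the appropriate mass or use the appropriate spring constants:

// Sketch (ours): getRate with damping -gamma*v and drive f0*sin(omega*t),
// for the state ordering {u0, v0, u1, v1, ..., uN+1, vN+1, t}.
double gamma = 0.1, f0 = 0.3, omega = 1.0; // damping, drive amplitude, frequency

public void getRate(double[] state, double[] rate) {
   double t = state[state.length-1];
   int last = (state.length-1)/2-1;          // index of wall particle N+1
   for(int i = 1; i<last; i++) {             // interior particles 1,...,N
      rate[2*i] = state[2*i+1];              // du_i/dt = v_i
      rate[2*i+1] = state[2*i+2]-2*state[2*i]+state[2*i-2] // spring force, k/m = 1
                    -gamma*state[2*i+1]                    // damping
                    +f0*Math.sin(omega*t);                 // external drive
   }
   rate[0] = rate[1] = rate[2*last] = rate[2*last+1] = 0;  // walls do not move
   rate[state.length-1] = 1;                               // dt/dt = 1
}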
In 1955 Fermi, Pasta, and Ulam used the Maniac I computer at Los Alamos to study a chain of oscillators. Their surprising discovery might have been the first time a qualitatively new result, instead of a more precise number, was found from a simulation. To understand their results, we need to discuss an idea from statistical mechanics that was discussed in Project 8.23. A fundamental assumption of statistical mechanics is that an isolated system of particles is quasi-ergodic; that is, the system will evolve through all configurations consistent with the conservation of energy. A system of linearly coupled oscillators is not quasi-ergodic, because if the system is initially in a normal mode, it stays in that normal mode forever. Before 1955 it was believed that if the interaction between the particles is weakly nonlinear (and the number of particles is sufficiently large), the system would be quasi-ergodic and evolve through the different normal modes of the linear system. In Problem 9.24 we will find, as did Fermi, Pasta, and Ulam, that the behavior of the system is much more complicated. The question of ergodicity in this system is known as the FPU problem.

Problem 9.24. Nonlinear oscillators

(a) Modify your program so that cubic forces between the particles are added to the linear spring forces. That is, let the force on particle i due to particle j be

$$F_{ij} = -(u_i-u_j) - \alpha\,(u_i-u_j)^3 \tag{9.48}$$

where α is the amplitude of the nonlinear term. Choose the masses of the particles to be unity. Consider N = 10 and choose initial displacements corresponding to a normal mode of the linear (α = 0) system. Compute the power spectrum over a time T = 51.2 with Δ = 0.1 for α = 0, 0.1, 0.2, and 0.3. For what value of α does the system become ergodic; that is, for what value of α are the heights of all the normal mode peaks approximately the same?

(b) Repeat part (a) for the case where the displacements of the particles are initially random. Use the same set of random displacements for each value of α.

(c)* We now know that the number of oscillators is not as important as the magnitude of the nonlinear interaction. Repeat parts (a) and (b) for N = 20 and 40 and discuss the effect of increasing the number of particles.

9.7 Wave Motion

Our simulations of coupled oscillators have shown that the microscopic motion of the individual oscillators leads to macroscopic wave phenomena. To understand the transition between microscopic and macroscopic phenomena, we reconsider the oscillations of a linear chain of N particles with equal spring constants k and equal masses m. As we found in Section 9.1, the equations of motion of the particles can be written as [see (9.1)]

$$\frac{d^2u_j(t)}{dt^2} = -\frac{k}{m}\,[2u_j(t)-u_{j+1}(t)-u_{j-1}(t)] \qquad (j = 1,\ldots,N). \tag{9.49}$$

We consider the limits N → ∞ and a → 0 with the length of the chain Na fixed. We will find that the discrete equations of motion (9.49) can be replaced by the continuous wave equation

$$\frac{\partial^2u(x,t)}{\partial t^2} = c^2\,\frac{\partial^2u(x,t)}{\partial x^2} \tag{9.50}$$

where c has the dimension of velocity. We obtain the wave equation (9.50) as follows. First we replace u_j(t), where j is a discrete variable, by the function u(x,t), where x is a continuous variable, and rewrite (9.49) in the form

$$\frac{\partial^2u(x,t)}{\partial t^2} = \frac{ka^2}{m}\,\frac{1}{a^2}\,\bigl[u(x+a,t)-2u(x,t)+u(x-a,t)\bigr]. \tag{9.51}$$

We have written the time derivative as a partial derivative because the function u depends on two variables. If we use the Taylor series expansion

$$u(x\pm a) = u(x) \pm a\,\frac{du}{dx} + \frac{a^2}{2}\,\frac{d^2u}{dx^2} + \ldots \tag{9.52}$$

it is easy to show that as a → 0, the quantity

$$\frac{1}{a^2}\,\bigl[u(x+a,t)-2u(x,t)+u(x-a,t)\bigr] \to \frac{\partial^2u(x,t)}{\partial x^2}. \tag{9.53}$$

(We have written a spatial derivative as a partial derivative for the same reason as before.) The wave equation (9.50) is obtained by substituting (9.53) into (9.51) with c² = ka²/m. If we introduce the linear mass density µ = m/a and the tension T = ka, we can express c in terms of µ and T and obtain the familiar result c² = T/µ.

It is straightforward to show that any function of the form f(x ± ct) is a solution to (9.50). Among these many solutions to the wave equation are the familiar forms:

$$u(x,t) = A\cos\frac{2\pi}{\lambda}(x\pm ct) \tag{9.54a}$$

$$u(x,t) = A\sin\frac{2\pi}{\lambda}(x\pm ct). \tag{9.54b}$$

Because the wave equation is linear and hence satisfies a superposition principle, we can understand the behavior of a wave of arbitrary shape by representing its shape as a sum of sinusoidal waves.

One way to solve the wave equation (9.50) numerically is to retrace our steps back to the discrete equations (9.49) to find a discrete form of the wave equation that is convenient for numerical calculations. The conversion of a continuum equation to a physically motivated discrete form frequently leads to useful numerical algorithms. From (9.53) we see how to approximate the second derivative by a finite difference. If we replace a by Δx and take Δt to be the time step, we can rewrite (9.49) by

$$\frac{1}{(\Delta t)^2}\,\bigl[u(x,t+\Delta t)-2u(x,t)+u(x,t-\Delta t)\bigr] = \frac{c^2}{(\Delta x)^2}\,\bigl[u(x+\Delta x,t)-2u(x,t)+u(x-\Delta x,t)\bigr]. \tag{9.55}$$

The quantity Δx is the spatial interval. The result of solving (9.55) for u(x, t+Δt) is

$$u(x,t+\Delta t) = 2(1-b)\,u(x,t) + b\,\bigl[u(x+\Delta x,t)+u(x-\Delta x,t)\bigr] - u(x,t-\Delta t) \tag{9.56}$$

where b ≡ (cΔt/Δx)². Equation (9.56) expresses the displacements at time t + Δt in terms of the displacements at the current time t and at the previous time t − Δt.
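A direct translation of (9.56) into code uses three arrays for the three time levels. The following minimal sketch is ours (the array names are our own; periodic boundary conditions are imposed here with modular indexing rather than extra array elements):

// Sketch (ours): one time step of the discrete wave equation (9.56)
// with periodic boundary conditions; b = (c*dt/dx)^2.
public static void step(double[] unext, double[] u, double[] uprev, double b) {
   int N = u.length;
   for(int j = 0; j<N; j++) {
      int left = (j-1+N)%N, right = (j+1)%N; // periodic neighbors
      unext[j] = 2*(1-b)*u[j]+b*(u[right]+u[left])-uprev[j];
   }
}

After each step the arrays are cycled: uprev takes the values of u, and u those of unext. For b = 1 the first term vanishes and the update simplifies considerably, a point explored in Problem 9.25.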
Problem 9.25. Solution of the discrete wave equation

(a) Write a program to compute the numerical solutions of the discrete wave equation (9.56). Three spatial arrays corresponding to u(x) at times t + Δt, t, and t − Δt are needed. Denote the displacement u(jΔx) by the array element u[j] where j = 0, ..., N+1. Use periodic boundary conditions so that u_0 = u_N and u_1 = u_{N+1}. Draw lines between the displacements at neighboring values of x. Note that the initial conditions require the specification of u at t = 0 and at t = −Δt. Let the waveform at t = 0 and t = −Δt be u(x, t = 0) = e^{−(x−10)²} and u(x, t = −Δt) = e^{−(x−10+cΔt)²}, respectively. What is the direction of motion implied by these initial conditions?

(b) Our first task is to determine the optimum value of the parameter b. Let Δx = 1 and N ≥ 100 and try the following combinations of c and Δt: c = 1, Δt = 0.1; c = 1, Δt = 0.5; c = 1, Δt = 1; c = 1, Δt = 1.5; c = 2, Δt = 0.5; and c = 2, Δt = 1. Verify that the value b = (cΔt)² = 1 leads to the best results; that is, for this value of b, the initial form of the wave is preserved.

(c) It is possible to show that the discrete form of the wave equation with b = 1 is exact up to numerical roundoff error (cf. DeVries). Hence, we can replace (9.56) by the simpler algorithm

$$u(x,t+\Delta t) = u(x+\Delta x,t) + u(x-\Delta x,t) - u(x,t-\Delta t). \tag{9.57}$$

That is, the solutions of (9.57) are equivalent to the solutions of the original partial differential equation (9.50). Try several different initial waveforms and show that if the displacements have the form f(x ± ct), the waveform maintains its shape with time. For the remaining problems, we will use (9.57) corresponding to b = 1. Unless otherwise specified, choose c = 1, Δx = Δt = 1, and N ≥ 100 in the following problems.

Problem 9.26. Velocity of waves

(a) Use the waveform given in Problem 9.25a and verify that the speed of the wave is unity by determining the distance traveled in a given amount of time. Because we have set Δx = Δt = 1 and b = 1, the speed c = 1. (A way of incorporating different values of c is discussed in Problem 9.27c.)

(b) Replace the waveform considered in part (a) by a sinusoidal wave that fits exactly; that is, choose u(x,t) = sin(qx − ωt) such that sin q(N+1) = 0. Measure the period T of the wave by measuring the time it takes for successive maxima to pass a given point. What is the wavelength λ of the wave? Does it depend on the value of q? The frequency of the wave is given by f = 1/T. Verify that λf = c.

Problem 9.27. Reflection of waves

(a) Consider a wave of the form u(x,t) = e^{−(x−10−ct)²}. Use fixed boundary conditions so that u_0 = u_{N+1} = 0. What happens to the reflected wave?

(b) Modify your program so that free boundary conditions are incorporated: u_0 = u_1 and u_N = u_{N+1}. Compare the phase of the reflected wave to your result from part (a).

(c) What happens to a pulse at the boundary between two media? Set c = 1 and Δt = 1 on the left side of your grid and c = 2 and Δt = 0.5 on the right side. These choices of c and Δt imply that b = 1 on both sides, but that the right side is updated twice as often as the left side. What happens to a pulse that begins on the left side and moves to the right?
Is there both a reflected and transmitted wave at the boundary between the two media? What is their relative phase? Find a relation between the amplitude of the incident pulse and the amplitudes of the reflected and transmitted pulses. Repeat for a pulse starting from the right side.

Problem 9.28. Superposition of waves

(a) Consider the propagation of the wave determined by u(x, t = 0) = sin(4πx/N). What must u(x, −Δt) be so that the wave moves in the positive x direction? Test your answer by doing a simulation. Use periodic boundary conditions. Repeat for a wave moving in the negative x direction.

(b) Simulate two waves moving in opposite directions each with the same spatial dependence given by u(x,0) = sin(4πx/N). Describe the resultant wave pattern. Repeat the simulation for u(x,0) = sin(8πx/N).

(c) Assume that u(x,0) = sin q₁x + sin q₂x with q₁ = 10π/N and q₂ = 12π/N. Describe the qualitative form of u(x,t) for fixed t. What is the distance between modulations of the amplitude? Estimate the wavelength associated with the fine ripples of the amplitude. Estimate the wavelength of the envelope of the wave. Find a simple relation for these two wavelengths in terms of the wavelengths of the two sinusoidal terms. This phenomenon is known as beats.

(d) Consider the motion of the two Gaussian pulses moving in opposite directions, u₁(x,0) = e^{−(x−10)²} and u₂(x,0) = e^{−(x−90)²}. Choose the array at t = −Δt as in Problem 9.25. What happens to the two pulses when they overlap or partially overlap? Do they maintain their shape? While they are going through each other, is the displacement u(x,t) given by the sum of the displacements of the individual pulses?

Problem 9.29. Standing waves

(a) In Problem 9.28b we considered a standing wave, the continuum analog of a normal mode of a system of coupled oscillators. As is the case for normal modes, each point of the wave has the same time dependence. For fixed boundary conditions, the displacement is given by u(x,t) = sin qx cos ωt, where ω = cq, and the wavenumber q is chosen so that sin qN = 0. Choose an initial condition corresponding to a standing wave for N = 100. Describe the motion of the particles and compare it with your observations of standing waves on a rope.

(b) Establish a standing wave by displacing one end of a system periodically. The other end is fixed. Let u(x,0) = u(x,−Δt) = 0, and u(x = 0, t) = A sin ωt with A = 0.1. How long must the simulation run before you observe standing waves? How large is the standing wave amplitude?

We have seen that the wave equation can support pulses that propagate indefinitely without distortion. In addition, because the wave equation is linear, the sum of any two solutions is also a solution, and the principle of superposition is satisfied. As a consequence, we know that two pulses can pass through each other unchanged. We have also seen that similar phenomena exist in the discrete system of linearly coupled oscillators. What happens if we create a pulse in a system of nonlinear oscillators? As an introduction to nonlinear wave phenomena, we consider a system of N coupled oscillators with the potential energy of interaction given by

$$V = \frac{1}{2}\sum_{j=1}^{N}\,\bigl[e^{-(u_j-u_{j-1})}-1\bigr]^2. \tag{9.58}$$

This form of the interaction is known as the Morse potential. All parameters in the potential (such as the overall strength of the potential) have been set to unity. The force on the jth particle is

$$F_j = -\frac{\partial V}{\partial u_j} = Q_{j+1}(1-Q_{j+1}) - Q_j(1-Q_j) \tag{9.59a}$$

where

$$Q_j = e^{-(u_j-u_{j-1})}. \tag{9.59b}$$
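In code, the force (9.59) can be evaluated in one pass over the particles. The following sketch is ours; for simplicity it uses fixed walls u[0] = u[N+1] = 0, whereas Problem 9.30 asks for periodic boundary conditions, which require only changing the neighbor indexing:

// Sketch (ours): forces from the Morse interaction (9.59) for a chain
// with fixed walls; Q[j] = exp(-(u[j]-u[j-1])).
public static double[] morseForces(double[] u) {
   int N = u.length-2;                     // u[0] and u[N+1] are the walls
   double[] Q = new double[N+2];
   for(int j = 1; j<=N+1; j++) {
      Q[j] = Math.exp(-(u[j]-u[j-1]));
   }
   double[] F = new double[N+2];
   for(int j = 1; j<=N; j++) {
      F[j] = Q[j+1]*(1-Q[j+1])-Q[j]*(1-Q[j]); // F_j = -dV/du_j from (9.59a)
   }
   return F;
}

For small displacements Q_j ≈ 1 − (u_j − u_{j−1}), so F_j reduces to the linear chain force (9.1) with k = 1, which is a useful check of a working program.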
In linear systems it is possible to set up a pulse of any shape and maintain the shape of the pulse indefinitely. In a nonlinear system, there also exist solutions that maintain their shape, but we will find in Problem 9.30 that not all pulse shapes do so. The pulses that maintain their shape are called solitons.

Problem 9.30. Solitons

(a) Modify the program developed in Problem 9.2 so that the force on particle j is given by (9.59). Use periodic boundary conditions. Choose N ≥ 60 and an initial pulse of the form u(x,t) = 0.5 e^{−(x−10)²}. You should find that the initial pulse splits into two pulses plus some noise. Describe the motion of the pulses (solitons). Do they maintain their shape, or is this shape modified as they move? Describe the motion of the particles far from the pulse. Are they stationary?

(b) Save the displacements of the particles when the peak of one of the solitons is located near the center of your display. Is it possible to fit the shape of the soliton to a Gaussian? Continue the simulation, and after one of the solitons is relatively isolated, set u(j) = 0 for all j far from this soliton. Does the soliton maintain its shape?

(c) Repeat part (b) with a pulse given by u(x,0) = 0 everywhere except for u(20,0) = u(21,0) = 1. Do the resulting solitons have the same shape as in part (b)?

(d) Begin with the same Gaussian pulse as in part (a) and run until the two solitons are well separated. Then change at random the values of u(j) for particles in the larger soliton by about 5% and continue the simulation. Is the soliton destroyed? Increase the perturbation until the soliton is no longer discernible.

(e) Begin with a single Gaussian pulse as in part (a). The two resultant solitons will eventually "collide." Do the solitons maintain their shape after the collision? The principle of superposition implies that the displacement of the particles is given by the sum of the displacements due to each pulse. Does the principle of superposition hold for solitons?

(f) Compute the speeds, amplitudes, and widths of the solitons produced from a single Gaussian pulse. Take the amplitude of a soliton to be the largest value of its displacement and the half-width to correspond to the value of x at which the displacement is half its maximum value. Repeat these calculations for solitons of different amplitudes by choosing the initial amplitude of the Gaussian pulse to be 0.1, 0.3, 0.5, 0.7, and 0.9. Plot the soliton speed and width versus the corresponding soliton amplitude.

(g) Change the boundary conditions to free boundary conditions and describe the behavior of the soliton as it reaches a boundary. Compare this behavior with that of a pulse in a system of linear oscillators.

(h) Begin with an initial sinusoidal disturbance that would be a normal mode for a linear system. Does the sinusoidal mode maintain its shape? Compare the behavior of the nonlinear and linear systems.

9.8 Interference

Figure 9.3: The computed energy density in the vicinity of two point sources.

Interference is one of the most fundamental characteristics of all wave phenomena. The term interference is used when there are a small number of sources, and the term diffraction when the number of sources is large and can be treated as a continuum. Because it is relatively easy to observe interference and diffraction phenomena with light, we discuss these phenomena in this context. Consider the field from one or more point sources lying in a plane.
The electric field at position r associated with the light emitted from a monochromatic point source at r₁ is a spherical wave radiating from that point. This wave can be thought of as the real part of a complex exponential:

$$E(\mathbf{r},t) = \frac{A}{|\mathbf{r}-\mathbf{r}_1|}\,e^{i(q|\mathbf{r}-\mathbf{r}_1|-\omega t)} \tag{9.60}$$

where |r − r₁| is the distance between the source and the point of observation and q is the wavenumber 2π/λ. The superposition principle implies that the total electric field at r from N point sources at r_n is

$$E(\mathbf{r},t) = e^{-i\omega t}\sum_{n=1}^{N}\frac{A_n}{|\mathbf{r}-\mathbf{r}_n|}\,e^{iq|\mathbf{r}-\mathbf{r}_n|} = e^{-i\omega t}\,E(\mathbf{r}). \tag{9.61}$$

The time evolution can be expressed as an oscillatory complex exponential e^{−iωt} that multiplies a complex space part E(r). The spatial part E(r) is a phasor, which contains both the maximum value of the electric field and the time within a cycle when the physical field reaches its maximum value. As the system evolves, the complex electric field E(r,t) oscillates between purely real and purely imaginary values. Both the energy density (the energy per unit volume) and the light intensity (the energy passing through a unit area) are proportional to the square of the magnitude of the phasor. Because light fields oscillate at ≈ 6 × 10^14 Hz, typical optics experiments observe the time average (rms value) of E and do not observe the phase angle.

Huygens's principle states that each point on a wavefront (a surface of constant phase) can be treated as the source of a new spherical wave or Huygens's wavelet. The wavefront at some later time is constructed by summing these wavelets. The HuygensApp program implements Huygens's principle by assuming superposition from an arbitrary number of point sources and displaying a two-dimensional animation of (9.61) as shown in Figure 9.3 and described in Appendix 9C. Sources are represented by circles and are added to the frame when a custom button invokes the createSource method.

public void createSource() {
   InteractiveShape ishape = InteractiveShape.createCircle(0, 0, 0.5);
   frame.addDrawable(ishape);
   initPhasors();
   frame.repaint();
}

Users can create as many sources as they wish. The program later retrieves a list of sources from the frame using the latter's getDrawables method. The program uses n × n arrays to store the real and imaginary values. The code fragment from the initPhasors method shown in the following starts the process by obtaining a list of point sources in the frame. We then use an Iterator to access each source as we sum the vector components at each grid point.

ArrayList list = frame.getDrawables(); // gets list of point sources
Iterator it = list.iterator();         // creates an iterator for the list
// these two statements are combined in the final code

List and Iterator are interfaces implemented by the objects returned by frame.getDrawables and list.iterator, respectively. As the name implies, an iterator is a convenient way to access a list without explicitly counting its elements. The iterator's next method retrieves elements from the list, and the hasNext method returns true if the end of the list has not been reached. The initPhasors method in HuygensApp computes the phasor at each grid point by summing the contributions of all the sources. Note how the distance from the source to the observation point is computed by converting the grid's index values to world coordinates.
Iterator it = frame.getDrawables().iterator(); // source iterator
while(it.hasNext()) {
   InteractiveShape source = (InteractiveShape) it.next();
   double xs = source.getX(), ys = source.getY(); // world coordinates for source
   for(int ix = 0; ix<n; ix++) {
      double dx = frame.indexToX(ix)-xs; // grid point x separation
      for(int iy = 0; iy<n; iy++) {
         double dy = frame.indexToY(iy)-ys; // grid point y separation
         double r = Math.sqrt(dx*dx+dy*dy);
         realArray[ix][iy] += (r==0) ? 0 : Math.cos(PI2*r)/r;
         imagArray[ix][iy] += (r==0) ? 0 : Math.sin(PI2*r)/r;
      }
   }
}

To calculate the real and imaginary components of the phasor, the distance from the source to the grid point is determined in terms of the wavelength λ, and the time is determined in terms of the period T. For example, for green light one unit of distance is ≈ 5 × 10⁻⁷ m and one unit of time is ≈ 1.6 × 10⁻¹⁵ s. The simulation is performed by multiplying the phasors by e^{−iωt} in the doStep method. Multiplying each phasor by e^{−iωt} mixes the phasor's real and imaginary components. We then obtain the physical field from (9.61) by taking the real part:

$$E(\mathbf{r},t) = \mathrm{Re}\,\bigl[e^{-i\omega t}E(\mathbf{r})\bigr] = \mathrm{Re}\,[E]\cos\omega t + \mathrm{Im}\,[E]\sin\omega t. \tag{9.62}$$

Listing 9.10 shows the entire HuygensApp class. A custom button is used to create sources at the origin. Because the source is an InteractiveShape, it can be repositioned using the mouse. The program also implements the InteractiveMouseHandler interface to recalculate the phasors when the source is moved. (See Section 5.7 for a discussion of interactive handlers.)

Listing 9.10: The HuygensApp class simulates the energy density from one or more point sources.

package org.opensourcephysics.sip.ch09;
import java.util.*;
import java.awt.event.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.frames.*;

public class HuygensApp extends AbstractSimulation implements InteractiveMouseHandler {
   static final double PI2 = Math.PI*2;
   Scalar2DFrame frame = new Scalar2DFrame("x", "y", "Intensity from point sources");
   double time = 0;
   double[][] realPhasor, imagPhasor, amplitude;
   int n; // grid points on a side
   double a; // side length

   public HuygensApp() {
      frame.convertToInterpolatedPlot(); // interpolated plot looks best
      frame.setPaletteType(ColorMapper.RED);
      frame.setInteractiveMouseHandler(this);
   }

   public void initialize() {
      n = control.getInt("grid size");
      a = control.getDouble("length");
      frame.setPreferredMinMax(-a/2, a/2, -a/2, a/2);
      realPhasor = new double[n][n];
      imagPhasor = new double[n][n];
      amplitude = new double[n][n];
      frame.setAll(amplitude);
      initPhasors();
   }

   void initPhasors() {
      for(int ix = 0; ix<n; ix++) {
…

…

      if(radical>0) {
         double phase = z*Math.sqrt(radical);
         double real = Math.cos(phase);
         double imag = Math.sin(phase);
         double temp = cdata[offset+2*iy];
         cdata[offset+2*iy] = real*cdata[offset+2*iy]-imag*cdata[offset+2*iy+1];
         cdata[offset+2*iy+1] = real*cdata[offset+2*iy+1]+imag*temp;
      } else { // evanescent waves decay exponentially
         double decay = Math.exp(-z*Math.sqrt(-radical));
         cdata[offset+2*iy] *= decay;
         cdata[offset+2*iy+1] *= decay;
      }
   }
}
fft2d.inverse(cdata);
double max = 0;
for(int i = 0; i<…

…

if(N%2==0) { // N should be even
   pow++;
   N /= 2;
} else {
   throw new IllegalArgumentException(
      "Number of points in this FFT implementation must be even.");
}
…
int N2 = N/2;
int jj = N2;
// rearrange input according to bit reversal
for(int i = 1; i<…

…

         if(r2>0) {
            eField[0][ix][iy] += charge.q*dx/r3;
            eField[1][ix][iy] += charge.q*dy/r3;
         }
      }
   }
}
frame.setAll(eField);
}

public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
   panel.handleMouseAction(panel, evt); // panel moves the charge
   if(panel.getMouseAction()==InteractivePanel.MOUSE_DRAGGED) {
      calculateField(); // remove this line if user interface is sluggish
      panel.repaint();
   }
}

public static void main(String[] args) {
   CalculationControl.createApp(new ElectricFieldApp());
}
}

To make the program interactive, the ElectricFieldApp class implements the InteractiveMouseHandler to process mouse events when a charge is dragged. (See Section 5.7 for a discussion of interactive panels and interactive mouse handlers.) The class registers its interest in handling these events using the setInteractiveMouseHandler method. The handler passes the event to the panel to move the charge and then recalculates the field. Note that the Charge class in Listing 10.2 inherits from the InteractiveCircle class.²

²Dragging may become sluggish if too many computations are performed within the mouse action method.

Listing 10.2: The Charge class extends the InteractiveCircle class and adds the charge property.

package org.opensourcephysics.sip.ch10;
import java.awt.Color;
import org.opensourcephysics.display.InteractiveCircle;

public class Charge extends InteractiveCircle {
   double q = 0;

   public double getQ() {
      return q;
   }

   public Charge(double x, double y, double q) {
      super(x, y);
      this.q = q;
      if(q>0) {
         color = Color.red;
      } else {
         color = Color.blue;
      }
   }
}

Problem 10.1. Motion of a charged particle in an electric field

(a) Test ElectricFieldApp by adding one charge at a time at various locations. Do the electric field patterns look reasonable? For example, does the electric field point away from positive charges and toward negative charges? How well is the magnitude of the electric field represented?

(b) Modify ElectricFieldApp so that it uses AbstractSimulation to compute the motion of a test particle of mass m and charge q in the presence of the electric field created by a fixed distribution of point charges. That is, create a drawable test charge that implements the ODE interface and add it to the vector field frame. Use the same approach that was used for the trajectory problems in Chapter 5. The acceleration of the charge is given by qE/m, where E is the electric field due to the fixed point charges. Use a higher-order algorithm to advance the position and velocity of the particle. (Ignore the effects of radiation due to accelerating charges.)

(c) Assume that E is due to a charge q(1) = 1.5 fixed at the origin. Simulate the motion of a charged particle of mass m = 0.1 and charge q = 1 initially at x = 1, y = 0. Consider the following initial conditions for its velocity: v_x = 0, v_y = 0; v_x = 1, v_y = 0; v_x = 0, v_y = 1; and v_x = −1, v_y = 0. Is the trajectory of the particle tangent to the field vectors? Explain.
(d) Assume that the electric field is due to two fixed point charges: q(1) = 1 at x(1) = 2, y(1) = 0 and q(2) = −1 at x(2) = −2, y(2) = 0. Place a charged particle of unit mass and unit positive charge at x = 0.05, y = 0. What do you expect the motion of this charge to be? Do the simulation and determine the qualitative nature of the motion.

(e)* Consider the motion of a charged particle in the vicinity of the electric dipole defined in part (d). Choose the initial position to be five times the separation of the charges in the dipole. Do you find any bound orbits? Do you find any closed orbits, or do all orbits show some precession?

10.3 Electric Field Lines

Another way of visualizing the electric field is to draw electric field lines. The properties of these lines are as follows:

1. An electric field line is a directed line whose tangent at every position is parallel to the electric field at that position.

2. The lines are smooth and continuous except at singularities such as point charges. (It makes no sense to talk about the electric field at a point charge.)

3. The density of lines at any point in space is proportional to the magnitude of the field at that point. This property implies that the total number of electric field lines from a point charge is proportional to the magnitude of that charge. The value of the proportionality constant is chosen to provide the clearest pictorial representation of the field. The drawing of field lines is art plus science.

The FieldLineApp program draws electric field lines in two dimensions. The program makes extensive use of the FieldLine class, which implements the following algorithm:

1. Begin at a point (x,y) and compute the components E_x and E_y of the electric field vector E using (10.1).

2. Draw a small line segment of size Δs = |Δs| tangent to E at that point. The components of the line segment are given by
$$\Delta x = \Delta s\,\frac{E_x}{|\mathbf{E}|} \quad\text{and}\quad \Delta y = \Delta s\,\frac{E_y}{|\mathbf{E}|}. \tag{10.4}$$

3. Iterate the process beginning at the new point (x+Δx, y+Δy). Continue until the field line approaches a point charge singularity or escapes toward infinity.

This field line algorithm is equivalent to solving the following differential equations:

$$\frac{dx}{ds} = \frac{E_x}{|\mathbf{E}|} \tag{10.5a}$$

$$\frac{dy}{ds} = \frac{E_y}{|\mathbf{E}|}. \tag{10.5b}$$

Because a field line extends in both directions from the algorithm's starting point, the computation must be repeated in the (−E_x/|E|, −E_y/|E|) direction to obtain a complete visualization of the field line. Note that this algorithm draws a correct field line but does not draw a collection of field lines with a density proportional to the field intensity.

To draw the field lines, a computation starts when a user double clicks in the panel and ends when the field line approaches a point charge or when the magnitude of the field becomes too small. Although we can easily describe these stopping conditions, we do not know how long the computation will take, and we might want to compute multiple field lines simultaneously. An elegant way to do this computation is to use threads. As we discussed in Section 2.6, Java programs can have multiple threads to separate and organize related tasks. A thread is an independent task within a single program that shares the program's data with other threads.³ In the following example, we create a thread to compute the solution of the differential equation for an electric field line.

³The Open Source Physics User's Guide describes simulation threads in more detail.
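Before turning to that example, the following toy class (ours, not part of the Open Source Physics library) illustrates the bare thread life cycle: a Runnable object is created, a Thread is started on it, and the thread dies when the run method returns.

// Toy sketch (ours): the bare Runnable life cycle.
public class CountingTask implements Runnable {
   public void run() { // executed by the new thread after start is invoked
      for(int i = 0; i<5; i++) {
         System.out.println("step "+i);
         try {
            Thread.sleep(20); // pause; lets other threads run
         } catch(InterruptedException ex) {}
      }
   } // run returns here and the thread dies

   public static void main(String[] args) {
      Thread thread = new Thread(new CountingTask());
      thread.start(); // invokes run in a new thread of execution
   }
}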
It is natural to use threads in this context because the drawing of a field line involves starting the field line, drawing each piece of the field line, and then stopping the calculation when some stopping condition is met. The computation begins when the FieldLine object is created and ends when the stopping condition is satisfied.

A thread executes statements within an object, such as FieldLine, that implements the Runnable interface. This interface consists of a single method, the run method, and the thread executes the code within this method. The run method is not invoked directly but is invoked automatically by the thread after the thread is started. When the run method exits, the thread that invoked the run method stops executing and is said to die. After a thread dies, it cannot be restarted. Another thread must be created if we wish to invoke the run method a second time.

We build a FieldLine class that runs in its own thread and that adds the necessary drawing and differential equation capabilities using the Drawable and ODE interfaces, respectively. This class is shown in Listing 10.3.

Listing 10.3: The FieldLine class computes an electric field line using a Thread.

package org.opensourcephysics.sip.ch10;
import java.awt.Graphics;
import java.util.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class FieldLine implements Drawable, ODE, Runnable {
   DrawingFrame frame;
   double[] state = new double[2]; // x and y position along the field line
   ODESolver odeSolver = new RK45MultiStep(this);
   ArrayList chargeList; // list of charged particles
   Trail trail;
   double stepSize;
   volatile boolean done = false;

   public FieldLine(DrawingFrame frame, double x0, double y0, double stepSize) {
      this.stepSize = stepSize;
      this.frame = frame;
      odeSolver.setStepSize(stepSize);
      state[0] = x0;
      state[1] = y0;
      chargeList = frame.getDrawables(Charge.class);
      trail = new Trail();
      trail.addPoint(x0, y0);
      Thread thread = new Thread(this);
      thread.start();
   }

   public double[] getState() {
      return state;
   }

   public void getRate(double[] state, double[] rate) {
      double ex = 0;
      double ey = 0;
      for(Iterator it = chargeList.iterator(); it.hasNext(); ) {
         Charge charge = (Charge) it.next();
         double dx = (charge.getX()-state[0]);
         double dy = (charge.getY()-state[1]);
         double r2 = dx*dx+dy*dy;
         double r = Math.sqrt(r2);
         if((r<2*stepSize)||(r>100)) { // done if too close or too far
            done = true;
         }
         ex += (r==0) ? 0 : charge.q*dx/r2/r;
         ey += (r==0) ? 0 : charge.q*dy/r2/r;
      }
      double mag = Math.sqrt(ex*ex+ey*ey);
      rate[0] = (mag==0) ? 0 : ex/mag;
      rate[1] = (mag==0) ? 0 : ey/mag;
   }

   public void run() {
      int counter = 0;
      while((counter<1000)&&!done) {
         odeSolver.step();
         trail.addPoint(state[0], state[1]);
         if(counter%50==0) { // repaint every 50th step
            frame.repaint();
            try {
               Thread.sleep(20); // give the event queue a chance
            } catch(InterruptedException ex) {}
         }
         counter++;
         Thread.yield();
      }
      frame.repaint();
   }

   public void draw(DrawingPanel panel, Graphics g) {
      trail.draw(panel, g);
   }
}
The FieldLine constructor saves a reference to the list of charges to calculate the electric field using (10.1). The loop in the run method solves the differential equation and stores the solution in a drawable trail. The loop is exited when the field line is close to a charge or when the magnitude of the field becomes too small. Because there are situations where the field line will never stop, this loop is executed no more than 1000 times.

The FieldLineApp program instantiates a field line when the user double clicks within the panel. Adding a charge or moving a charge removes all field lines from the panel. Study how the handleMouseAction method allows the user to drag charges and to initiate the drawing of field lines. You are asked to modify this program in Problem 10.2.

Listing 10.4: The FieldLineApp program computes an electric field line when the user clicks within the panel.

package org.opensourcephysics.sip.ch10;
import java.awt.event.MouseEvent;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.DisplayFrame;

public class FieldLineApp extends AbstractCalculation implements InteractiveMouseHandler {
   DisplayFrame frame = new DisplayFrame("x", "y", "Field lines");

   public FieldLineApp() {
      frame.setInteractiveMouseHandler(this);
      frame.setPreferredMinMax(-10, 10, -10, 10);
   }

   public void calculate() {
      frame.removeObjectsOfClass(FieldLine.class); // remove old field lines
      double x = control.getDouble("x");
      double y = control.getDouble("y");
      double q = control.getDouble("q");
      Charge charge = new Charge(x, y, q);
      frame.addDrawable(charge);
   }

   public void reset() {
      control.println("Calculate creates a new charge and clears the field lines.");
      control.println("You can drag charges.");
      control.println("Double click in display to compute a field line.");
      frame.clearDrawables(); // remove charges and field lines
      control.setValue("x", 0);
      control.setValue("y", 0);
      control.setValue("q", 1);
   }

   public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
      panel.handleMouseAction(panel, evt); // panel handles dragging
      switch(panel.getMouseAction()) {
      case InteractivePanel.MOUSE_DRAGGED :
         if(panel.getInteractive()==null) {
            return;
         }
         frame.removeObjectsOfClass(FieldLine.class); // field is invalid
         frame.repaint(); // repaint to keep the screen up to date
         break;
      case InteractivePanel.MOUSE_CLICKED :
         if(evt.getClickCount()>1) { // check for double click
            double x = panel.getMouseX(), y = panel.getMouseY();
            FieldLine fieldLine = new FieldLine(frame, x, y, +0.1);
            panel.addDrawable(fieldLine);
            fieldLine = new FieldLine(frame, x, y, -0.1);
            panel.addDrawable(fieldLine);
         }
         break;
      }
   }

   public static void main(String[] args) {
      CalculationControl.createApp(new FieldLineApp());
   }
}

Problem 10.2. Verification of field line program

(a) Draw field lines for a few simple sets of one, two, and three charges. Choose sets of charges for which all have the same sign and sets for which they are different. Verify that the field lines never connect charges of the same sign. Why do field lines never cross? Are the units of charge and distance relevant?

(b) Compare FieldLineApp and ElectricFieldApp.
Which representation conveys more information? Consider how each program provides (or does not provide) information about the electric field magnitude and direction. Discuss some of the difficulties with making an accurate field line diagram.

(c) FieldLine uses a constant value for Δs. Modify the algorithm so that the calculation continues when a field line moves off the screen, but speed up the algorithm by increasing the value of Δs.

(d) Removing a field line from the drawing panel in the reset method does not stop the thread. Improve the performance of the program by modifying FieldLineApp so that a field line's done variable is set to true when the field line is removed from the drawing panel.

Problem 10.3. Electric field lines from point charges

(a) Modify FieldLineApp so that a charge starts ten field lines per unit of charge whenever a new charge is added to the panel or when a charge is moved. Start these field lines close to each charge in such a way that they propagate away from the charge. Should you start these field lines on both positive and negative charges? Explain your answer.

(b) Draw the field lines for an electric dipole.

(c) Draw the field lines for the electric quadrupole with q(1) = 1, x(1) = 1, y(1) = 1, q(2) = −1, x(2) = −1, y(2) = 1, q(3) = 1, x(3) = −1, y(3) = −1, q(4) = −1, x(4) = 1, and y(4) = −1.

(d) A continuous charge distribution can be approximated by a large number of closely spaced point charges. Draw the electric field lines due to a row of ten equally spaced unit charges located between −2.5 and +2.5 on the x-axis. How does the electric field distribution compare to the distribution due to a single point charge?

(e) Repeat part (d) with two rows of equally spaced positive charges on the lines y = 0 and y = 1, respectively. Then consider one row of positive charges and one row of negative charges.

Problem 10.4. Field lines due to an infinite line of charge

(a) The FieldLineApp program plots field lines in two dimensions. Sometimes this restriction can lead to spurious results (see Freeman). Consider four identical charges placed at the corners of a square. Use the program to plot the field lines. What, if anything, is wrong with the results? What should happen to the field lines near the center of the square?

(b) The two-dimensional analog of a point charge is an infinite line (thin cylinder) of charge perpendicular to the plane. The electric field due to an infinite line of charge is proportional to the linear charge density and inversely proportional to the distance (instead of the distance squared) from the line of charge to a point in the plane. Modify the FieldLine class to compute the field lines from line charges with E(r) = 1/r. Use your modified class to draw the field lines due to four identical line charges located at the corners of a square and compare the field lines with your results in part (a).

(c) Use your modified program from part (b) to draw the field lines for the two-dimensional analogs of the distributions considered in Problem 10.3. Compare the results for two and three dimensions and discuss any qualitative differences.

(d) Can your program be used to demonstrate Gauss's law using point charges? What about line charges?

10.4 Electric Potential

It often is easier to analyze the behavior of a system using energy rather than force concepts. We define the electric potential V(r) by the relation

$$V(\mathbf{r}_2) - V(\mathbf{r}_1) = -\int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} \tag{10.6}$$

or

$$\mathbf{E}(\mathbf{r}) = -\nabla V(\mathbf{r}). \tag{10.7}$$
10.4 Electric Potential

It often is easier to analyze the behavior of a system using energy rather than force concepts. We define the electric potential V(r) by the relation

V(\mathbf r_2) - V(\mathbf r_1) = -\int_{\mathbf r_1}^{\mathbf r_2} \mathbf E \cdot d\mathbf r    (10.6)

or

\mathbf E(\mathbf r) = -\nabla V(\mathbf r).    (10.7)

Only differences in the potential between two points have physical significance. The gradient operator ∇ is given in Cartesian coordinates by

\nabla = \frac{\partial}{\partial x}\,\hat{\mathbf x} + \frac{\partial}{\partial y}\,\hat{\mathbf y} + \frac{\partial}{\partial z}\,\hat{\mathbf z}    (10.8)

where the vectors x̂, ŷ, and ẑ are unit vectors along the x-, y-, and z-axes, respectively. If V depends only on the magnitude of r, then (10.7) becomes E(r) = −dV(r)/dr. Recall that V(r) for a point charge q relative to a zero potential at infinity is given by

V(r) = \frac{q}{r}    (Gaussian units).    (10.9)

A surface on which the electric potential has the same value everywhere is called an equipotential surface (a curve in two dimensions). Because E points in the direction in which the electric potential decreases most rapidly, the electric field lines are orthogonal to the equipotential surfaces at every point.

The Open Source Physics frames package contains the Scalar2DFrame class, which provides graphical representations of scalar fields (see Appendix 9B). Problem 10.5 uses a scalar field plot to show the electric potential. The following code fragment shows how to calculate the electric potential at the grid points.

List chargeList = frame.getDrawables(Charge.class);
Iterator it = chargeList.iterator();
while(it.hasNext()) {
  Charge charge = (Charge) it.next();
  double xs = charge.getX(), ys = charge.getY();
  for(int ix = 0; ix<n; ix++) {
    double x = frame.indexToX(ix);
    double dx = (xs-x); // charge-gridpoint separation
    for(int iy = 0; iy<n; iy++) {
      double y = frame.indexToY(iy);
      double dy = (ys-y); // charge-gridpoint separation
      double r2 = dx*dx+dy*dy;
      double r = Math.sqrt(r2);
      if(r>0) {
        eField[ix][iy] += charge.q/r;
      }
    }
  }
}
frame.setAll(eField);

Problem 10.5. Equipotential contours

(a) Write a program based on ElectricFieldApp that draws equipotential lines for the charge distributions considered in Problem 10.3.

(b) Explain why equipotential surfaces (lines in two dimensions) never cross.

We can use the orthogonality between the electric field lines and the equipotential lines to modify FieldLineApp so that it draws the latter. Because the components of a line segment ∆s parallel to the electric field line are given by ∆x = ∆s(Ex/E) and ∆y = ∆s(Ey/E), the components of a line segment perpendicular to E, and hence parallel to the equipotential line, are given by ∆x = −∆s(Ey/E) and ∆y = ∆s(Ex/E). It does not matter whether the minus sign is assigned to the x or the y component; the only difference is the direction in which the equipotential lines are drawn.

Problem 10.6. Equipotential lines

(a) Write a program based on FieldLineApp and FieldLine that draws some of the equipotential lines for the charge distributions considered in Problem 10.3. Use a mouse click to determine the starting position of an equipotential line. The equipotential calculation should stop when the line returns close to its starting point or after an unreasonably large number of steps. The program should also kill the thread when the user moves a charge, hits the Reset button, or when the application terminates.

(b) What would a higher density of equipotential lines mean if we drew the lines such that each adjacent line differed from its neighbor by a fixed potential difference?

(c) Explain why equipotential surfaces never cross.
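For part (a) of Problem 10.6, the perpendicular step described above takes only a few lines of code. The following sketch uses our own names: evalField is a hypothetical helper that returns {Ex, Ey} at (x, y) by summing the point charge fields, and trail is an Open Source Physics Trail:

// One step along an equipotential: move perpendicular to E.
double[] e = evalField(x, y);                      // assumed helper: returns {Ex, Ey}
double magnitude = Math.sqrt(e[0]*e[0]+e[1]*e[1]);
if(magnitude>0) {
  x += -ds*e[1]/magnitude; // dx = -ds Ey/E
  y += ds*e[0]/magnitude;  // dy = +ds Ex/E
  trail.addPoint(x, y);    // accumulate the curve for drawing
}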
Problem 10.7. The electric potential due to a finite sheet of charge

Consider a uniformly charged nonconducting plate of total charge Q and linear dimension L centered at (0, 0, 0) in the x-y plane. In the limit L → ∞ with the charge density σ = Q/L² held constant, we know that the electric field is normal to the sheet and that its magnitude is 2πσ (Gaussian units). What is the electric field due to a finite sheet of charge? A simple method is to divide the plate into a grid of p square regions on a side such that each region is sufficiently small to be approximated by a point charge of magnitude q = Q/p². Because the potential is a scalar, it is easier to compute the total potential rather than the total electric field of the N = p² point charges. Use the relation (10.9) for the potential of a point charge and write a program to compute V(z), and hence Ez = −∂V(z)/∂z, for points along the z-axis perpendicular to the sheet. Take L = 1, Q = 1, and p = 10 for your initial calculations. Increase p until your results for V(z) do not change significantly. Plot V(z) and Ez as functions of z and compare their z-dependence to their infinite sheet counterparts.

∗Problem 10.8. Electrostatic shielding

We know that the (static) electric field is zero inside a conductor, that all excess charge resides on the surface of the conductor, and that the surface charge density is greatest at the points of greatest curvature. Although these properties are plausible, it is instructive to do a simulation to see how they follow from Coulomb's law. For simplicity, consider the conductor to be two-dimensional, so that the potential energy is proportional to ln r rather than 1/r (see Problem 10.4). It also is convenient to choose the surface of the conductor to be an ellipse.

(a) If we are interested only in the final distribution of the charges and not in the dynamics of the system, we can use a Monte Carlo method. Our goal is to find the minimum energy configuration beginning with N charges randomly placed within a conducting ellipse. One method is to choose a charge i at random and make a trial change in its position. The trial position should be no more than δ from the old position and still within the ellipse. Choose δ ≈ b/10, where b is the semiminor axis of the ellipse. Compute the change in the total potential energy, given (in arbitrary units) by

\Delta U = -\sum_{j} \bigl[\ln r_{ij}^{\,\rm new} - \ln r_{ij}^{\,\rm old}\bigr].    (10.10)

The sum is over all charges in the system not including i. If ∆U > 0, reject the trial move; otherwise, accept it. Repeat this procedure many times until very few trial moves are accepted. Write a program to implement this Monte Carlo algorithm, as sketched after this problem. Run the simulation for N ≥ 20 charges inside a circle and then repeat the simulation for an ellipse. How are the charges distributed in the (approximately) minimum energy configuration? Which parts of the ellipse have a higher charge density?

(b) Repeat part (a) for a two-dimensional conductor, but assume that the potential energy U ∼ 1/r. Do the charges move to the surface?

(c) Is it sufficient that the interaction be repulsive for the results of parts (a) and (b) to hold?

(d) Repeat part (a) with the added condition that there is a fixed positive charge of magnitude N/2 located outside the ellipse. How does this fixed charge affect the charge distribution? Are the excess free charges still at the surface? Try different positions for the fixed charge.

(e) Repeat parts (a) and (b) for N = 50 charges located within an ellipsoid in three dimensions.
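A minimal sketch of one trial move for part (a) follows; the array names x[] and y[], the axes a and b, and the method name trialMove are our own, not the text's:

// One Monte Carlo trial move for Problem 10.8(a).
// x[], y[] hold the charge positions; a and b are the semimajor and
// semiminor axes of the ellipse; delta is the maximum displacement.
void trialMove(double[] x, double[] y, double a, double b, double delta) {
  int i = (int)(Math.random()*x.length);          // choose a charge at random
  double xTrial = x[i]+delta*(2*Math.random()-1); // trial displacement up to
  double yTrial = y[i]+delta*(2*Math.random()-1); // delta in each direction
  if((xTrial*xTrial)/(a*a)+(yTrial*yTrial)/(b*b)>1) {
    return;                                       // trial position outside the ellipse
  }
  double dU = 0;                                  // energy change from (10.10)
  for(int j = 0; j<x.length; j++) {
    if(j!=i) {
      double rOld = Math.hypot(x[i]-x[j], y[i]-y[j]);
      double rNew = Math.hypot(xTrial-x[j], yTrial-y[j]);
      dU -= Math.log(rNew)-Math.log(rOld);
    }
  }
  if(dU<=0) {                                     // accept only downhill moves
    x[i] = xTrial;
    y[i] = yTrial;
  }
}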
10.5 Numerical Solutions of Boundary Value Problems

In Section 10.1 we found the electric fields and potentials due to a fixed distribution of charges. Suppose that we do not know the positions of the charges and instead know only the potential on a set of boundaries surrounding a charge-free region. This information is sufficient to determine the potential V(r) at any point within the charge-free region. The direct method of solving for V(x, y, z) is based on Laplace's equation, which can be expressed in Cartesian coordinates as

\nabla^2 V(x,y,z) \equiv \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0.    (10.11)

The problem is to find the function V(x, y, z) that satisfies (10.11) and the specified boundary conditions. This type of problem is an example of a boundary value problem. Because analytic methods for regions of arbitrary shape do not exist, the only general approach is to use numerical methods. Laplace's equation is not a new law of physics; it can be derived directly from (10.7) and the relation ∇ · E = 0, or indirectly from Coulomb's law, in regions of space where there is no charge.

For simplicity, we consider only two-dimensional boundary value problems for V(x, y). We use a finite difference method and divide space into a discrete grid of sites located at the coordinates (x, y). In Problem 10.9(b) we show that in the absence of a charge at (x, y), the discrete form of Laplace's equation satisfies the relation

V(x,y) \approx \frac{1}{4}\bigl[V(x+\Delta x,y) + V(x-\Delta x,y) + V(x,y+\Delta y) + V(x,y-\Delta y)\bigr]    (two dimensions)    (10.12)

where V(x, y) is the value of the potential at the site (x, y). Equation (10.12) says that V(x, y) is the average of the potential at its four nearest neighbor sites. This remarkable property of V(x, y) can be derived by approximating the partial derivatives in (10.11) by finite differences (see Problem 10.9(b)). In Problem 10.9(a) we verify (10.12) by calculating the potential due to a point charge at a point in space we select and at its four nearest neighbors. As the form of (10.12) implies, the average of the potential at the four neighboring sites should equal the potential at the center site. We assume the form (10.9) for the potential V(r) due to a point charge, a form that satisfies Laplace's equation for r ≠ 0.

Problem 10.9. Verification of the difference equation for the potential

(a) Modify PotentialFieldApp to compare the computed potential at a point to the average of the potential at its four nearest neighbor sites. Choose reasonable values for the spacings ∆x and ∆y, and consider a point that is not too close to the source charge. Do similar measurements for other points. Does the relative agreement with (10.12) depend on the distance of the point from the source charge? Choose smaller values of ∆x and ∆y and determine whether your results agree better with (10.12). Does it matter whether ∆x and ∆y have the same value?

(b) Derive the finite difference equation (10.12) for V(x, y) using the second-order Taylor expansions:

V(x+\Delta x, y) = V(x,y) + \Delta x\,\frac{\partial V(x,y)}{\partial x} + \frac{1}{2}(\Delta x)^2\,\frac{\partial^2 V(x,y)}{\partial x^2} + \cdots    (10.13)

V(x, y+\Delta y) = V(x,y) + \Delta y\,\frac{\partial V(x,y)}{\partial y} + \frac{1}{2}(\Delta y)^2\,\frac{\partial^2 V(x,y)}{\partial y^2} + \cdots.    (10.14)

The effect of including higher derivatives is discussed by MacDonald (see references).
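The derivation in part (b) amounts to adding the expansions for ±∆x and ±∆y so that the first derivative terms cancel; as a sketch:

V(x+\Delta x,y) + V(x-\Delta x,y) \approx 2V(x,y) + (\Delta x)^2\,\frac{\partial^2 V}{\partial x^2}

V(x,y+\Delta y) + V(x,y-\Delta y) \approx 2V(x,y) + (\Delta y)^2\,\frac{\partial^2 V}{\partial y^2}.

Adding these two relations with ∆x = ∆y and using Laplace's equation ∂²V/∂x² + ∂²V/∂y² = 0 to eliminate the second derivative terms leaves 4V(x, y) on one side and the sum of the four neighboring potentials on the other, which is (10.12).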
Now that we have found that (10.12), a finite difference form of Laplace's equation, is consistent with Coulomb's law, we adopt (10.12) as the basis for computing the potential for systems for which we cannot calculate the potential directly. In particular, we consider problems where the potential is specified on a closed surface that divides space into interior and exterior regions in which the potential is determined independently. For simplicity, we consider only two-dimensional geometries. The approach, known as the relaxation method, is based on the following algorithm:

1. Divide the region of interest into a rectangular grid of sites spanning the region. The region is enclosed by a surface (a curve in two dimensions) with specified values of the potential along it.

2. Assign to each boundary site the potential of the boundary point nearest the site.

3. Assign all interior sites an arbitrary potential (preferably a reasonable guess).

4. Compute new values of the potential V for each interior site. Each new value is obtained by averaging the previous values of the potential at the four nearest neighbor sites.

5. Repeat step (4) using the values of V obtained in the previous iteration. This iterative process is continued until the potential at each interior site is computed to the desired accuracy.

The program shown in Listing 10.5 implements this algorithm using a grid of voltages and a boolean grid to flag the presence of a conductor.

Listing 10.5: The LaplaceApp program solves Laplace's equation using the relaxation method.

package org.opensourcephysics.sip.ch10;
import java.awt.event.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.frames.*;

public class LaplaceApp extends AbstractSimulation implements InteractiveMouseHandler {
  Scalar2DFrame frame = new Scalar2DFrame("x", "y", "Electric potential");
  boolean[][] isConductor;
  double[][] potential; // electric potential
  double maximumError;
  int gridSize;         // number of sites on a side of the grid

  public LaplaceApp() {
    frame.setInteractiveMouseHandler(this);
  }

  public void initialize() {
    maximumError = control.getDouble("maximum error");
    gridSize = control.getInt("size");
    initArrays();
    frame.setVisible(true);
    frame.showDataTable(true); // show the data table
  }

  public void initArrays() {
    isConductor = new boolean[gridSize][gridSize];
    potential = new double[gridSize][gridSize];
    frame.setPaletteType(ColorMapper.DUALSHADE);
    // isConductor array is false by default
    // voltage in potential array is 0 by default

Describe the field lines for t > 0.5. Does the particle accelerate at any time? Is there any radiation?

Problem 10.22. Frequency dependence of an oscillating charge

(a) The radiated power at any point in space is proportional to E². Plot |E| versus time at a fixed observation point (for example, X = 10, Y = Z = 0) and calculate the frequency dependence of the amplitude of |E| due to a charge oscillating at the frequency ω. It is shown in standard textbooks that the power associated with radiation from an oscillating dipole is proportional to ω⁴. How does the ω-dependence that you measure compare to that for dipole radiation? Repeat for a much larger value of R and explain any differences.
(b) Repeat part (a) for a charge moving in a circle. Are there any qualitative differences?

10.8 *Maxwell's Equations

In Section 10.7 we found that accelerating charges produce electric and magnetic fields that depend on position and time. We now investigate the direct relation between changes in E and B given by the differential form of Maxwell's equations:

\frac{\partial \mathbf B}{\partial t} = -c\,\nabla \times \mathbf E    (10.46)

\frac{\partial \mathbf E}{\partial t} = c\,\nabla \times \mathbf B - 4\pi \mathbf j    (10.47)

where j is the electric current density. We can regard (10.46) and (10.47) as the basis of electrodynamics. In addition to (10.46) and (10.47), we need the relation between j and the charge density ρ that expresses the conservation of charge:

\frac{\partial \rho}{\partial t} = -\nabla \cdot \mathbf j.    (10.48)

A complete description of electrodynamics requires (10.46), (10.47), and (10.48), together with the initial values of all currents and fields.

For completeness, we obtain the Maxwell's equations that involve ∇ · B and ∇ · E by taking the divergence of (10.46) and (10.47), substituting (10.48) for ∇ · j, and then integrating over time. If the initial fields are zero, we obtain (using the relation ∇ · (∇ × a) = 0 for any vector a)

\nabla \cdot \mathbf E = 4\pi\rho    (10.49)

\nabla \cdot \mathbf B = 0.    (10.50)

If we introduce the electric and magnetic potentials, it is possible to convert the first-order equations (10.46) and (10.47) into second-order differential equations. However, the familiar first-order equations are better suited for numerical analysis.

To solve (10.46) and (10.47) numerically, we need to interpret the curl and divergence of a vector. As its name implies, the curl of a vector measures how much the vector twists around a point. A coordinate-free definition of the curl of an arbitrary vector W is

(\nabla \times \mathbf W)\cdot \hat{\mathbf S} = \lim_{S\to 0}\frac{1}{S}\oint_C \mathbf W \cdot d\mathbf l    (10.51)

where S is the area of any surface bordered by the closed curve C, and Ŝ is a unit vector normal to the surface S. Equation (10.51) gives the component of ∇ × W in the direction of Ŝ and suggests a way of computing the curl numerically.

We divide space into cubes of linear dimension ∆l. The rectangular components of W can be defined either on the edges or on the faces of the cubes. We compute the curl using both definitions. We first consider a vector B that is defined on the edges of the cubes, so that the curl of B is defined on the faces. (We use the notation B because we will find that it is convenient to define the magnetic field in this way.) Associated with each cube is one edge vector and one face vector. We label a cube by the coordinates of its lower left front corner; the three components of B associated with this cube are shown in Figure 10.6a. The other edges of the cube are associated with B vectors defined at neighboring cubes. The discrete version of (10.51) for the component of ∇ × B defined on the front face of the cube (i, j, k) is

(\nabla \times \mathbf B)\cdot \hat{\mathbf S} = \frac{1}{(\Delta l)^2}\sum_{i=1}^{4} B_i\,\Delta l_i    (10.52)

where S = (∆l)², and the Bi and ∆li are shown in Figures 10.6b and 10.6c, respectively. Note that two of the Bi are associated with neighboring cubes.

The components of a vector can also be defined on the faces of the cubes. We call this vector E because it will be convenient to define the electric field in this way. In Figure 10.7(a) we show the components of E associated with the cube (i, j, k). Because E is normal to a cube face, the components of ∇ × E lie on the edges. The form of the discrete version of ∇ × E is similar to (10.52) with Bi replaced by Ei, where the Ei and ∆li are shown in Figures 10.7(b) and 10.7(c), respectively.
The z-component of ∇ × E lies along the left edge of the front face.

A coordinate-free definition of the divergence of the vector field W is

\nabla \cdot \mathbf W = \lim_{V\to 0}\frac{1}{V}\oint_S \mathbf W \cdot d\mathbf S    (10.53)

where V is the volume enclosed by the closed surface S. The divergence measures the average flow of the vector through a closed surface. An example of the discrete version of (10.53) is given in (10.54).

Figure 10.6: Calculation of the curl of B defined on the edges of a cube. (a) The edge vector B associated with cube (i, j, k). (b) The components Bi along the edges of the front face of the cube: B1 = Bx(i, j, k), B2 = Bz(i+1, j, k), B3 = −Bx(i, j, k+1), and B4 = −Bz(i, j, k). (c) The vector components ∆li on the edges of the front face. (The y-component of ∇ × B defined on the face points in the negative y direction.)

We now discuss where to define the quantities ρ, j, E, and B on the grid. It is natural to define the charge density ρ at the center of a cube. From the continuity equation (10.48), we see that this definition leads us to define j at the faces of the cube. Hence, each face of a cube has a number associated with it corresponding to the current density flowing parallel to the outward normal to that face. Given the definition of j on the grid, we see from (10.47) that the electric field E and j should be defined at the same places, and hence we define the electric field on the faces of the cubes. Because E is defined on the faces, it is natural to define the magnetic field B on the edges of the cubes. Our definitions of the vectors j, E, and B on the grid are now complete.

We label the faces of cube c by the symbol fc. If we use the simplest finite difference method with a discrete time step ∆t and discrete spatial interval ∆x = ∆y = ∆z ≡ ∆l, we can write the continuity equation as

\rho\bigl(c,\, t+\tfrac{1}{2}\Delta t\bigr) - \rho\bigl(c,\, t-\tfrac{1}{2}\Delta t\bigr) = -\frac{\Delta t}{\Delta l}\sum_{f_c=1}^{6} j(f_c, t).    (10.54)

The factor of 1/∆l comes from the area of a face, (∆l)², used in the surface integral in (10.53), divided by the volume (∆l)³ of a cube. In the same spirit, the discretization of (10.47) can be written as

E\bigl(f,\, t+\tfrac{1}{2}\Delta t\bigr) - E\bigl(f,\, t-\tfrac{1}{2}\Delta t\bigr) = \Delta t\,\bigl[(\nabla \times B)(f,t) - 4\pi j(f, t)\bigr].    (10.55)

Figure 10.7: Calculation of the curl of the vector E defined on the faces of a cube. (a) The face vector E associated with the cube (i, j, k). The components associated with the left, front, and bottom faces are Ex(i, j, k), Ey(i, j, k), and Ez(i, j, k), respectively. (b) The components Ei on the faces that share the front left edge of the cube (i, j, k): E1 = Ex(i, j−1, k), E2 = Ey(i, j, k), E3 = −Ex(i, j, k), and E4 = −Ey(i−1, j, k). The cubes associated with E1 and E4 are also shown. (c) The vector components ∆li on the faces that share the left front edge of the cube. (The z-component of the curl of E defined on the left edge points in the positive z direction.)

Note that E in (10.55) and ρ in (10.54) are defined at different times than j. As usual, we choose units such that c = 1. We next need to define a square around which we can discretize the curl. If E is defined on the faces, it is natural to use the square that borders each face. As we have discussed, this choice implies that we should define the magnetic field on the edges of the cubes.
We write (10.55) explicitly as

E\bigl(f,\, t+\tfrac{1}{2}\Delta t\bigr) - E\bigl(f,\, t-\tfrac{1}{2}\Delta t\bigr) = \Delta t\,\Bigl[\frac{1}{\Delta l}\sum_{e_f=1}^{4} B(e_f, t) - 4\pi j(f, t)\Bigr]    (10.56)

where the sum is over ef, the four edges of the face f (see Figure 10.7b). Note that B is defined at the same time as j. In a similar way we can write the discrete form of (10.46) as

B(e,\, t+\Delta t) - B(e,\, t) = -\frac{\Delta t}{\Delta l}\sum_{f_e=1}^{4} E\bigl(f_e,\, t+\tfrac{1}{2}\Delta t\bigr)    (10.57)

where the sum is over fe, the four faces that share the edge e (see Figure 10.7b).

We now have a well-defined algorithm for computing the spatial dependence of the electric and magnetic fields, the charge density, and the current density as functions of time. This algorithm was developed by Yee, an electrical engineer, in 1966, and independently by Visscher, a physicist, in 1988, who also showed that all of the integral relations and other theorems satisfied by the continuum fields are also satisfied by the discrete fields.

Usually, the most difficult part of this algorithm is specifying the initial conditions, because we cannot simply place a charge somewhere. The reason is that the initial fields appropriate for this charge would not be present. Indeed, our rules for updating the fields and the charge densities reflect the fact that the electric and magnetic fields do not appear instantaneously at all positions in space when a charge appears, but instead evolve from the initial appearance of a charge. Of course, charges do not appear out of nowhere, but appear by dissociating from neutral objects. Conceptually, the simplest initial condition corresponds to two charges of opposite sign moving apart from each other. This condition corresponds to an initial current on one face. From this current a charge density, and thus an electric field, appears using (10.54) and (10.56), respectively, and a magnetic field appears using (10.57).

Because we cannot compute the fields on an infinite lattice, we need to specify the boundary conditions. The easiest method is to use fixed boundary conditions such that the fields vanish at the edges of the lattice. If the lattice is sufficiently large, fixed boundary conditions are a reasonable approximation. However, fixed boundary conditions usually lead to nonphysical reflections off the edges, and a variety of other approaches have been used, including boundary conditions equivalent to a conducting medium that gradually absorbs the fields. In some cases physically motivated boundary conditions can be employed. For example, in simulations of microwave cavity resonators (see Problem 10.24), the appropriate boundary conditions are that the tangential component of E and the normal component of B vanish at the boundary.

As we have noted, E and ρ are defined at different times than B and j. This half-step approach leads to well-behaved equations that are stable over a range of parameters. An analysis of the stability requirement for the Yee–Visscher algorithm shows that the time step ∆t must be small enough compared to the spatial grid interval ∆l that

c\,\Delta t \le \frac{\Delta l}{\sqrt{3}}    (stability requirement).    (10.58)
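To make the update structure of (10.56) and (10.57) concrete, here is a minimal sketch of a two-dimensional (transverse magnetic) slice of the leapfrog step with c = 1. The array names Ez, Bx, By, and Jz are our own and not those of the Maxwell class listed below; boundary cells are simply left untouched:

// One leapfrog step on a square grid: E is advanced using the discrete
// curl of B minus the current, as in (10.56); B is then advanced using
// the discrete curl of the updated E, as in (10.57).
void step(double[][] Ez, double[][] Bx, double[][] By, double[][] Jz,
          double dt, double dl) {
  int n = Ez.length;
  for(int i = 1; i<n-1; i++) {
    for(int j = 1; j<n-1; j++) {
      Ez[i][j] += dt*((By[i][j]-By[i-1][j]-Bx[i][j]+Bx[i][j-1])/dl
                      -4*Math.PI*Jz[i][j]);
    }
  }
  for(int i = 0; i<n-1; i++) {
    for(int j = 0; j<n-1; j++) {
      Bx[i][j] -= dt*(Ez[i][j+1]-Ez[i][j])/dl; // dBx/dt = -(curl E)_x
      By[i][j] += dt*(Ez[i+1][j]-Ez[i][j])/dl; // dBy/dt = -(curl E)_y
    }
  }
}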
The Maxwell class implements the Yee–Visscher finite difference algorithm for solving Maxwell's equations. The field and current data are stored in the multidimensional arrays E, B, and J. The first index determines the vector component; the last three indices represent the three spatial coordinates. The current method models a positive current flowing for one time unit. This current flow produces both electric and magnetic fields. Because charge is conserved, the current flow produces an electrostatic dipole: negative charge remains at the source, and positive charge is deposited at the destination. Note that the doStep method invokes a damping method that reduces the fields at points near the boundaries, thereby absorbing the emitted radiation and reducing the reflected electromagnetic waves. Your understanding of the Yee–Visscher algorithm for finding solutions of Maxwell's equations will be enhanced by carefully reading the MaxwellApp program and the Maxwell class.

Listing 10.8: The Maxwell class implements the Yee–Visscher finite difference approximation to Maxwell's equations.

package org.opensourcephysics.sip.ch10;

public class Maxwell {
  // static variables determine units and time scale
  static final double pi4 = 4*Math.PI;
  static final double dt = 0.03;
  static final double dl = 0.1;
  static final double escale = dl/(4*Math.PI*dt);
  static final double bscale = escale*dl/dt;
  static final double jscale = 1;
  double dampingCoef = 0.1; // damping coefficient near boundaries
  int size;
  double t; // time
  double[][][][] E, B, J;

  public Maxwell(int size) {
    this.size = size;
    // 3D arrays for the electric field, magnetic field, and current;
    // the last three indices indicate the location, and the first index
    // indicates the x, y, or z component
    E = new double[3][size][size][size];
    B = new double[3][size][size][size];
    J = new double[3][size][size][size];
  }

  public void doStep() {
    current(t); // update the current
    computeE(); // step electric field
    computeB(); // step magnetic field
    damping();  // damp transients
    t += dt;
  }

  void current(double t) {
    final int mid = size/2;
    double delta = 1.0;
    for(int i = -3; i<5; i++) {
      J[0][mid+i][mid][mid] = (t<delta) ? jscale : 0; // current flows for one time unit
    }
  }
}

Function f;
try {
  f = new ParsedFunction(fstring);
} catch(ParserException ex) {
  control.println(ex.getMessage());
  return;
}
plotFrame.clearData();
double[] range = Util.getDomain(f, a, b, 100);
plotFrame.setPreferredMinMax(a-(b-a)/4, b+(b-a)/4, range[0], range[1]);
FunctionDrawer func = new FunctionDrawer(f);
func.color = java.awt.Color.RED;
plotFrame.addDrawable(func);
double x = a;

  if(site>=0&&lattice.getAtIndex(site)==-1) {
    colorCluster(site);                  // color cluster to which site belongs
    clusterNumber = (clusterNumber+1)%7; // cycle through 7 cluster colors
    lattice.repaint();                   // display lattice with colored cluster
  }
}

// Occupies all sites with probability p
public void calculate() {
  L = control.getInt("Lattice size");
  lattice.resizeLattice(L, L); // resize lattice
  // same seed will generate same set of random numbers
  random.setSeed(control.getInt("Random seed"));
  double p = control.getDouble("Site occupation probability");
getDouble ( "Site occupation probability" ) ; / / occupy l a t t i c e s i t e s with p r o b a b i l i t y p for ( int i = 0; i 0) { / / get next s i t e to t e s t and remove i t from l i s t int s i t e = sitesToTest [−−numSitesToTest ] ; for ( int j = 0; j <4; j ++) { / / v i s i t four p o s s i b l e neighbors int neighborSite = getNeighbor ( site , j ) ; / / t e s t i f n e i g h b o r S i t e i s occupied , and not yet added to c l u s t e r i f ( neighborSite >=0&&l a t t i c e . getAtIndex ( neighborSite )==−1) { / / c o l o r n e i g h b o r S i t e according to clusterNumber l a t t i c e . setAtIndex ( neighborSite , clusterNumber ) ; / / add n e i g h b o r S i t e to s i t e s T o T e s t [ ] sitesToTest [ numSitesToTest++] = neighborSite ; } } } } public void reset ( ) { control . setValue ( "Lattice size" , 32); control . setValue ( "Site occupation probability" , 0.5927); control . setValue ( "Random seed" , 100); calculate ( ) ; } public s t a t i c void main ( String args [ ] ) { CalculationControl control = CalculationControl . createApp (new PercolationApp ( ) ) ; } } The percolation threshold pc is defined as the site occupation probability p at which a spanning cluster first appears in an infinite lattice. However, for a finite lattice, there is a nonzero probability of a spanning cluster connecting one side of the lattice to the opposite side for any value of p > 0. For small p, this probability is order pL (see Figure 12.4), and the probability of spanning goes to zero as L becomes large. Hence, for small p and sufficiently large L, only finite clusters exist. For a finite lattice, the definition of spanning is arbitrary. For example, we can define a spanning cluster as one that (i) spans the lattice either horizontally or vertically; (ii) spans the lattice in a fixed direction, for example, vertically; or (iii) spans the lattice both horizontally and CHAPTER 12. PERCOLATION 451 Figure 12.4: An example of a spanning cluster with a probability proportional to pL on a L = 8 lattice. The probability of a spanning cluster with more sites will be proportional to a higher power of p. vertically. These spanning rules are based on open (nonperiodic) boundary conditions, which we will use because the resulting clusters are easier to visualize and determine. The criterion for defining pc(L) for a finite lattice is also somewhat arbitrary. One possibility is to define pc(L) as the mean value of p at which a spanning cluster first appears. Another possibility is to define pc(L) as the value of p for which half of the configurations generated at random span the lattice. These criteria will lead to the same extrapolated value for pc in the limit L → ∞. In Problem 12.1 we will find an estimated value for pc(L) that is accurate to about 10%. A more sophisticated analysis discussed in Project 12.13 allows us to extrapolate our results for pc(L) to L → ∞. In Project 12.17 we will discuss the use of periodic boundary conditions to define the clusters. Problem 12.1. Site percolation on the square lattice (a) Use PercolationApp to generate random site percolation configurations on a square lattice. Estimate pc(L) by finding the mean value of p at which a spanning cluster first occurs. For a given seed, the calculate method assigns a random number to each site and determines the occupancy of each site by comparing the sites’s random number to p. Choose one of the spanning rules and begin with a value of p for which a spanning cluster is unlikely to be present. 
Then systematically increase p until you find a spanning cluster. Then choose a new seed and, hence, a new set of random numbers. Repeat this procedure for at least ten configurations and find the average value of pc(L). (Each configuration corresponds to a different set of random numbers.)

(b) Repeat part (a) for larger values of L. Is pc(L) better defined for larger L; that is, are the values of pc(L) spread over a smaller range? How quickly can you visually determine the existence of a spanning cluster? Describe your visual algorithm for deciding whether a spanning cluster exists.

(c) Choose L ≥ 1024 and generate a configuration of sites at p = pc. For this value of L, you will not be able to distinguish the individual sites. Click on the lattice until you generate some large clusters. Describe their visual appearance. For example, are they compact or ramified?

The value of pc depends on the symmetry of the lattice and on its dimension. In addition to the square lattice, the most common two-dimensional lattice is the triangular lattice. As discussed in Chapter 8, the essential difference between the square and triangular lattices is the number of nearest neighbors.

∗Problem 12.2. Site percolation on the triangular lattice

Modify PercolationApp to simulate random site percolation on a triangular lattice. Assume that a spanning path connects the top and bottom sides of the lattice (see Figure 12.5). Do you expect pc for the triangular lattice to be smaller or larger than the value of pc for the square lattice? Estimate pc(L) for increasing values of L. Are your results for pc consistent with your expectations? As we discuss in the following, the exact value of pc for the triangular lattice is pc = 1/2.

In bond percolation each lattice site is occupied, but only a fraction of the sites have connections or occupied bonds between them and their nearest neighbor sites (see Figure 12.6). Each bond either is occupied with probability p or is not occupied with probability 1 − p. A cluster is a group of sites connected by occupied bonds. The wire mesh described in Section 12.1 is an example of bond percolation if we imagine cutting the bonds between the nodes rather than removing the nodes themselves. An application of bond percolation to the description of gelation is discussed in Problem 12.3.

For bond percolation on the square lattice, the exact value of pc can be obtained by introducing the dual lattice. The nodes of the dual lattice are the centers of the squares between the nodes of the original lattice (see Figure 12.7). The occupied bonds of the dual lattice are those that do not cross an occupied bond of the original lattice. Because every occupied bond on the dual lattice crosses exactly one unoccupied bond of the original lattice, the probability p̃ of an occupied bond on the dual lattice is 1 − p, where p is the probability of an occupied bond on the original lattice. If we assume that the dual lattice percolates if and only if the original lattice does not, and vice versa, then at threshold pc = 1 − pc, or pc = 1/2. This assumption holds for bond percolation on a square lattice because if a cluster in the original lattice spans in both directions, then, because the occupied dual lattice bonds can cross only unoccupied bonds of the original lattice, the dual lattice clusters are blocked from spanning. An example is shown in Figure 12.7.
This argument does not apply to cubic lattices in three dimensions, but it can be used for site percolation on a triangular lattice to yield pc = 1/2.

Figure 12.7: Occupied bonds on a bond percolation lattice are shown by heavy dark lines. The dual lattice consists of the open circles, and the dashed lines are the occupied bonds of the dual lattice. The original lattice contains a cluster that spans both vertically and horizontally, which prevents the dual lattice from having a spanning cluster.

∗Problem 12.3. Bond percolation on a square lattice

Suppose that all the lattice sites of a square lattice are occupied by monomers, each with functionality four; that is, each monomer can form a maximum of four bonds. This model is equivalent to bond percolation on a square lattice. Assume that the presence or absence of a bond between a given pair of monomers is random and is characterized by the probability p. For small p, the system consists only of finite polymers (groups of monomers), and the system is in the sol phase. At a threshold value pc, a single polymer spans the lattice, and we say that for p ≥ pc the system is in the gel phase. How does a bowl of jello, an example of a gel, differ from a bowl of broth? Write a program to simulate bond percolation on a square lattice and determine the bond percolation threshold. Are your results consistent with the exact result pc = 1/2?

We can also consider continuum percolation models. For example, we can place disks at random into a two-dimensional box. Two disks are in the same cluster if they touch or overlap. A typical continuum (off-lattice) percolation configuration is depicted in Figure 12.8. One quantity of interest is φ, the fraction of the area (volume in three dimensions) of the system that is covered by disks. In the limit of an infinite box, it can be shown that

\phi = 1 - e^{-\rho\pi r^2}    (12.1)

where ρ is the number of disks per unit area and r is the radius of a disk (see Xia and Thorpe). Equation (12.1) is not accurate for small boxes, because disks located near the edge of the box then constitute a significant fraction of the total number of disks.

Figure 12.5: Example of a spanning site cluster on an L = 4 triangular lattice. The filled circles represent the occupied sites.

Figure 12.6: Two examples of bond clusters. The occupied bonds are shown as bold lines.

Problem 12.4. Continuum percolation

(a) Suppose that disks of unit diameter are placed at random on the sites of a square lattice with unit lattice spacing. Define φ as the area fraction covered by the disks. Convince yourself that φc = πpc/4.

(b) Modify PercolationApp to simulate continuum percolation. Instead of placing the disks on regular lattice sites, place their centers at random in a square box of area L². The relevant parameter is now the density ρ, the number of disks per unit area, instead of the probability p. We can no longer use the LatticeFrame class; instead, two arrays are needed to store the x and y locations of the disks. When the mouse is clicked on a disk, your program will need to determine which disk is at the location of the mouse and then check all the other disks to see if they overlap or touch the chosen disk. This check is continued recursively for all overlapping disks. It also is useful to have an array that keeps track of the clusterNumber of each disk. Only disks that have not yet been assigned a cluster number need to be checked for overlaps.

(c) Estimate the value of the density ρc at which a spanning cluster first appears. Given this value of ρc, use a Monte Carlo method to estimate the corresponding area fraction φc (see Section 11.2 and the sketch following this problem). Choose points at random in the box and compute the fraction of points that lie within any disk. Explain why φc is larger for continuum percolation than for site percolation. Compare your direct Monte Carlo estimate of φc with the indirect value of φc obtained from (12.1) using the value of ρc. Explain any discrepancy.

(d)∗ Consider the simple model of the cookie problem discussed in Section 12.1. Write a program that places disks at random into a square box and chooses their diameters randomly between 0 and 1. Estimate the value of ρc at which a spanning cluster first appears and compare it to your estimate found in part (c). Is your value of φc larger or smaller than the one found in part (c)?

(e)∗ Another variation of the cookie problem is to place disks of unit diameter at random in a box with the constraint that the disks do not overlap. Continue to add disks until the fraction of successful attempts becomes less than 1%; that is, until one hundred successive attempts at adding a disk fail. Does a spanning cluster exist? If not, increase the diameters of all the disks at a constant rate (in analogy to the baking of the cookies) until a spanning cluster is attained. How does φc for this model compare with the value of φc found in part (d)?
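The Monte Carlo estimate of the area fraction needed in Problem 12.4(c) can be done with a short method. The following sketch uses our own names (x[] and y[] for the disk centers, estimateAreaFraction for the method) and assumes disks of unit diameter:

// Estimate the covered area fraction phi by hit-or-miss sampling:
// choose random points in the box and count those inside any disk.
double estimateAreaFraction(double[] x, double[] y, double boxLength, int trials) {
  java.util.Random random = new java.util.Random();
  int hits = 0;
  for(int t = 0; t<trials; t++) {
    double px = boxLength*random.nextDouble();
    double py = boxLength*random.nextDouble();
    for(int i = 0; i<x.length; i++) {
      double dx = px-x[i], dy = py-y[i];
      if(dx*dx+dy*dy<0.25) { // inside a disk of radius 1/2
        hits++;
        break;               // count each point at most once
      }
    }
  }
  return hits/(double) trials;
}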
A continuum model that is applicable to random porous media is known as the Swiss cheese model. In this model the relevant quantity (the cheese) is the space between the disks. For the Swiss cheese model in two dimensions, the cheese area fraction at the percolation threshold, ψc, is given by ψc = 1 − φc, where φc is the disk area fraction at the percolation threshold of the disks. Does such a relation hold in three dimensions (see Project 12.14)?

Figure 12.8: A model of continuum (off-lattice) percolation realized by placing disks of unit diameter at random into a square box of linear dimension L. If we concentrate on the voids between the disks rather than on the disks themselves, this model of continuum percolation is known as the Swiss cheese model.

So far, we have emphasized the existence of the percolation threshold pc and the appearance of a spanning cluster or path for p ≥ pc. Another quantity that characterizes percolation is P∞(p), the probability that an occupied site belongs to the spanning cluster:

P_\infty(p) = \frac{\text{number of sites in the spanning cluster}}{\text{total number of occupied sites}}.    (12.2)

As an example, P∞(p = 0.59) = 140/154 for the single configuration shown in Figure 12.3b. An accurate estimate of P∞ involves an average over many configurations for a given value of p. For an infinite lattice, P∞(p) = 0 for p < pc and P∞(p) = 1 for p = 1. Between pc and 1, P∞(p) increases monotonically.

More information can be obtained from the cluster size distribution ns(p), defined as

n_s(p) = \frac{\text{average number of clusters of size } s}{\text{total number of lattice sites}}.    (12.3)

For p ≥ pc, the spanning cluster is excluded from ns. (For historical reasons, the size of a cluster refers to the number of sites in the cluster rather than to its spatial extent.) As an example, we see from Figure 12.3a that ns(1) = 20/256, ns(2) = 4/256, ns(3) = 5/256, and ns(7) = 1/256 for p = 0.2, and ns is zero otherwise.
Because N Σs s ns is the total number of occupied sites (N is the total number of lattice sites) and N s ns is the number of occupied sites in clusters of size s, the quantity

w_s = \frac{s\,n_s}{\sum_s s\,n_s}    (12.4)

is the probability that an occupied site chosen at random is part of an s-site cluster. The mean cluster size S is defined as

S(p) = \sum_s s\,w_s = \frac{\sum_s s^2 n_s}{\sum_s s\,n_s}.    (12.5)

The sums in (12.5) are over the finite clusters only. As an example, the weights corresponding to the clusters in Figure 12.3a are ws(1) = 20/50, ws(2) = 8/50, ws(3) = 15/50, and ws(7) = 7/50, and hence S = 130/50.

Problem 12.5. Qualitative behavior of ns(p), S(p), and P∞(p)

(a) Use PercolationApp to visually determine the cluster size distribution ns(p) for a square lattice with L = 16 and p = 0.4, p = pc, and p = 0.8. Take pc = 0.5927. Consider at least ten configurations for each value of p and average ns(p) over the configurations. For each value of p, plot ns as a function of s and describe the observed s-dependence. Does ns decrease more rapidly with increasing s for p = pc or for p ≠ pc? Plot ln ns versus s and versus ln s. Does either of these plots suggest the form of the s-dependence of ns? Is there a qualitative change near pc? You probably will not be able to obtain definitive answers to these questions at this point, but we will discuss a more quantitative approach later. Better results for ns can be found if periodic boundary conditions are used (see Project 12.17).

(b) Use the same configurations considered in part (a) to compute the mean cluster size S as a function of p. Remember that for p > pc, the spanning cluster is excluded.

(c) Similarly, compute P∞(p) for various values of p ≥ pc, plot P∞(p) as a function of p, and discuss its qualitative behavior.

(d) Verify that Σs s ns(p) = p for p < pc and explain this relation. How is this relation modified for p ≥ pc?

It is useful to associate a characteristic linear dimension or connectedness length ξ(p) with the clusters. One way to do so is to define the radius of gyration Rs of a single cluster of s particles as

R_s^2 = \frac{1}{s}\sum_{i=1}^{s}(\mathbf r_i - \overline{\mathbf r}\,)^2    (12.6)

where

\overline{\mathbf r} = \frac{1}{s}\sum_{i=1}^{s}\mathbf r_i    (12.7)

and ri is the position of the ith site in the cluster. The quantity r̄ is the familiar center of mass of the cluster. From (12.6), we see that Rs is the root mean square radius of the cluster measured from its center of mass.

The connectedness length ξ can be defined as an average over the radii of gyration of all the finite clusters. To find the appropriate average for ξ, consider a site in a cluster of s sites. The site is connected to s − 1 other sites, and the mean square distance to these sites is of the order of Rs². The probability that a site belongs to a cluster of size s is ws = s ns. These considerations suggest that a reasonable definition of ξ is

\xi^2 = \frac{\sum_s (s-1)\,w_s\,\overline{R_s^2}}{\sum_s (s-1)\,w_s}    (12.8)

where R̄s² is the average of Rs² over all clusters of s sites. To simplify the expression for ξ, we write s instead of s − 1 and let ws = s ns:

\xi^2 = \frac{\sum_s s^2\,n_s\,\overline{R_s^2}}{\sum_s s^2\,n_s}.    (12.9)

As before, the sums in (12.9) are over nonspanning clusters only.

Problem 12.6. Simple calculation of the connectedness length

To obtain a feel for how to compute the connectedness length ξ, calculate it for the configuration shown in Figure 12.3a for p = 0.2.
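The ingredients of (12.6)–(12.9) are easy to compute once the sites of each cluster are known. A minimal sketch for the radius of gyration of one cluster follows; the arrays xs[] and ys[], holding the coordinates of the cluster's sites, are our own assumption about how the cluster data are stored:

// Radius of gyration squared, (12.6), for a single cluster whose site
// coordinates are stored in xs[] and ys[].
double radiusOfGyrationSquared(double[] xs, double[] ys) {
  int s = xs.length;
  double xbar = 0, ybar = 0;
  for(int i = 0; i<s; i++) { // center of mass, (12.7)
    xbar += xs[i]/s;
    ybar += ys[i]/s;
  }
  double r2 = 0;
  for(int i = 0; i<s; i++) { // mean square distance from the center of mass
    r2 += ((xs[i]-xbar)*(xs[i]-xbar)+(ys[i]-ybar)*(ys[i]-ybar))/s;
  }
  return r2;
}

The connectedness length then follows from (12.9) by accumulating s² ns R̄s² and s² ns over the nonspanning clusters and taking the ratio of the two sums.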
12.3 Finding Clusters

So far we have visually determined the clusters for a given configuration at a particular value of p. We now discuss an algorithm due to Newman and Ziff for finding clusters at many values of p. This algorithm is based on one that is well known in computer science in the context of the union-find problem. In the Newman–Ziff algorithm we begin with an empty lattice and keep track of the clusters as we randomly occupy new lattice sites. As each site is occupied, we determine whether it becomes a new cluster or whether it is a neighbor of an existing cluster (or clusters). Because p = n/L², where n is the number of occupied sites, p increases by 1/L² each time we occupy a new site. The algorithm can be summarized as follows (see the class Clusters in Listing 12.2).

1. Precompute the occupation order. One way to occupy the sites is to choose sites at random until we find an unoccupied site. However, this procedure becomes inefficient when most of the sites are already occupied. Instead, we store the order in which the sites are to be occupied in order[] and generate this order by randomly permuting the integers from 0 to N − 1 in the method setOccupationOrder. For example, order[0] = 2 means that we occupy site 2 first.

2. Add sites in the predetermined order. When a new site is added, we check all its neighbors to determine whether the new site is an isolated cluster (all neighbors empty) or whether it joins one or more existing clusters.

3. Determine the clusters. The clusters are organized in a tree-like structure, with one site of each cluster designated as the root. All sites in a given cluster, other than the root, point to another site in the same cluster, so that the root can be reached by recursively following the pointers. The "pointers"¹ are stored in the parent array. To join two clusters, we add a pointer from the root of the smaller cluster to the root of the larger one.

In the following example we use order = {2, 6, 8, 4, 5, ...} to illustrate the method.

(i) Because order[0] = 2, we first occupy site 2 and set parent[2] = −1. The negative sign distinguishes site 2 as a root. The size of the cluster is stored as −parent[root]. In this case −parent[2] = 1, and because no other sites are occupied, there is no possibility of merging clusters.

(ii) For our example, order[1] = 6, and we initially set parent[6] = −1. We then consider the neighbors of site 6. Sites 5 (left), 7 (right), and 10 (up) are unoccupied (parent[5] = parent[7] = parent[10] = EMPTY), but site 2 (down) is occupied. Hence, we need to merge the two clusters, and we set parent[6] = 2 and parent[2] = −2. That is, the value of parent[6] points to site 2, and the value of parent[2] indicates that site 2 is the root of a cluster of size 2.

(iii) The next sites to be occupied are 8 and 4, as shown in Figure 12.9. These two sites form a cluster of size 2 as before. We have parent[8] = −1 and then parent[4] = 8 and parent[8] = −2. We see that the value of each element of the parent array has three functions: a nonroot occupied site contains the index of the site's parent in the cluster tree; a root site is negative and equals the negative of the number of sites in the cluster; and an unoccupied site has the value EMPTY.

(iv) We next add site 5 and set parent[5] = −1. From Figure 12.9 we see that we have to merge two clusters. We (arbitrarily) check the left neighbor of site 5 first, and hence we first merge the cluster of size 1 at site 5 with the cluster of size 2 rooted at site 8; we set parent[5] = 8 and parent[8] = −3. We next check the right neighbor of site 5 and find that we need to merge two clusters again, with root sites at 8 and 2.
Because the cluster at site 8 is bigger, we set parent[2] = 8 and parent[8] = −5.

¹We use the term "pointer" as it is used by Newman and Ziff, that is, a link to an array index. A true pointer stores a memory address and does not exist in Java.

Figure 12.9: Illustration of the Newman–Ziff algorithm. The order array is given by {2, 6, 8, 4, 5, ...}; the number below a site denotes the order in which that site was occupied. When site 5 is occupied, we have to merge two clusters as explained in the text.

Listing 12.2: Implementation of the Newman–Ziff algorithm for identifying clusters.

package org.opensourcephysics.sip.ch12;

public class Clusters {
  static private final int EMPTY = Integer.MIN_VALUE;
  public int L;                // linear dimension of lattice
  public int N;                // N = L*L
  public int numSitesOccupied; // number of occupied lattice sites
  public int[] numClusters;    // number of clusters of size s, n_s
  // secondClusterMoment stores sum{s^2 n_s}, where the sum is over
  // all clusters (not counting the spanning cluster).
  // The first cluster moment, sum{s n_s}, equals numSitesOccupied.
  // The mean cluster size S is defined as
  // S = secondClusterMoment/numSitesOccupied
  private int secondClusterMoment;
  // spanningClusterSize is the number of sites in a spanning cluster,
  // or 0 if it doesn't exist. Assume at most one spanning cluster.
  private int spanningClusterSize;
  // order[n] gives the index of the nth occupied site; it contains all the
  // site indices, but in random order. For example, order[0] = 3 means
  // we will occupy site 3 first. An alternative to using the order array
  // is to choose sites at random until we find an unoccupied site.
  private int[] order;
  // The parent[] array serves three purposes: it stores the cluster size
  // when the site is a root. Otherwise, it stores the index of
  // the site's parent or is EMPTY. The root is found from an
  // occupied site by recursively following the parent array.
  // The recursion terminates when we encounter a negative value in the
  // parent array, which indicates that we have found the unique cluster root.
  // if (parent[s] >= 0)        parent[s] is the parent site index
  // if (0 > parent[s] > EMPTY) s is a root of size -parent[s]
  // if (parent[s] == EMPTY)    site s is empty (unoccupied)
  private int[] parent;
  // A spanning cluster touches both the left and right boundaries of the
  // lattice. As clusters are merged, we maintain this information at the
  // roots in the following arrays. For example, if the root of a
  // cluster is at site 7 and this cluster touches the left side,
  // then touchesLeft[7] == true.
  private boolean[] touchesLeft, touchesRght;
  public Clusters(int L) {
    this.L = L;
    N = L*L;
    numClusters = new int[N+1];
    order = new int[N];
    parent = new int[N];
    touchesLeft = new boolean[N];
    touchesRght = new boolean[N];
  }

  public void newLattice() {
    setOccupationOrder(); // choose order in which sites are occupied
    // initially all sites are empty, and there are no clusters
    numSitesOccupied = secondClusterMoment = spanningClusterSize = 0;

    if(correctedFirstMoment>0) {
      return correctedSecondMoment/correctedFirstMoment;
    } else {
      return 0;
    }
  }

  // given a site index s, returns the site index representing the root
  // of the cluster to which s belongs
  private int findRoot(int s) {
    if(parent[s]<0) {
      return s; // root site (with size -parent[s])
    } else {
      // first link parent[s] to the cluster's root to improve performance
      // (path compression); then return this value
      parent[s] = findRoot(parent[s]);
    }
    return parent[s];
  }

  // returns the jth neighbor of site s; j can be 0 (left), 1 (right),
  // 2 (down), or 3 (above). If no neighbor exists because of the
  // boundary, the value EMPTY is returned. Change this method for
  // periodic boundary conditions.
  private int getNeighbor(int s, int j) {
    switch(j) {
    case 0 :
      return (s%L==0) ? EMPTY : s-1;   // left
    case 1 :
      return (s%L==L-1) ? EMPTY : s+1; // right
    case 2 :
      return (s/L==0) ? EMPTY : s-L;   // down
    case 3 :
      return (s/L==L-1) ? EMPTY : s+L; // above
    default :
      return EMPTY;
    }
  }

  // fills the order[] array with a random permutation of the site indices.
  // First order[] is set to the identity permutation. Then for
  // values of i in {1 ... N-1}, the values of order[i] and order[r]
  // are swapped, where r is a random index in {i+1 ... N}
  private void setOccupationOrder() {

    // update the cluster count and second cluster moment to account
    // for the loss of two smaller clusters and the gain of one bigger cluster
    numClusters[-parent[r1]]--;
    numClusters[-parent[r2]]--;
    numClusters[-parent[r1]-parent[r2]]++;
    secondClusterMoment += sqr(parent[r1]+parent[r2])-sqr(parent[r1])-sqr(parent[r2]);
    // cluster at r1 now includes the sites of the old cluster at r2
    parent[r1] += parent[r2];
    // make r1 the new parent of r2
    parent[r2] = r1;
    // if r2 touched the left or right boundary, then so does the merged cluster r1
    touchesLeft[r1] |= touchesLeft[r2];
    touchesRght[r1] |= touchesRght[r2];
    // if the cluster at r1 spans the lattice, remember its size
    if(touchesLeft[r1]&&touchesRght[r1]) {
      spanningClusterSize = -parent[r1];
    }
    return r1; // return the new root site r1
  }
}

Listing 12.3: The ClustersApp program.

package org.opensourcephysics.sip.ch12;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class ClustersApp extends AbstractSimulation {
  Scalar2DFrame grid = new Scalar2DFrame("Newman-Ziff cluster algorithm");
  PlotFrame plot1 = new PlotFrame("p", "Mean Cluster Size", "Mean cluster size");
  PlotFrame plot2 = new PlotFrame("p", "P_∞", "P_∞");
  PlotFrame plot3 = new PlotFrame("p", "P_span", "P_span");
  PlotFrame plot4 = new PlotFrame("s", "", "Cluster size distribution");
  Clusters lattice;
  double pDisplay;
  double[] meanClusterSize;
  double[] P_infinity;
  double[] P_span;           // probability of a spanning cluster
  double[] numClustersAccum; // number of clusters of size s
  int numberOfTrials;

  public void initialize() {
    int L = control.getInt("Lattice size L");
    grid.resizeGrid(L, L);
    lattice = new Clusters(L);
    pDisplay = control.getDouble("Display lattice at this value of p");
    grid.setMessage("p = "+pDisplay);
    plot4.setMessage("p = "+pDisplay);
    plot4.setLogScale(true, true);
    meanClusterSize = new double[L*L];
    P_infinity = new double[L*L];
    P_span = new double[L*L];
    numClustersAccum = new double[L*L+1];
    numberOfTrials = 0;
  }

  public void doStep() {
    control.clearMessages();
    control.println("Trial "+numberOfTrials);
    // add sites to a new lattice and accumulate the results
    lattice.newLattice();

      if(numClustersAccum[i+1]>0) {
        plot4.append(0, i+1, numClustersAccum[i+1]/numberOfTrials);
      }
    }
  }

  private void displayLattice() {
    double[] display = new double[lattice.N];

For T > Tc, the system is a paramagnet. In Chapter 15 we will use Monte Carlo methods to investigate the behavior of a magnetic system near the magnetic critical point.

Figure 12.10: The qualitative p-dependence of the connectedness length ξ(p) for a square lattice with L = 128. The results were averaged over approximately 2000–6000 configurations for each value of p. Note that ξ is finite for a finite lattice.

In the following, we will find that the properties of the geometrical phase transition in percolation are qualitatively similar to the properties of the critical point in thermodynamic transitions. We will see that in the vicinity of a critical point, the qualitative behavior of the system is governed by the occurrence of long-range correlations.

We have found that the essential physics near the percolation threshold is associated with the existence of large clusters. For example, for p ≠ pc, we found in Problem 12.7 that ns decays rapidly with s. However, for p = pc the s-dependence of ns is qualitatively different, and ns decreases much more slowly. This different behavior of ns at p = pc is due to the presence of clusters of all length scales, for example, the "infinite" spanning cluster and finite clusters of all sizes.

In Figure 12.10 we show the mean connectedness length ξ(p) for a lattice with L = 128. We see that ξ is finite, an increasing function of p for p < pc, and a decreasing function of p for p > pc. Moreover, we know that ξ(p = pc) is approximately equal to L and hence diverges as L → ∞. These qualitative considerations lead us to conjecture that in the limit L → ∞, ξ(p) grows rapidly in the critical region |p − pc| ≪ 1. We can describe the quantitative behavior of ξ(p) for p near pc by introducing the critical exponent ν defined by the relation

\xi(p) \sim |p - p_c|^{-\nu}.    (12.10)

Of course, there is no a priori reason why the divergence of ξ(p) should be characterized by a simple power law. Note that the exponent ν is assumed to be the same above and below pc. How do the other quantities that we have considered behave in the critical region in the limit L → ∞?
According to the definition (12.2) of P∞, P∞ = 0 for p < pc, and P∞ is an increasing function of p for p > pc. We conjecture that in the critical region, the increase of P∞ with increasing p is characterized by the exponent β defined by the relation

    P∞(p) ∼ (p − pc)^β.    (12.11)

Note that P∞ is assumed to approach zero continuously as p approaches pc from above; that is, the percolation transition is an example of a continuous phase transition. In the language of critical phenomena, P∞ is an example of an order parameter: it is nonzero in the ordered phase, p > pc, and zero in the disordered phase, p < pc. We will see that at p = pc, the spanning cluster is fractal and approaches zero density as the size of the system becomes larger.

    Quantity                        Functional form           Exponent   d = 2     d = 3
    Percolation
      order parameter               P∞ ∼ (p − pc)^β           β          5/36      0.41
      mean size of finite clusters  S(p) ∼ |p − pc|^{−γ}      γ          43/18     1.80
      connectedness length          ξ(p) ∼ |p − pc|^{−ν}      ν          4/3       0.88
      cluster numbers               ns ∼ s^{−τ} (p = pc)      τ          187/91    2.19
    Ising model
      order parameter               M(T) ∼ (Tc − T)^β         β          1/8       0.32
      susceptibility                χ(T) ∼ |T − Tc|^{−γ}      γ          7/4       1.24
      correlation length            ξ(T) ∼ |T − Tc|^{−ν}      ν          1         0.63

Table 12.1: Several of the critical exponents for the percolation and magnetism phase transitions in d = 2 and d = 3 dimensions. Ratios of integers correspond to known exact results. The critical exponents for the Ising model are discussed in Chapter 15.

The mean number of sites in the finite clusters S(p) also diverges in the critical region. Its critical behavior is written as

    S(p) ∼ |p − pc|^{−γ},    (12.12)

which defines the critical exponent γ. The common critical exponents for percolation are summarized in Table 12.1, together with the analogous critical exponents of a magnetic critical point.

Because we can simulate only finite lattices, a direct fit of the measured quantities ξ, P∞, and S(p) to their assumed critical behavior for an infinite lattice would not yield good estimates for the corresponding exponents ν, β, and γ (see Problem 12.8). The problem is that if p is close to pc, the connectedness length of the largest cluster becomes comparable to L, and the nature of the clusters is affected by the finite size of the system. In contrast, for p far from pc, ξ(p) is small in comparison to L, and the measured values of ξ, and hence the values of other physical quantities, are not appreciably affected by the finite size of the lattice. Hence, for p ≪ pc and p ≫ pc, the properties of the system are indistinguishable from the corresponding properties of a truly macroscopic system (L → ∞). However, if p is close to pc, ξ(p) is comparable to L, and the nature of the system differs from that of an infinite system. In particular, a finite lattice cannot exhibit a true phase transition characterized by divergent physical quantities. Instead, ξ reaches a finite maximum at p = pc(L).

The effects of the finite system size can be made more quantitative by the following argument. Consider, for example, the critical behavior (12.11) of P∞. If ξ ≫ 1 but is much less than L, the power law behavior given by (12.11) is expected to hold. However, if ξ is comparable to L, ξ cannot change appreciably, and (12.11) is no longer applicable. This qualitative change in the behavior of P∞ and other physical quantities occurs for

    ξ(p) ∼ L ∼ |p − pc|^{−ν}.    (12.13)

We invert (12.13) and write

    |p − pc| ∼ L^{−1/ν}.    (12.14)

The difference |p − pc| in (12.14) is the "distance" from the percolation threshold at which finite size effects occur.
Hence, if ξ and L are approximately the same size, we can replace (12.11) by the relation

    P∞(p = pc) ∼ L^{−β/ν}  (L → ∞).    (12.15)

The relation (12.15) between P∞ and L at p = pc is consistent with the fact that a phase transition is defined only for infinite systems. One implication of (12.15) is that we can use it to determine the ratio β/ν. This method of analysis is known as finite size scaling. Suppose that we generate percolation configurations at p = pc for different values of L and analyze P∞ as a function of L. If our values of L are sufficiently large, we can use the asymptotic relation (12.15) to estimate the ratio β/ν. A similar analysis can be used for S(p) and other quantities of interest. We use this method in Problem 12.8.

Problem 12.8. Finite size scaling analysis of critical exponents

(a) Compute P∞ at p = pc for at least 100 configurations for L = 10, 20, 40, and 80. Include in your average only those configurations that have a spanning cluster. Best results are obtained using the value of pc for the infinite square lattice, pc ≈ 0.5927. Plot ln P∞ versus ln L and estimate the ratio β/ν.

(b) Use finite size scaling to determine the dependence of the mean cluster size S on L at p = pc. Average S over the same configurations considered in part (a). Remember that S is the mean number of sites in the nonspanning clusters.

(c) Find the size (number of particles) M of the spanning cluster at p = pc as a function of L. Use the same configurations as in part (a). Determine an exponent from the slope of a plot of ln M versus ln L. This exponent is called the fractal dimension and is discussed in Chapter 13.

Finite size scaling is particularly useful at the percolation threshold in comparison to thermal critical points where, as we will learn in Chapter 15, critical slowing down occurs. Critical slowing down makes it very time consuming to sample statistically independent configurations. No such slowing down occurs at the percolation threshold, because we can easily create new configurations at any value of p by simply using a new set of random numbers.

We found in Section 12.2 that the numerical value of the percolation threshold pc depends on the symmetry and dimension of the lattice; for example, pc ≈ 0.5927 for the square lattice and pc = 1/2 for the triangular lattice. A remarkable feature of the power law dependencies summarized in Table 12.1 is that the values of the critical exponents do not depend on the symmetry of the lattice and are independent of the existence of the lattice itself; for example, they are identical for site percolation, bond percolation, and continuum percolation. Moreover, it is not necessary to distinguish between the exponents for site and bond percolation. In the vocabulary of critical phenomena, we say that site, bond, and continuum percolation all belong to the same universality class and that their critical exponents are identical for the same spatial dimension.

Another important idea in critical phenomena is the existence of relations between the critical exponents. An example of such a scaling law is

    2β + γ = νd,    (12.16)

where d is the spatial dimension of the lattice. As a check, the exact d = 2 exponents in Table 12.1 give 2β + γ = 10/36 + 86/36 = 8/3, which indeed equals νd = (4/3) × 2. The scaling law (12.16) indicates that the universality class depends on the spatial dimension. A more detailed discussion of finite size scaling and the scaling laws can be found in Chapter 15 and in the references.
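The slope estimate in Problem 12.8a is a simple least squares fit. The following sketch (ours, not from the text) shows the computation; the pInfinity values are hypothetical placeholders for the measured averages at each L:

    // Sketch of the finite size scaling fit in Problem 12.8a: the slope of
    // ln P_infinity versus ln L estimates -beta/nu.
    public class FiniteSizeScaling {
        public static void main(String[] args) {
            int[] L = {10, 20, 40, 80};
            double[] pInfinity = {0.66, 0.62, 0.58, 0.54}; // hypothetical data
            int n = L.length;
            double sx = 0, sy = 0, sxx = 0, sxy = 0;
            for(int i = 0; i < n; i++) {
                double x = Math.log(L[i]);
                double y = Math.log(pInfinity[i]);
                sx += x; sy += y; sxx += x*x; sxy += x*y;
            }
            double slope = (n*sxy - sx*sy)/(n*sxx - sx*sx); // least squares slope
            System.out.println("beta/nu estimate = " + (-slope));
        }
    }

With real data the magnitude of the slope should approach the exact ratio β/ν = (5/36)/(4/3) ≈ 0.104 as L increases.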
12.5 The Renormalization Group

In Section 12.4 we studied the properties of various quantities on different length scales to determine the values of the critical exponents. The idea of examining physical quantities near the critical point on different length scales can be extended beyond finite size scaling and is the basis of the renormalization group method, one of the more important methods developed in theoretical physics in the past several decades.² Although the method originated in the theory of elementary particles and was first applied to thermodynamic critical points, it is simpler to understand the method in the context of the percolation transition. We will find that the renormalization group method yields the critical exponents directly and, in combination with Monte Carlo methods, is more powerful than Monte Carlo methods alone.

The basic idea of the renormalization group method is the following. Imagine a percolation configuration generated at p = p0. What would happen if we averaged the configuration over groups of sites to obtain a configuration of occupied and empty cells? For example, the cells could be groups of four sites such that each cell is occupied or empty according to a mapping rule from the sites to the cell. If the original group of 2 × 2 sites spans, the cell would be occupied; otherwise, the cell would be empty. What value of p = p1 would describe the new configuration of cells? If p0 < pc, we would expect p1 < p0. To understand why, consider a value of p0 near p = 0, where almost all the clusters are of size one. Clearly, the occupied sites would be mapped into empty cells, and there would be a lower percentage of occupied cells than before. For p0 > pc we would find p1 > p0, because the rare isolated unoccupied sites would be grouped into occupied cells. At p = pc we might expect that this blocking procedure would lead to configurations that look as if they were generated at the same value of p, because of the existence of clusters of all length scales.

Given the new configuration of cells at probability p1, we can group the cells according to the same mapping rule, leading to a new p = p2. The sequence p0, p1, p2, ... is called a renormalization group flow. We expect that for p0 < pc the flow will move to the trivial fixed point p = 0, and for p0 > pc the flow will move to the other trivial fixed point p = 1. At p = pc there is a nontrivial fixed point. We will see that by analyzing the renormalization group flow, we can determine the location of the critical point and the critical exponent ν.

We now consider a way of using a computer to change the configurations in a way that is similar to the procedure we have just described. Consider a square lattice that is partitioned into cells or blocks that cover the lattice (see Figure 12.11). Note that we have defined the cells so that the new lattice of cells has the same symmetry as the original lattice. However, the replacement of sites by cells changes the length scale: all distances are now smaller by a factor of b, where b is the linear dimension of the cell. Hence, the effect of a "renormalization" is to replace each group of sites by a single renormalized site and to rescale the connectedness length for the renormalized lattice by a factor of b. Because we want to preserve the main features of the original lattice and hence its connectedness (and its symmetry), we assume that a renormalized site is occupied if the original group of sites spans the cell.
For simplicity, we adopt the vertical spanning criterion. The effect of performing a renormalization transformation on typical percolation configurations for p above and below pc is illustrated in Figures 12.12 and 12.13, respectively. In both cases the effect of the successive transformations is to move the system away from pc. We see that for p = 0.7, the effect of the transformations is to drive the system toward p = 1. For p = 0.5, the trend is to drive the system toward p = 0. Because we began with a finite lattice, we cannot continue the renormalization transformation indefinitely.

²Kenneth Wilson was honored with the Nobel Prize in Physics in 1982 for his contributions to the development of the renormalization group method.

Figure 12.11: An example of a b = 4 cell used on the square lattice. The cell contains b² sites, which are rescaled to a single supersite or cell after a renormalization group transformation.

Figure 12.12: A percolation configuration generated at p = 0.7 on an L = 16 lattice. The original configuration has been renormalized three times (L′ = 8, 4, and 2) by transforming cells of four sites into one new supersite. What would be the effect of an additional transformation?

The class RGApp implements a visual interpretation of the renormalization group. This class creates four windows, with the original lattice in the first window and three renormalized lattices in the other three windows.

Listing 12.4: The visual renormalization group.

package org.opensourcephysics.sip.ch12;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
import java.awt.Color;

public class RGApp extends AbstractCalculation {
    LatticeFrame originalLattice = new LatticeFrame("Original Lattice");
    LatticeFrame block1 = new LatticeFrame("First Blocked Lattice");
    LatticeFrame block2 = new LatticeFrame("Second Blocked Lattice");
    LatticeFrame block3 = new LatticeFrame("Third Blocked Lattice");

    public RGApp() {
        setLatticeColors(originalLattice);
        setLatticeColors(block1);
        setLatticeColors(block2);
        setLatticeColors(block3);
    }

    public void calculate() {
        int L = control.getInt("L");
        double p = control.getDouble("p");
        newLattice(L, p, originalLattice);
        block(originalLattice, block1, L/2); // block original lattice
        block(block1, block2, L/4);          // next blocking
        block(block2, block3, L/8);          // final blocking
        originalLattice.setVisible(true);
        block1.setVisible(true);
        block2.setVisible(true);
        block3.setVisible(true);
    }

    public void reset() {
        control.setValue("L", 64);
        control.setValue("p", 0.6);
    }

    // occupies each site of a new L by L lattice with probability p
    public void newLattice(int L, double p, LatticeFrame lattice) {
        lattice.resizeLattice(L, L);
        for(int i = 0; i < L; i++) {
            // ...
        }
    }
    // ... (the block method and the remainder of the class)
}

An exact enumeration of the spanning cell configurations becomes impractical for b > 7, and we must use Monte Carlo methods if we wish to proceed further. Two Monte Carlo approaches are discussed in Project 12.13. The combination of Monte Carlo and renormalization group methods provides a powerful tool for obtaining information on phase transitions and other properties of materials.

As summarized in Table 12.1, the various critical exponents for percolation in two dimensions are known exactly. For example, the exponent ν, corresponding to the divergence of the connectedness length, is ν = 4/3. It is interesting that the theory for this result is based on algebraic reasoning (too abstract to be summarized here), even though percolation is a geometrical phenomenon. The most accurate estimate of pc for the square lattice is pc = 0.59274621(13). We note that although there has been much work on percolation, only numerical estimates for pc are known for most lattices.
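Before turning to the projects, it may help to see the fixed point analysis worked out for the smallest cell. For a b = 2 cell with the vertical spanning criterion, enumerating the spanning configurations gives R(p) = p⁴ + 4p³q + 2p²q² = 2p² − p⁴ (compare the L = 2 values of S(n) quoted in Project 12.13). The sketch below (ours, not the text's) solves the fixed point condition R(p*) = p* by Newton's method and evaluates ν = ln b / ln λ, where λ = dR/dp at p*:

    // Sketch: cell-to-site renormalization group estimate for a b = 2 cell
    // with vertical spanning, R(p) = 2p^2 - p^4 (our enumeration).
    public class RGFixedPoint {
        static double R(double p)  { return 2*p*p - p*p*p*p; }
        static double dR(double p) { return 4*p - 4*p*p*p; }

        public static void main(String[] args) {
            double p = 0.6; // initial guess near the nontrivial fixed point
            for(int i = 0; i < 20; i++) {       // Newton's method for R(p) - p = 0
                p -= (R(p)-p)/(dR(p)-1.0);
            }
            double lambda = dR(p);              // lambda = dR/dp at p*
            double nu = Math.log(2)/Math.log(lambda); // nu = ln b / ln lambda, b = 2
            System.out.println("p* = "+p+"  nu = "+nu); // p* ~ 0.618, nu ~ 1.63
        }
    }

The exact fixed point is p* = (√5 − 1)/2 ≈ 0.618, to be compared with pc ≈ 0.5927, and the estimate ν ≈ 1.63 should be compared with the exact ν = 4/3; cell-to-cell transformations such as the one in Project 12.12 reduce this error.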
12.6 Projects

Most of the following projects require larger systems and more computer resources than the problems we have considered so far, but most are not much more difficult conceptually. More ideas for projects can be obtained from the references.

Project 12.12. Cell-to-cell renormalization group method

In Section 12.5 we discussed the cell-to-site renormalization group transformation for a system of cells of linear dimension b. An alternative transformation is to go from cells of linear dimension b1 to cells of linear dimension b2. For this cell-to-cell transformation, the rescaling length b1/b2 can be made close to unity. Many errors in a cell-to-cell renormalization group transformation cancel, resulting in a transformation that is more accurate in the limit in which the change in length scale is infinitesimal. We can use the fact that the connectedness lengths of the two systems are related by ξ(p2) = (b1/b2)^{−1} ξ(p1) to derive the relation

    ν = ln(b1/b2) / ln(λ1/λ2),    (12.28)

where λi = dR(p*, bi)/dp is evaluated at the solution of the fixed point equation R(b2, p*) = R(b1, p*). Note that (12.28) reduces to (12.26) for b2 = 1. Use the results you found in Problem 12.10d for one of the spanning criteria to estimate ν from a b1 = 3 to b2 = 2 transformation.

Project 12.13. Estimates for two-dimensional percolation

One way to estimate RL(p), the total probability of all the spanning clusters on a lattice of linear dimension L, can be understood by writing RL(p) in the form

    RL(p) = Σ_{n=1}^{N} S(n) PN(n, p),    (12.29)

where

    PN(n, p) = C(N, n) p^n q^{N−n}    (12.30)

and N = L². The binomial coefficient C(N, n) = N!/[(N − n)! n!] represents the number of possible configurations of n occupied sites and N − n empty sites; PN(n, p) is the probability that n sites out of N are occupied with probability p. The quantity S(n) is the probability that a random configuration of n occupied sites spans the lattice. A comparison of (12.18) and (12.29) shows that for L = 2 and the vertical spanning criterion, S(1) = 0, S(2) = 2/6, S(3) = 1, and S(4) = 1. What are the values of S(n) for L = 3?

We can estimate the probability S(n) by Monte Carlo methods. One way to sample S(n) is to add a particle at random to an unoccupied site and check if a spanning cluster exists. If a spanning cluster does not exist, add another particle at random to a previously unoccupied site. If a spanning cluster exists after s particles are added, then let S(n) = S(n) + 1 for all n ≥ s, and generate a new configuration. After a reasonable number of configurations, the results for S(n) can be normalized. Of course, this procedure can be made more efficient by checking for a spanning cluster only after the total number of particles added is near s ∼ pcN.

(a) Write a Monte Carlo program to sample S(n). Store the locations of the unoccupied sites in a separate array. To check your program, first sample S(n) for L = 2 and L = 3 and compare your results to the exact results for S(n). Then consider larger values of L and determine S(n) for L = 4, 5, 8, 16, and 32.
Because the number of sites in the lattice can become very large, the direct evaluation of the binomial coefficients using factorials is not possible. One way to proceed is to approximate the probability of a configuration of n occupied sites by a Gaussian:

    PN(n, p) = C(N, n) p^n q^{N−n} ≈ (2πNpq)^{−1/2} e^{−(n−pN)²/2Npq}.    (12.31)

(b) As pointed out by Newman and Ziff, the Gaussian approximation for PN(n, p) is not sufficiently accurate for high precision studies. Instead, they used the following method. The binomial distribution is a maximum for given N and p when n = nmax = pN. Set this value to 1 for the moment. Then compute PN(n, p) iteratively for all other n using

    PN(n, p) = PN(n − 1, p) [(N − n + 1)/n] [p/(1 − p)]      (n > nmax),
    PN(n, p) = PN(n + 1, p) [(n + 1)/(N − n)] [(1 − p)/p]    (n < nmax).    (12.32)

Then calculate the normalization coefficient C = Σn PN(n, p) and divide all the PN(n, p) by C to normalize the probability distribution.

(c) Compute ν from the cell-to-cell transformation discussed in Project 12.12 for b1 = 5 and b2 = 4.

(d) The article by Ziff and Newman discusses the convergence of various estimates of the percolation threshold in two dimensions. Some examples of these estimates include:

(i) The cell-to-site renormalization group fixed point:

    RL(p) = p,    (12.33)

where p* is the solution to (12.33).

(ii) The average value of p at which spanning first occurs:

    ⟨p⟩ = ∫₀¹ p [dRL(p)/dp] dp = 1 − ∫₀¹ RL(p) dp,    (12.34)

where we have integrated by parts to obtain the second integral.

(iii) The estimate pmax, which is the value of p at which dRL/dp reaches a maximum:

    d²RL(p)/dp² = 0.    (12.35)

(iv) The cell-to-cell renormalization group fixed point:

    RL(p) = RL−1(p)    (12.36a)

or

    RL(p) = RL/2(p).    (12.36b)

(v) The value of p for which RL(p) = R∞(pc). For a square lattice, R∞(pc) = 1/2.

Verify that the various estimates of the percolation threshold converge to the infinite lattice value pc either as

    pest(L) − pc ≈ cL^{−1/ν}    (12.37a)

or

    pest(L) − pc ≈ cL^{−1−1/ν},    (12.37b)

where the constant c is a fit parameter that depends on the criterion, and ν = 4/3 for percolation in two dimensions. Determine which estimates converge more quickly.
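A sketch of the iteration (12.32) in Java (our version; the class and method names are our own) shows how PN(n, p) can be tabulated without evaluating any factorials:

    // Sketch of the Newman-Ziff iterative evaluation (12.32) of the
    // binomial distribution P_N(n, p).
    public class BinomialTable {
        public static double[] distribution(int N, double p) {
            double[] P = new double[N+1];
            int nmax = (int) (p*N);             // distribution peaks near n = pN
            P[nmax] = 1.0;                      // set the peak value to 1 for the moment
            for(int n = nmax+1; n <= N; n++) {  // iterate upward from the peak
                P[n] = P[n-1]*(N-n+1)/(double) n*p/(1.0-p);
            }
            for(int n = nmax-1; n >= 0; n--) {  // iterate downward from the peak
                P[n] = P[n+1]*(n+1)/(double) (N-n)*(1.0-p)/p;
            }
            double C = 0;                       // normalize so the sum equals one
            for(double v : P) C += v;
            for(int n = 0; n <= N; n++) P[n] /= C;
            return P;
        }

        public static void main(String[] args) {
            double[] P = distribution(10000, 0.5927);
            System.out.println("P at the peak = "+P[5927]);
        }
    }

Because every entry is computed relative to the peak value, the intermediate numbers never overflow, and entries far out in the tails simply underflow harmlessly to zero.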
Project 12.14. Percolation in three dimensions

(a) The value of pc for site percolation on the simple cubic lattice is approximately 0.3112. Do a simulation to verify this value. Compute φc, the volume fraction occupied at pc, if a sphere with a diameter equal to the lattice spacing is placed at each occupied site.

(b) Consider continuum percolation in three dimensions, where spheres of unit diameter are placed at random in a cubical box of linear dimension L. Two spheres that overlap are in the same cluster. The volume fraction occupied by the spheres is given by

    φ = 1 − e^{−4πρr³/3},    (12.38)

where ρ is the number density of the spheres and r is their radius. Write a program to simulate continuum percolation in three dimensions and find the percolation threshold ρc. Use the Monte Carlo procedure discussed in Problem 12.4 to estimate φc and compare its value with the value determined from (12.38). How does φc for continuum percolation compare with the value of φc found for site percolation in part (a)? Which do you expect to be larger and why?

(c) In the Swiss cheese model in three dimensions, we are concerned with the percolation of the space between the spheres. This model is appropriate for porous rock, with the spheres representing solid material and the space between the spheres representing the pores. Because we need to compute the connectivity properties of the space between the spheres, we superimpose on the system a regular grid with lattice spacing equal to 0.1r, where r is the radius of the spheres. If a point on the grid is not within any sphere, it is "occupied." The use of the grid allows us to determine the connectivity between different regions of the pore space. Use a cluster labeling algorithm to label the clusters and determine φ̃c, the volume fraction occupied by the pores at threshold. You might be surprised to find that φ̃c is relatively small. If time permits, use a finer grid and repeat the calculation to improve the accuracy of your results.

(d)∗ Use finite-size scaling to estimate the critical percolation exponents for the three models presented in parts (a)–(c). Are they the same within the accuracy of your calculation?

Project 12.15. Fluctuations of the stock market

Although the fluctuations of the stock market are believed to be Gaussian for long time intervals, they are not Gaussian for short time intervals. The model of Cont and Bouchaud assumes that percolation clusters act as groups of traders who influence each other. The sites are occupied with probability p as usual. Each occupied site is a trader, and clusters are groups of traders (agents) who buy and sell together an amount proportional to the number of traders s in the cluster. At each time step, each cluster is independently active with probability 2pa and inactive with probability 1 − 2pa. If a cluster is active, it buys with probability pb and sells with probability ps = 1 − pb. In the simplest version of the model, the change in the price of a stock is proportional to the difference between supply and demand; that is,

    R = Σ_buy s ns − Σ_sell s ns,    (12.39)

where the sums are over the buying and selling clusters, respectively, and the constant of proportionality is taken to be one. If the probability pa is small, at most one cluster trades at a time, and the distribution P(R) of relative price changes or "returns" scales as ns(p). In contrast, for large pa, the relative price variation is the sum of contributions from many clusters (not counting the spanning cluster), and the central limit theorem implies that P(R) converges to a Gaussian for large systems (except at p = pc). Confirm these statements and find the shape of P(R) for p = pc and pa = 0.25. Variations of the Cont–Bouchaud model can be found in the references. The application of the methods of statistical physics and simulations to economics and finance is now an active area of research and is commonly known as econophysics.

Project 12.16. The connectedness length

(a) Modify class Clusters so that the connectedness length ξ defined in (12.9) is computed. One way to do so is to introduce four additional arrays, xAccum, yAccum, xSquaredAccum, and ySquaredAccum, with the data stored at the indices corresponding to the root sites. We visit each occupied site in the lattice and determine its root site. For example, if the site at x, y is occupied and its root is root, we set xAccum[root] += x, xSquaredAccum[root] += x*x, yAccum[root] += y, and ySquaredAccum[root] += y*y. Then R²s for an individual cluster is given by

    R²s = xSquaredAccum[root]/s + ySquaredAccum[root]/s − (xAccum[root]/s)² − (yAccum[root]/s)²,    (12.40)

where s is the number of sites in the cluster, which is given by -parent[root]. A sketch of this accumulation follows the project.

(b) What is the qualitative behavior of ξ(p) as a function of p for different size lattices? Is ξ(p) a monotonically increasing or decreasing function of p for p < pc and p > pc? Remember that ξ does not include the spanning cluster.
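The sketch referred to in part (a) follows. It assumes the fields and methods of class Clusters (parent[], findRoot(), sqr(), the site indexing s = x + yL, and a negative sentinel value EMPTY for unoccupied sites); the method clusterRadius2 itself is our addition:

    // Sketch for Project 12.16a: accumulate coordinates by root site and
    // form the squared radius of gyration of each cluster via (12.40).
    public double[] clusterRadius2() {
        double[] xAccum = new double[N], yAccum = new double[N];
        double[] xSquaredAccum = new double[N], ySquaredAccum = new double[N];
        for(int s0 = 0; s0 < N; s0++) {
            if(parent[s0] == EMPTY) continue;   // skip unoccupied sites
            int root = findRoot(s0);
            double x = s0%L, y = s0/L;
            xAccum[root] += x;  xSquaredAccum[root] += x*x;
            yAccum[root] += y;  ySquaredAccum[root] += y*y;
        }
        double[] R2 = new double[N];            // R2[root] for each root site
        for(int r = 0; r < N; r++) {
            if(parent[r] < 0 && parent[r] != EMPTY) { // r is a root
                double s = -parent[r];          // number of sites in the cluster
                R2[r] = xSquaredAccum[r]/s + ySquaredAccum[r]/s
                        - sqr(xAccum[r]/s) - sqr(yAccum[r]/s);
            }
        }
        return R2;
    }

The connectedness length ξ then follows by combining the R²s values of the nonspanning clusters according to (12.9).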
Project 12.17. Spanning clusters and periodic boundary conditions

For simplicity, we have used open boundary conditions, partly for historical reasons and partly because a spanning cluster is easier to visualize for open boundary conditions. An alternative is to use periodic boundary conditions and define a spanning cluster as one that wraps all the way around the lattice (see Figure 12.16).

Figure 12.16: (a) Example of a cluster that wraps vertically. (b) Example of a cluster that wraps vertically and horizontally. (c) Example of a single cluster that does not wrap. Periodic boundary conditions are used in each case.

A method for detecting cluster wrapping has been proposed by Machta et al. In addition to the parent array introduced on page 457, we define two integer arrays that give the net displacement in the x and y directions from each site to its parent site. When we traverse a site's cluster tree, we sum these displacements to find the total displacement to the root site. When an added site neighbors two (or more) sites that belong to the same cluster, we compare the total displacements to the root site for these two sites. If these displacements differ by an amount that does not equal the minimum displacement between the two sites, then cluster wrapping has occurred (see Figure 12.17).

Figure 12.17: Example of cluster wrapping for periodic boundary conditions. When site 4 is occupied, it is a neighbor of sites 9 and 24, which belong to a single cluster with displacements to the root of (∆x = 2, ∆y = 2) and (∆x = −3, ∆y = −1), respectively. If the difference between these displacements is not equal to the minimum displacement between them (∆xmin = 0, ∆ymin = 2), then wrapping has occurred, as is the case here.

Modify the Newman–Ziff algorithm so that periodic boundary conditions are used to define the clusters and the existence of a spanning cluster. Use your program to estimate pc and ns, and show that periodic boundary conditions give better results for the percolation threshold pc and the cluster size distribution ns for the same size lattice.

Project 12.18. Conductivity in a random resistor network

(a) An important critical exponent for percolation is the conductivity exponent t defined by

    σ ∼ (p − pc)^t,    (12.41)

where σ is the conductance (or inverse resistance) per unit length in two dimensions. Consider bond percolation on a square lattice, where each occupied bond between two neighboring sites is a resistor of unit resistance. Unoccupied bonds have infinite resistance. Because the total current into any node must equal zero by Kirchhoff's law, the voltage at any site (node) is equal to the average of the voltages of all nearest neighbor sites connected by resistors (occupied bonds). Because this relation for the voltage is the same as the algorithm for solving Laplace's equation on a lattice, the voltage at each site can be computed using the relaxation method discussed in Chapter 10 (see the sketch after this project). To compute the conductivity for a given L × L resistor network, we fix the voltage V = 0 at sites for which x = 0 and fix V = 1 at sites for which x = L + 1. In the y direction we use periodic boundary conditions. We then compute the voltage at all sites using the relaxation method. The current through each resistor connected to a site at x = 0 is I = ∆V/R = (V − 0)/1 = V. The conductivity is the sum of the currents through all the resistors connected to x = 0, divided by L. In a similar way, the conductivity can be computed from the resistors attached to the x = L + 1 boundary. Write a program to implement the relaxation method for the conductivity of a random resistor network on a square lattice. An indirect, but easier, way of computing the conductivity is considered in Problem 13.8.
(b) The bond percolation threshold on a square lattice is pc = 0.5. Use your program to compute the conductivity for an L = 30 square lattice. Average over at least ten spanning configurations for p = 0.51, 0.52, and 0.53. Note that you can eliminate all bonds that are not part of the spanning cluster, as well as all occupied bonds connected to only one other occupied bond. Why? If possible, consider more values of p. Estimate the critical exponent t defined in (12.41).

(c) Fix p at p = pc = 1/2 and use finite size scaling to estimate the conductivity exponent t.

(d)∗ Use larger lattices and the multigrid method (see Project 10.26) to improve your results. If you have sufficient computing resources, compute t for a simple cubic lattice, for which pc ≈ 0.249. (In general, t is not the same for lattice and continuum percolation.)
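The following sketch (our own arrangement; the array names are our choices, not the text's) implements one sweep of the relaxation step described in part (a) of Project 12.18. The voltage of each interior node is replaced by the average over the neighbors to which it is connected by occupied bonds; the columns x = 0 and x = L + 1 are held at fixed voltage, and the y direction is periodic:

    // One relaxation sweep for the random resistor network. V[x][y] is the
    // node voltage; hBond[x][y] is the bond between (x, y) and (x+1, y);
    // vBond[x][y] is the bond between (x, y) and (x, y+1 mod L).
    void relaxOnce(double[][] V, boolean[][] hBond, boolean[][] vBond, int L) {
        for(int x = 1; x <= L; x++) {      // x = 0 and x = L+1 are held fixed
            for(int y = 0; y < L; y++) {
                double sum = 0;
                int n = 0;
                if(hBond[x-1][y]) { sum += V[x-1][y]; n++; } // left neighbor
                if(hBond[x][y])   { sum += V[x+1][y]; n++; } // right neighbor
                int yDown = (y-1+L)%L, yUp = (y+1)%L;        // periodic in y
                if(vBond[x][yDown]) { sum += V[x][yDown]; n++; }
                if(vBond[x][y])     { sum += V[x][yUp];   n++; }
                if(n > 0) V[x][y] = sum/n; // average over connected neighbors
            }
        }
    }

Sweeps are repeated until the voltages stop changing within some tolerance; the conductivity then follows by summing V[1][y] over the occupied bonds attached to the x = 0 column and dividing by L.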
References and Suggestions for Further Reading

Joan Adler, "Series expansions," Computers in Physics 8, 287–295 (1994). The critical exponents and the value of pc can also be determined by doing exact enumeration.

I. Balberg, "Recent developments in continuum percolation," Phil. Mag. 56, 991–1003 (1987). An earlier paper on continuum percolation is by Edward T. Gawlinski and H. Eugene Stanley, "Continuum percolation in two dimensions: Monte Carlo tests of scaling and universality for non-interacting discs," J. Phys. A: Math. Gen. 14, L291–L299 (1981). These workers divide the system into cells and use the Poisson distribution to place the appropriate number of disks in each cell.

Jean-Philippe Bouchaud and Marc Potters, Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management, 2nd ed. (Cambridge University Press, 2003); Rosario N. Mantegna and H. Eugene Stanley, An Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge University Press, 2000); Johannes Voit, The Statistical Mechanics of Financial Markets, 2nd ed. (Springer, 2004). These texts introduce the general field of econophysics.

Armin Bunde and Shlomo Havlin, editors, Fractals and Disordered Systems, revised edition (Springer-Verlag, 1996). Chapter 2 by the editors is on percolation.

R. Cont and J.-P. Bouchaud, "Herd behavior and aggregate fluctuations in financial markets," Macroeconomic Dynamics 4, 170–196 (2000).

P. M. C. de Oliveira, R. A. Nobrega, and D. Stauffer, "Are the tails of percolation thresholds Gaussians?," J. Phys. A 37, 3743–3748 (2004). The authors compute the probability that there is a spanning cluster at p = pc.

C. Domb, E. Stoll, and T. Schneider, "Percolation clusters," Contemp. Phys. 21, 577–592 (1980). This review paper discusses the nature of the percolation transition using illustrations from a film of a Monte Carlo simulation of a percolation process.

J. W. Essam, "Percolation theory," Reports on Progress in Physics 53, 833–912 (1980). A mathematically oriented review paper.

Jens Feder, Fractals (Plenum Press, 1988). See Chapter 7 on percolation. We discuss the fractal properties of the spanning cluster at the percolation threshold in Chapter 13.

J. P. Fitzpatrick, R. B. Malt, and F. Spaepen, "Percolation theory of the conductivity of random close-packed mixtures of hard spheres," Phys. Lett. A 47, 207–208 (1974). The authors describe a demonstration experiment done in a first year physics course.

J. Hoshen and R. Kopelman, "Percolation and cluster distribution. I. Cluster multiple labeling technique and critical concentration algorithm," Phys. Rev. B 14, 3438–3445 (1976). The original paper on an efficient cluster labeling algorithm. The Hoshen–Kopelman algorithm is well suited for very large lattices in two dimensions, but, in general, the Newman–Ziff algorithm is easier to use.

Chin-Kun Hu, Chi-Ning Chen, and F. Y. Wu, "Histogram Monte Carlo position-space renormalization group: Applications to site percolation," J. Stat. Phys. 82, 1199–1206 (1996). The authors use a histogram Monte Carlo method that is similar to the method discussed in Project 12.13. A similar Monte Carlo method was used by M. Ahsan Khan, Harvey Gould, and J. Chalupa, "Monte Carlo renormalization group study of bootstrap percolation," J. Phys. C 18, L223–L228 (1985).

J. Machta, Y. S. Choi, A. Lucke, T. Schweizer, and L. M. Chayes, "Invaded cluster algorithm for Potts models," Phys. Rev. E 54, 1332–1345 (1996). The authors discuss the definition of a spanning cluster for periodic boundary conditions.

P. H. L. Martins and J. A. Plascak, "Percolation on two- and three-dimensional lattices," Phys. Rev. E 67, 046119-1–6 (2003). The authors use the Newman–Ziff algorithm to compute various quantities.

Ramit Mehr, Tal Grossman, N. Kristianpoller, and Yuval Gefen, "Simple percolation experiment in two dimensions," Am. J. Phys. 54, 271–273 (1986). The authors discuss a simple experiment on a sheet of conducting silver paper. This type of experiment is much easier to do than the insulator–conductor transition discussed in Section 12.1. In the latter case, the results are difficult to interpret because the current depends on the contact area between two spheres and thus on the applied pressure.

M. E. J. Newman and R. M. Ziff, "Fast Monte Carlo algorithm for site or bond percolation," Phys. Rev. E 64, 016706-1–16 (2001). Our discussion of the Newman–Ziff algorithm in Section 12.3 closely follows this well-written paper.

Peter J. Reynolds, H. Eugene Stanley, and W. Klein, "Large-cell Monte Carlo renormalization group for percolation," Phys. Rev. B 21, 1223 (1980). Another especially well written research paper. Our discussion of the renormalization group in Section 12.5 is based upon this paper.

Muhammad Sahimi, Applications of Percolation Theory (Taylor & Francis, 1994). The emphasis is on modeling various phenomena in disordered media.

Lev N. Shchur, "Incipient spanning clusters in square and cubic percolation," in Studies in Condensed Matter Physics, Vol. 85, edited by D. P. Landau, S. P. Lewis, and H. B. Schuettler (Springer-Verlag, 2000). Not many years ago, it was commonly believed that only one spanning cluster could exist at the percolation threshold. In this paper the probability of the simultaneous occurrence of at least k spanning clusters was studied by extensive Monte Carlo simulations and found to be in agreement with theoretical predictions.

Dietrich Stauffer, "Percolation models of financial market dynamics," Advances in Complex Systems 4, 19–27 (2001).

D. Stauffer, "Percolation clusters as teaching aid for Monte Carlo simulation and critical exponents," Am. J. Phys. 45, 1001–1002 (1977);
D. Stauffer, "Scaling theory of percolation clusters," Physics Reports 54, 1–74 (1979).

Dietrich Stauffer and Amnon Aharony, Introduction to Percolation Theory, 2nd ed. (Taylor & Francis, 1994). A delightful book by two of the leading workers in the field. An efficient Fortran implementation of the Hoshen–Kopelman algorithm is given in Appendix A.3.

B. P. Watson and P. L. Leath, "Conductivity in the two-dimensional site percolation problem," Phys. Rev. B 9, 4893–4896 (1974). A research paper on the conductivity of chicken wire.

John C. Wierman and Dora Passen Naor, "Criteria for evaluation of universal formulas for percolation thresholds," Phys. Rev. E 71, 036143-1–7 (2005). Wierman and Naor evaluate several universal formulas that predict approximate values of pc for various lattices.

Percolation was first conceived by chemistry Nobel laureate P. J. Flory as a model for polymer gelation, for example, the sol–gel transition of jello [P. J. Flory, "Molecular size distribution in three dimensional polymers," J. Am. Chem. Soc. 63, 3083–3100 (1941)]. Further important work in this context was done by another chemist, Walter H. Stockmayer. The term "percolation" was coined by the mathematicians Broadbent and Hammersley in 1957, who considered percolation on a lattice for the first time. See S. R. Broadbent and J. M. Hammersley, "Percolation processes I. Crystals and mazes," Proceedings of the Cambridge Philosophical Society 53, 629–641 (1957).

Kenneth G. Wilson, "Problems in physics with many scales of length," Sci. Am. 241 (8), 158–179 (1979). An accessible article on the renormalization group method and its applications in particle and condensed matter physics. See also K. G. Wilson, "The renormalization group and critical phenomena," Rev. Mod. Phys. 55, 583–600 (1983). The latter article is the text of Wilson's lecture on the occasion of the presentation of the 1982 Nobel Prize in Physics. In this lecture he claims that he "... found it very helpful to demand that a correctly formulated field theory be soluble by computer, the same way an ordinary differential equation can be solved on a computer ...".

W. Xia and M. F. Thorpe, "Percolation properties of random ellipses," Phys. Rev. A 38, 2650–2656 (1988). The authors consider continuum percolation and show that the area fraction remaining after punching out holes at random is given by φ = e^{−Aρ}, where A is the area of a hole and ρ is the number density of the holes. This relation does not depend on the shape of the holes.

Richard Zallen, The Physics of Amorphous Solids (Wiley–Interscience, 1983). Chapter 4 discusses many of the applications of percolation concepts to realistic systems.

R. M. Ziff and M. E. J. Newman, "Convergence of threshold estimates for two-dimensional percolation," Phys. Rev. E 66, 016129-1–10 (2002).

Chapter 13
Fractals and Kinetic Growth Models

We introduce the concept of fractal dimension and discuss several processes that generate fractal objects.

13.1 The Fractal Dimension

One of the more interesting geometrical properties of objects is their shape. As an example, we show in Figure 13.1 a spanning cluster generated at the percolation threshold. Although the visual description of such a cluster is subjective, the cluster can be described as ramified, airy, tenuous, and stringy rather than compact or space-filling. In the 1970s a new fractal geometry was developed by Mandelbrot and others to describe the characteristics of ramified objects.
One quantitative measure of the structure of these objects is their fractal dimension D. To define D, we first review some simple ideas of dimension in ordinary Euclidean geometry. Consider a circular or spherical object of mass M and radius R. If the radius of the object is increased from R to 2R, the mass of the object is increased by a factor of 2² if the object is circular, or by 2³ if the object is spherical. We can express this relation between mass and the radius, or a characteristic length, as

    M(R) ∼ R^D  (mass dimension),    (13.1)

where D is the dimension of the object. Equation (13.1) implies that if the linear dimensions of an object are increased by a factor of b while preserving its shape, then the mass of the object is increased by b^D. This mass–length scaling relation is closely related to our intuitive understanding of spatial dimension. If the dimension of the object D and the dimension d of the Euclidean space in which the object is embedded are identical, then the mass density ρ = M/R^d scales as

    ρ(R) ∝ M(R)/R^d ∼ R⁰;    (13.2)

that is, its density is constant. An example of a two-dimensional object is shown in Figure 13.2. An object whose mass–length relation satisfies (13.1) with D = d is said to be compact.

Equation (13.1) can be generalized to define the fractal dimension. We denote objects as fractals if they satisfy (13.1) with a value of D different from the spatial dimension d. If an object satisfies (13.1) with D < d, its density is not the same for all R but scales as

    ρ(R) ∝ M/R^d ∼ R^{D−d}.    (13.3)

Figure 13.1: Example of a spanning percolation cluster generated at p = 0.5927 on an L = 124 square lattice. The other occupied sites are not shown.

Because D < d, a fractal object becomes less dense at larger length scales. The scale dependence of the density is a quantitative measure of the ramified or stringy nature of fractal objects. In addition, another characteristic of fractal objects is that they have holes of all sizes. This property follows from (13.3), because if we replace R by bR, where b is some constant, we obtain the same power law dependence for ρ(R). Thus it does not matter what scale of length is used, and hence all hole sizes must be present.

Another important characteristic of fractal objects is that they look the same over a range of length scales. This property of self-similarity or scale invariance means that if we take part of a fractal object and magnify it by the same magnification factor in all directions, the magnified picture is similar to the original. This property follows from the scaling argument given for ρ(R).

The percolation cluster shown in Figure 13.1 is an example of a random or statistical fractal, because the mass–length relation (13.1) is satisfied only on the average, that is, only if the quantity M(R) is averaged over many different origins in a given cluster and over many clusters. In physical systems, the relation (13.1) does not extend over all length scales but is bounded by both upper and lower cutoff lengths. For example, a lower cutoff length is provided by the lattice spacing or the mean distance between the constituents of the object. In computer simulations, the maximum length is usually the finite system size. The presence of these cutoffs complicates the determination of the fractal dimension. In Problem 13.1 we compute the fractal dimension of percolation clusters using straightforward Monte Carlo methods. Remember that data extending over several decades are required to obtain convincing evidence for a power law relationship between M and R and to determine accurate estimates for the fractal dimension. Hence, conclusions based on the limited simulations posed in the problems need to be interpreted with caution.
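The mass–length relation (13.1) translates directly into a measurement: count the occupied sites within a distance r of a chosen origin for a sequence of radii, and fit log M(r) against log r. A sketch of the counting step follows (ours; the occupied[][] array and the square lattice representation are assumptions):

    // Counts the occupied sites within a distance r of the origin (x0, y0);
    // repeating for a sequence of radii and fitting log M(r) to log r
    // estimates the fractal dimension D.
    public static int mass(boolean[][] occupied, int x0, int y0, int r) {
        int L = occupied.length; // assume a square L by L lattice
        int m = 0;
        for(int x = Math.max(0, x0-r); x <= Math.min(L-1, x0+r); x++) {
            for(int y = Math.max(0, y0-r); y <= Math.min(L-1, y0+r); y++) {
                int dx = x-x0, dy = y-y0;
                if(occupied[x][y] && dx*dx+dy*dy <= r*r) m++;
            }
        }
        return m;
    }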
Figure 13.2: The number of dots per unit area in each circle is uniform. How does the total number of dots (mass) vary with the radius of the circle?

Problem 13.1. The fractal dimension of percolation clusters

(a) Generate a site percolation configuration on a square lattice with L ≥ 61 at p = pc ≈ 0.5927. Why might it be necessary to generate several configurations before a spanning cluster is obtained? Does the spanning cluster have many dangling ends?

(b) Choose a point on the spanning cluster and count the number of points M(b) in the spanning cluster within a square of area b² centered about that point. Then double b and count the number of points within the larger box. Can you repeat this procedure indefinitely? Repeat this procedure until you can estimate the b-dependence of the number of points. Use the b-dependence of M(b) to estimate D according to the definition M(b) ∼ b^D; that is, estimate D from a log-log plot of M(b) versus b. Choose another point in the cluster and repeat this procedure. Are your results similar? A better estimate for D can be found by averaging M(b) over several origins in each spanning cluster and averaging over many spanning clusters.

(c) If you have not already done Problem 12.8a, compute D by determining the mean size (mass) M of the spanning cluster at p = pc as a function of the linear dimension L of the lattice. Consider L = 11, 21, 41, and 61, and estimate D from a log-log plot of M versus L.

∗Problem 13.2. Renormalization group calculation of the fractal dimension

Compute ⟨M²⟩, the average of the square of the number of occupied sites in the spanning cluster at p = pc, and the quantity ⟨M′²⟩, the average of the square of the number of occupied sites in the spanning cluster on the renormalized lattice of linear dimension L′ = L/b. Because ⟨M²⟩ ∼ L^{2D} and ⟨M′²⟩ ∼ (L/b)^{2D}, we can obtain D from the relation b^{2D} = ⟨M²⟩/⟨M′²⟩. Choose the length rescaling factor to be b = 2 and adopt the same blocking procedure as was used in Section 12.5. An average over ten spanning clusters for L = 16 and p = 0.5927 is sufficient for qualitative results.

In Problems 13.1 and 13.2 we were interested only in the properties of the spanning clusters. For this purpose, our algorithm for generating percolation configurations by randomly occupying each site is inefficient, because it generates many clusters. A more efficient way of generating single percolation clusters is due independently to Hammersley, Leath, and Alexandrowicz.

Figure 13.3: An example of the growth of a percolation cluster. Sites are occupied with probability p. Occupied sites are represented by a shaded square, growth or perimeter sites are labeled by g, and tested unoccupied sites are labeled by x. Because the seed site is occupied but not tested, we have represented it differently than the other occupied sites. The growth sites are chosen at random.

This algorithm, commonly known as the Leath or the single cluster growth algorithm, is equivalent to the following steps (see Figure 13.3):
1. Occupy a single seed site on the lattice. The nearest neighbors (four on the square lattice) of the seed represent the perimeter sites.

2. For each perimeter site, generate a uniform random number r in the unit interval. If r ≤ p, the site is occupied and added to the cluster; otherwise, the site is not occupied. In order that sites be unoccupied with probability 1 − p, these sites are not tested again.

3. For each site that is occupied, determine if there are any new perimeter sites, that is, untested neighbors. Add the new perimeter sites to the perimeter list.

4. Continue steps 2 and 3 until there are no untested perimeter sites to test for occupancy.

Class SingleCluster implements this algorithm and computes the number of occupied sites within a radius r of the seed particle. The seed site is placed at the center of a square lattice. Two one-dimensional arrays, pxs and pys, store the x and y positions of the perimeter sites. The status of a site is stored in the byte array s, with s(x,y) = (byte) 1 for an occupied site, s(x,y) = (byte) 2 for a perimeter site, s(x,y) = (byte) -1 for a site that has already been tested and not occupied, and s(x,y) = (byte) 0 for an untested and unvisited site. To avoid checking for the boundaries of the lattice, we add extra rows and columns at the boundaries and set these sites equal to (byte) -1. We use a byte array because the array s will be sent to the LatticeFrame class, which uses byte arrays.

Listing 13.1: Class SingleCluster generates and analyzes a single percolation cluster.

package org.opensourcephysics.sip.ch13.cluster;

public class SingleCluster {
    public byte site[][];
    public int[] xs, ys, pxs, pys;
    public int L;
    public double p;     // site occupation probability
    int occupiedNumber;
    int perimeterNumber;
    // displacement x to nearest neighbors
    int nx[] = {1, -1, 0, 0};
    // displacement y to nearest neighbors
    int ny[] = {0, 0, 1, -1};
    // mass of ring, index is distance from center of mass
    double mass[];

    public void initialize() {
        site = new byte[L+2][L+2]; // gives status of each site
        xs = new int[L*L];         // location of occupied sites
        ys = new int[L*L];
        pxs = new int[L*L];        // location of perimeter sites
        pys = new int[L*L];
        for(int i = 0; i < L+2; i++) {
            // ... mark the boundary rows and columns as already tested
        }
        // ... occupy the seed site and create its perimeter sites
    }

    // tests a randomly chosen perimeter site for occupancy
    public void step() {
        if(perimeterNumber > 0) {
            int perimeter = (int) (Math.random()*perimeterNumber);
            int x = pxs[perimeter];
            int y = pys[perimeter];
            perimeterNumber--;
            pxs[perimeter] = pxs[perimeterNumber];
            pys[perimeter] = pys[perimeterNumber];
            if(Math.random() < p) {
                // ... occupy the site and add its untested neighbors
                // to the perimeter list
            } else {
                // ... mark the site as tested and unoccupied
            }
        }
    }
    // ... (methods that accumulate the mass within a distance r of the seed)
}

(d) ... Is a cluster generated at p > pc a fractal?

(e) The fractal dimension of percolation clusters is not an independent exponent but satisfies the scaling relation

    D = d − β/ν,    (13.5)

where β and ν are defined in Table 12.1. The relation (13.5) can be understood by the following finite-size scaling argument. The number of sites in the spanning cluster on a lattice of linear dimension L is given by

    M(L) ∼ P∞(L) L^d,    (13.6)

where P∞ is the probability that an occupied site belongs to the spanning cluster, and L^d is the total number of sites in the lattice. In the limit of an infinite lattice and p near pc, we know that P∞(p) ∼ (p − pc)^β and ξ(p) ∼ (p − pc)^{−ν} independent of L. Hence, for L ∼ ξ, we have P∞(L) ∼ L^{−β/ν} [see (12.11)], and we can write

    M(L) ∼ L^{−β/ν} L^d ∼ L^D.    (13.7)

The relation (13.5) follows.
Use the exact values of β and ν from Table 12.1 to find the exact value of D for d = 2. Is your estimate for D consistent with this value?

(f)∗ Rewrite the SingleCluster class so that the lattice is stored as a one-dimensional array, as is done for class Clusters in Chapter 12.

(g)∗ Estimate the fractal dimension of percolation clusters on a simple cubic lattice. Take pc = 0.3117.

13.2 Regular Fractals

As we have seen, one characteristic of random fractal objects is that they look the same on a range of length scales. To gain a better understanding of the meaning of self-similarity, consider the following example of a regular fractal, a mathematical object that is self-similar on all length scales. Begin with a line one unit long (see Figure 13.5a). Remove the middle third of the line and replace it by two lines of length 1/3 each, so that the curve has a triangular bump in it and the total length of the curve is 4/3 (see Figure 13.5b). In the next stage, each of the segments of length 1/3 is divided into lines of length 1/9, and the procedure is repeated as shown in Figure 13.5c. What is the length of the curve shown in Figure 13.5c? The three stages shown in Figure 13.5 can be extended an infinite number of times. The resulting curve is infinitely long and contains an infinite number of infinitesimally small segments. Such a curve is known as the triadic Koch curve. A Java class that uses a recursive procedure (see Section 6.3) to draw this curve is given in Listing 13.3. Note that method iterate calls itself. Use class KochApp to generate the curves shown in Figure 13.5.

Figure 13.5: The first three stages (a)–(c) of the generation of the self-similar Koch curve. At each stage the displacement of the middle third of each segment is in the direction that increases the area under the curve. The curves were generated using class KochApp. The Koch curve is an example of a continuous curve for which there is no tangent defined at any of its points. The Koch curve is self-similar on each length scale.

Listing 13.3: Class for drawing the Koch curve.

package org.opensourcephysics.sip.ch13;
import java.awt.Graphics;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display.*;

public class KochApp extends AbstractCalculation implements Drawable {
    DisplayFrame frame = new DisplayFrame("Koch Curve");
    int n = 0;

    public KochApp() {
        frame.setPreferredMinMax(-100, 600, -100, 600);
        frame.setSquareAspect(true);
        frame.addDrawable(this);
    }

    public void calculate() {
        n = control.getInt("Number of iterations");
        frame.setVisible(true);
    }

    public void iterate(double x1, double y1, double x2, double y2, int n, DrawingPanel panel, Graphics g) {
        // draw Koch curve using recursion
        if(n > 0) {
            double dx = (x2-x1)/3;
            double dy = (y2-y1)/3;
            double xOneThird = x1+dx; // new end at 1/3 of line segment
            double yOneThird = y1+dy;
            double xTwoThird = x1+2*dx; // new end at 2/3 of line segment
            double yTwoThird = y1+2*dy;
            // rotates line segment (dx, dy) by 60 degrees and adds
            // to (xOneThird, yOneThird)
            double xMidPoint = 0.5*dx-0.866*dy+xOneThird;
            double yMidPoint = 0.5*dy+0.866*dx+yOneThird;
            // each line segment generates 4 new ones
            iterate(x1, y1, xOneThird, yOneThird, n-1, panel, g);
            iterate(xOneThird, yOneThird, xMidPoint, yMidPoint, n-1, panel, g);
            iterate(xMidPoint, yMidPoint, xTwoThird, yTwoThird, n-1, panel, g);
            iterate(xTwoThird, yTwoThird, x2, y2, n-1, panel, g);
        } else {
            int ix1 = panel.xToPix(x1);
            int iy1 = panel.yToPix(y1);
            int ix2 = panel.xToPix(x2);
            int iy2 = panel.yToPix(y2);
            g.drawLine(ix1, iy1, ix2, iy2);
        }
    }

    public void draw(DrawingPanel panel, Graphics g) {
        iterate(0, 0, 500, 0, n, panel, g);
    }

    public void reset() {
        control.setValue("Number of iterations", 3);
    }

    public static void main(String[] args) {
        CalculationControl.createApp(new KochApp());
    }
}
How can we determine the fractal dimension of the Koch and similar mathematical objects? There are several generalizations of the Euclidean dimension that lead naturally to a definition of the fractal dimension (see Section 13.5). Here we consider a definition based on counting boxes. Consider a one-dimensional curve of unit length that has been divided into N equal segments of length ℓ, so that N = 1/ℓ (see Figure 13.6). As ℓ decreases, N increases linearly, which is the expected result for a one-dimensional curve. Similarly, if we divide a two-dimensional square of unit area into N equal subsquares of length ℓ, we have N = 1/ℓ², the expected result for a two-dimensional object (see Figure 13.6). In general, we have N = 1/ℓ^D, where D is the fractal dimension of the object. If we take the logarithm of both sides of this relation, we can express the fractal dimension as

    D = log N / log(1/ℓ)  (box dimension).    (13.8)

Now let us apply this definition to the Koch curve. Each time the length of our measuring unit is reduced by a factor of 3, the number of segments is increased by a factor of 4. If we use the size of each segment as the size of our measuring unit, then at the nth iteration we have N = 4^n and ℓ = (1/3)^n, and the fractal dimension of the triadic Koch curve is given by

    D = log 4^n / log 3^n = (n log 4)/(n log 3) = log 4/log 3 ≈ 1.2619  (triadic Koch curve).    (13.9)

From (13.9) we see that the Koch curve has a fractal dimension between that of a line and a plane. Is this statement consistent with your visual interpretation of the degree to which the triadic Koch curve fills space?

Figure 13.6: Examples of a one-dimensional (d = 1) and a two-dimensional (d = 2) object.

Problem 13.4. The recursive generation of regular fractals

(a) Recursion is used in method iterate in KochApp and is one of the more difficult programming concepts. Explain the nature of recursion and the way it is implemented.

(b) Regular fractals can be generated from a pattern that is used in a self-replicating manner. Write a program to generate the quadric Koch curve shown in Figure 13.7a. What is its fractal dimension?

(c) What is the fractal dimension of the Sierpiński gasket shown in Figure 13.7b? Write a program that generates the next several iterations.

(d) What is the fractal dimension of the Sierpiński carpet shown in Figure 13.7c? How does the fractal dimension of the Sierpiński carpet compare to the fractal dimension of a percolation cluster? Are the two fractals visually similar?

13.3 Kinetic Growth Processes

Many systems in nature exhibit fractal geometry. Fractals have been used to describe the irregular shapes of such varied objects as coastlines, clouds, coral reefs, and the human lung. Why are fractal structures so common?
How do fractal structures form? In this section we discuss several growth models that generate structures showing a remarkable similarity to forms observed in nature. The first two models are already familiar to us and exemplify the flexibility and utility of kinetic growth models.

Epidemic model. In the context of the spread of disease, we usually want to know the conditions for an epidemic. A simple lattice model of the spread of a disease can be formulated as follows. Suppose that an occupied site corresponds to an infected person. Initially there is a single infected person, and the four nearest neighbor sites (on the square lattice) correspond to susceptible people. At the next time step, we visit the four susceptible sites and occupy (infect) each site with probability p. If a susceptible site is not occupied, we say that the site is immune, and we do not test it again. We then find the new susceptible sites and continue until either the disease is controlled or reaches the boundary of the lattice. Convince yourself that this growth model of a disease generates a cluster of infected sites that is identical to a percolation cluster at probability p. The only difference is that we have introduced a discrete time step into the model. Some of the properties of this model are explored in Problem 13.5.

Figure 13.7: (a) The first few iterations of the quadric Koch curve. (b) The first few iterations of the Sierpiński gasket. (c) The first few iterations of the Sierpiński carpet.

Problem 13.5. A simple epidemic model

(a) Explain why the simple epidemic model discussed in the text generates the same clusters as in the percolation model. What is the minimum value of p necessary for an epidemic to occur? Recall that in one time step, all susceptible sites are visited simultaneously and infected with probability p. Determine how n, the number of infected sites, depends on the time t (the number of time steps) for various values of p. A straightforward way to proceed is to modify class SingleCluster so that all susceptible sites are visited and occupied with probability p before new susceptible sites are found. In Chapter 14 we will learn that this model is an example of a cellular automaton.

(b) What are some ways that you could modify the model to make it more realistic? For example, the infected sites might recover after a certain time.

Eden model. An even simpler example of a growth model was proposed by Eden in 1958 to simulate the growth of tumors or a bacterial colony. Although we will find that the resulting mass distribution is not a fractal, the description of the Eden model illustrates the general nature of the fractal growth models we will discuss. Choose a seed site at the center of the lattice for simplicity. The unoccupied nearest neighbors of the occupied sites are the perimeter or growth sites. In the simplest version of the model, a growth site is chosen at random and occupied. The newly occupied site is removed from the list of growth sites, and the new growth sites are added to the list. This process is repeated many times until a large cluster of occupied sites is formed. The difference between this model and the simple epidemic model is that all tested sites are occupied; in other words, no growth sites ever become "immune." Some of the properties of Eden clusters are investigated in Problem 13.6.
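A single growth step of the Eden model is easy to express with the perimeter-list bookkeeping of class SingleCluster (swap the chosen site with the last list entry). The condensed version below is ours; the helper addNewPerimeterSites() is assumed rather than taken from the text:

    // One Eden growth step: every randomly chosen growth site is occupied.
    void edenStep() {
        if(perimeterNumber == 0) return;
        int k = (int) (Math.random()*perimeterNumber); // random growth site
        int x = pxs[k], y = pys[k];
        // remove it from the list by swapping in the last entry
        perimeterNumber--;
        pxs[k] = pxs[perimeterNumber];
        pys[k] = pys[perimeterNumber];
        site[x][y] = (byte) 1;         // occupy it (probability p = 1)
        addNewPerimeterSites(x, y);    // untested neighbors become growth sites
    }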
Figure 13.8: Plot of ln M versus ln r for a single Eden cluster generated on an L = 61 square lattice. A least squares fit from r = 2 to r = 32 yields a slope of approximately 2.01.

Problem 13.6. The Eden model

(a) Modify class SingleCluster so that clusters are generated on a square lattice according to the Eden model. A straightforward procedure is to occupy perimeter sites with probability p = 1. The simulation should be stopped when the cluster just reaches the edge of the lattice. What would happen if we were to occupy perimeter sites indefinitely? Follow the procedure of Problem 13.3 and determine the number of occupied sites M(r) within a distance r of the seed site. Assume that M(r) ∼ r^D for sufficiently large r and estimate D from the slope of a log-log plot of M versus r. A typical log-log plot is shown in Figure 13.8 for L = 61. Can you conclude from your data that Eden clusters are compact?

(b) Modify your program so that only the perimeter or growth sites are shown. Where are the majority of the perimeter sites relative to the center of the cluster? Grow as big a cluster as time permits.

Invasion percolation. A dynamical process known as invasion percolation has been used to model the shape of the oil-water interface that occurs when water is forced into a porous medium containing oil. The goal is to use the water to recover as much oil as possible. In this process a water cluster grows into the oil through the path of least resistance. Consider a lattice of size Lx × Ly, with the water (the invader) initially occupying the left edge (see Figure 13.9). The resistance to the invader is given by assigning to each lattice site a uniformly distributed random number between 0 and 1; these numbers are fixed throughout the invasion. Sites that are nearest neighbors of the invader sites are the perimeter sites. At each time step, the perimeter site with the lowest random number is occupied by the invader, and the oil (the defender) is displaced. The invading cluster grows until a path of occupied sites connects the left and right edges of the lattice. After this path forms, there is no need for the water to occupy any additional sites. To minimize boundary effects, periodic boundary conditions are used for the top and bottom edges, and all quantities are measured over only a central region far from the left and right edges of the lattice.

Figure 13.9: Example of a cluster formed by invasion percolation on a 5 × 3 lattice. The lattice at t = 0 shows the random numbers that have been assigned to the sites. The darkly shaded sites are occupied by the invader, which occupies the perimeter site (lightly shaded) with the smallest random number. The cluster continues to grow until a site in the right-most column is occupied.
Class Invasion implements the invasion percolation algorithm. The two-dimensional array element site[i][j] initially stores a random number for the site at (i,j). If the site at (i,j) is occupied, then site[i][j] is increased by 1. If the site at (i,j) is a perimeter site, then site[i][j] is increased by 2. In this way we know which sites are perimeter sites, and the value of the random number remains associated with the perimeter site. A new perimeter site is inserted into its proper ordered position in the lists perimeterListX and perimeterListY. The perimeter lists are ordered so that the site with the largest random number is at the beginning.

Two search methods are provided for determining the position of a new perimeter site in the perimeter lists. In a linear search we go through the list in order until the random number associated with the new perimeter site is between two random numbers in the list. In a binary search we divide the list in two and determine in which half the new random number belongs. Then we divide this half into half again, and so on, until the correct position is found. The linear and binary search methods are compared in Problem 13.7d. The binary search is the default method used in class Invasion.

The main quantities of interest are the fraction of sites occupied by the invader and the probability P(r)∆r that a site with a random number between r and r + ∆r is occupied. The properties of invasion percolation are explored in Problem 13.7.

Listing 13.4: Class for simulating invasion percolation.

package org.opensourcephysics.sip.ch13.invasion;
import java.awt.Color;
import org.opensourcephysics.frames.*;

public class Invasion {
  public int Lx, Ly;
  public double site[][];
  public int perimeterListX[], perimeterListY[];
  public int numberOfPerimeterSites;
  public boolean ok = true;
  public LatticeFrame lattice;

  public Invasion(LatticeFrame latticeFrame) {
    lattice = latticeFrame;
    lattice.setIndexedColor(0, Color.blue);
    lattice.setIndexedColor(1, Color.black);
  }

  public void initialize() {
    Lx = 2*Ly;
    site = new double[Lx][Ly];
    perimeterListX = new int[Lx*Ly];
    perimeterListY = new int[Lx*Ly];
    for(int y = 0; y<Ly; y++) {    // assign a fixed random number to every site
      for(int x = 0; x<Lx; x++) {
        site[x][y] = Math.random();
      }
    }
    numberOfPerimeterSites = 0;
    for(int y = 0; y<Ly; y++) {    // invader initially occupies the left edge
      site[0][y] += 1;             // occupied site
      lattice.setValue(0, y, 1);
      site[1][y] += 2;             // second column sites are perimeter sites
      numberOfPerimeterSites++;
      insert(1, y);                // inserts site in perimeter list in order
    }
    ok = true;
  }
  public void insert(int x, int y) {
    int insertionLocation = binarySearch(x, y);
    for(int i = numberOfPerimeterSites-1; i>insertionLocation; i--) {
      perimeterListX[i] = perimeterListX[i-1];
      perimeterListY[i] = perimeterListY[i-1];
    }
    perimeterListX[insertionLocation] = x;
    perimeterListY[insertionLocation] = y;
  }

  public int binarySearch(int x, int y) {
    int firstLocation = 0;
    int lastLocation = numberOfPerimeterSites-2;
    if(lastLocation<0) {
      lastLocation = 0;
    }
    int middleLocation = (firstLocation+lastLocation)/2;
    // determine which half of list new number is in
    while(lastLocation-firstLocation>1) {
      int middleX = perimeterListX[middleLocation];
      int middleY = perimeterListY[middleLocation];
      if(site[x][y]>site[middleX][middleY]) {
        lastLocation = middleLocation;
      } else {
        firstLocation = middleLocation;
      }
      middleLocation = (firstLocation+lastLocation)/2;
    }
    return lastLocation;
  }

  // goes in order looking for location to insert
  public int linearSearch(int x, int y) {
    if(numberOfPerimeterSites==1) {
      return 0;
    } else {
      for(int i = 0; i<numberOfPerimeterSites-1; i++) {
        if(site[x][y]>site[perimeterListX[i]][perimeterListY[i]]) {
          return i;
        }
      }
    }
    return numberOfPerimeterSites-1;
  }

  public void step() {
    if(ok) {
      int nx[] = {1, -1, 0, 0};
      int ny[] = {0, 0, 1, -1};
      int x = perimeterListX[numberOfPerimeterSites-1];
      int y = perimeterListY[numberOfPerimeterSites-1];
      if(x>Lx-3) { // if cluster gets near the end, stop simulation
        ok = false;
      }
      numberOfPerimeterSites--;
      site[x][y] -= 1;           // perimeter site becomes occupied
      lattice.setValue(x, y, 1);
      for(int i = 0; i<4; i++) { // finds new perimeter sites
        int perimeterX = x+nx[i];
        int perimeterY = (y+ny[i])%Ly;
        if(perimeterY==-1) {
          perimeterY = Ly-1;     // periodic boundary conditions at top and bottom
        }
        if(site[perimeterX][perimeterY]<1) { // new perimeter site
          site[perimeterX][perimeterY] += 2;
          numberOfPerimeterSites++;
          insert(perimeterX, perimeterY);
        }
      }
    }
  }

  public void computeDistribution(PlotFrame data) {
    int numberOfBins = 20;
    int numberOccupied = 0;
    double occupied[] = new double[numberOfBins];
    double number[] = new double[numberOfBins];
    double binSize = 1.0/numberOfBins;
    int minX = Lx/3;             // measure only over the central region
    int maxX = 2*minX;
    for(int x = minX; x<=maxX; x++) {
      for(int y = 0; y<Ly; y++) {
        int bin = (int) ((site[x][y]%1)/binSize); // bin the original random number
        number[bin]++;
        if((site[x][y]>1)&&(site[x][y]<2)) {      // site occupied by the invader
          numberOccupied++;
          occupied[bin]++;
        }
      }
    }
    data.setMessage("Number occupied = "+numberOccupied);
    for(int bin = 0; bin<numberOfBins; bin++) {
      data.append(0, (bin+0.5)*binSize, occupied[bin]/number[bin]);
    }
  }
}

Figure 13.10: The evolution of the probability distribution function Wt(i) for three successive time steps.

Problem 13.8. Random walks on percolation clusters

(a) Compute R²(t), the mean square displacement of a random walker (the ant), averaged over walks on spanning clusters for p > pc on a square lattice. Assume that R²(t) → 4Ds(p)t for p > pc and sufficiently long times. We have denoted the diffusion coefficient by Ds because we are considering random walks only on spanning clusters and are not considering walks on the finite clusters that also exist for p > pc. Generate a cluster at p = 0.7 using the single cluster growth algorithm considered in Problem 13.3. Choose the initial position of the ant to be the seed site and modify your program to observe the motion of the ant on the screen. Use L ≥ 101 and average over at least 100 walkers for t up to 500. Where does the ant spend much of its time? If R²(t) ∝ t, what is Ds(p)/D(p = 1)?
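A minimal sketch of the measurement in part (a) follows. It is not one of the text's classes; it assumes the cluster has already been generated and stored in a boolean array occupied[][] (for example, by the single cluster growth algorithm of Problem 13.3) with the seed at (x0, y0), and all names are illustrative.

import java.util.Random;

public class BlindAntSketch {
  // returns R^2 averaged over walkers after tmax steps on a given cluster
  public static double meanSquareDisplacement(boolean[][] occupied, int x0, int y0,
                                              int tmax, int walkers, Random rng) {
    int L = occupied.length;
    int[] dx = {1, -1, 0, 0}, dy = {0, 0, 1, -1};
    double r2sum = 0;
    for(int w = 0; w<walkers; w++) {
      int x = x0, y = y0;
      for(int t = 0; t<tmax; t++) {
        int i = rng.nextInt(4);              // blind ant: pick one of 4 directions
        int nx = x+dx[i], ny = y+dy[i];
        if(nx>=0 && nx<L && ny>=0 && ny<L && occupied[nx][ny]) {
          x = nx;                            // step only if the new site is occupied
          y = ny;
        }                                    // otherwise the ant stays put
      }
      r2sum += (double) (x-x0)*(x-x0)+(double) (y-y0)*(y-y0);
    }
    return r2sum/walkers;
  }
}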
(b) As in part (a), compute R²(t) for p = 1.0, 0.8, 0.7, 0.65, and 0.62 with L = 101. If time permits, average over several clusters. Make a log-log plot of R²(t) versus t. What is the qualitative t-dependence of R²(t) for relatively short times? Is R²(t) proportional to t for longer times? (Remember that the maximum value of R² is bounded by the finite size of the lattice.) If R²(t) ∝ t, estimate Ds(p). Plot Ds(p)/D(p = 1) as a function of p and discuss its qualitative dependence.

(c) Compute R²(t) for p = 0.4 and confirm that for p < pc, the clusters are finite, R²(t) is bounded, and diffusion is impossible.

(d) Because there is no diffusion for p < pc, we might expect that Ds vanishes as p → pc from above; that is, Ds(p) ∼ (p − pc)^µs for p ≳ pc. Extend your calculations of part (b) to larger L, more walkers (at least 1000), and more values of p near pc, and estimate the dynamical exponent µs.

(e) At p = pc, we might expect R²(t) to exhibit a different type of t-dependence, for example, R²(t) ∼ t^(2/z) for large t. Do you expect the exponent z to be greater or less than two? Do a simulation of R²(t) at p = pc and estimate z. Choose L ≥ 201 and average over several spanning clusters.

(f) The algorithm we have been using corresponds to a "blind" ant because the ant chooses from four outcomes even if some of these outcomes are not possible. In contrast, the "myopic" ant can look ahead and see the number q of nearest neighbor occupied sites. The ant then chooses one of the q possible outcomes and thus always takes a step. Redo the simulations in part (b). Does R²(t) reach its asymptotic linear dependence on t earlier or later compared to the blind ant?

(g)∗ The limitation of the approach we have taken so far is that we have to average R²(t) over different random walks on a given cluster and also average over different clusters. A more efficient way of treating random walks on a random lattice is to use an exact enumeration approach and to consider all possible walks on a given cluster. The idea of the exact enumeration method is that Wt+1(i), the probability that the ant is at site i at time t + 1, is determined solely by the probabilities of the ant being at the neighbors of site i at time t. Store the positions of the occupied sites in an array and introduce two arrays corresponding to Wt+1(i) and Wt(i) for all sites i in the cluster. Use the probabilities Wt(i) to obtain Wt+1(i) (see Figure 13.10). Spatial averages such as the mean square displacement can be calculated from the probability distribution function at different times. The details of the method and the results are discussed in Majid et al., who used walks of 5000 steps on clusters with ∼ 10³ sites and averaged their results over 1000 different clusters.

(h)∗ Another reason for the interest in diffusion in disordered media is that the diffusion coefficient is proportional to the electrical conductivity of the medium. One of Einstein's many contributions was to show that the mobility, the ratio of the mean velocity of the particles in a system to an applied force, is proportional to the self-diffusion coefficient in the absence of the applied force (see Reif). For a system of charged particles, the mean velocity of the particles is proportional to the electrical current, and the applied force is proportional to the voltage.
Hence, the mobility and the electrical conductivity are proportional, and the conductivity is proportional to the self-diffusion coefficient. The electrical conductivity σ vanishes near the percolation threshold as σ ∼ (p − pc)^µ with µ ≈ 1.30 (see Section 12.1). The difficulty of doing a direct Monte Carlo calculation of σ was considered in Project 12.18. We measured the self-diffusion coefficient Ds by always placing the ant on a spanning cluster rather than on any cluster. In contrast, the conductivity is measured for the entire system, including all finite clusters. Hence, the self-diffusion coefficient D that enters into the Einstein relation should be determined by placing the ant at random anywhere on the lattice, including sites that belong to the spanning cluster and sites that belong to the many finite clusters. Because only those ants that start on the spanning cluster can contribute to D, D is related to Ds by D = P∞Ds, where P∞ is the probability that the ant would land on a spanning cluster. Because P∞ scales as P∞ ∼ (p − pc)^β, we have that (p − pc)^µ ∼ (p − pc)^β (p − pc)^µs, or µ = µs + β. Use your result for µs found in part (d) and the exact result β = 5/36 (see Table 12.1) to estimate µ, and compare your result to the critical exponent µ for the dc electrical conductivity.

(i)∗ We can also derive the scaling relation z = 2 + µs/ν = 2 + (µ − β)/ν, where z is defined in part (e). Is it easier to determine µs or z accurately from a Monte Carlo simulation on a finite lattice? That is, if your real interest is estimating the best value of the critical exponent µ for the conductivity, should you determine the conductivity directly, or should you measure the self-diffusion coefficient at p = pc or at p > pc? What is your best estimate of the conductivity exponent µ?

Diffusion limited aggregation. Many objects in nature grow by the random addition of subunits. Examples include snowflakes, lightning, crack formation along a geological fault, and the growth of bacterial colonies. Although it might seem unlikely that such phenomena have much in common, the behavior observed in many models gives us clues that these and many other natural phenomena can be understood in terms of a few unifying principles. A popular model that is a good example of how random motion can give rise to beautiful self-similar clusters is known as diffusion limited aggregation or DLA. The first step is to occupy a site with a seed particle. Next, a particle is released at random from a point on the circumference of a large circle whose center coincides with the seed. The particle undergoes a random walk until it reaches a perimeter site of the seed and sticks. Then another random walker is released from the circumference of a large circle and walks until it reaches a perimeter site of one of the two particles in the cluster and sticks. This process is repeated many times (typically on the order of several thousand to several million) until a large cluster is formed. A typical DLA cluster is shown in Figure 13.11. Some of the properties of DLA clusters are explored in Problem 13.9.

Figure 13.11: A DLA cluster of 4284 particles on a square lattice with L = 300.

The following class provides a reasonably efficient simulation of DLA. Walkers begin just outside a circle of radius startRadius enclosing the existing cluster and centered at the seed site. If the walker moves away from the cluster, the step size for the random walker increases.
If the walker wanders too far away (farther than maxRadius), the walk is restarted.

Listing 13.5: Class for simulating diffusion limited aggregation.

package org.opensourcephysics.sip.ch13;
import java.awt.Color;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.LatticeFrame;

public class DLAApp extends AbstractSimulation {
  LatticeFrame latticeFrame = new LatticeFrame("DLA");
  byte s[][];                   // lattice on which cluster lives
  int xOccupied[], yOccupied[]; // location of occupied sites
  int L;                        // linear dimension of lattice
  int halfL;                    // L/2
  int ringSize;                 // ring size in which walkers can move
  int numberOfParticles;        // number of particles in cluster
  int startRadius;              // radius of cluster at which walkers are started
  int maxRadius;                // maximum radius walker can go before a new walk is started

  public void initialize() {
    latticeFrame.setMessage(null);
    numberOfParticles = 1;
    L = control.getInt("lattice size");
    startRadius = 3;
    halfL = L/2;
    ringSize = L/10;
    maxRadius = startRadius+ringSize;
    s = new byte[L][L];
    s[halfL][halfL] = Byte.MAX_VALUE; // seed particle at the center
    latticeFrame.setAll(s);
  }

  public void reset() {
    latticeFrame.setIndexedColor(0, Color.BLACK);
    control.setValue("lattice size", 300);
    setStepsPerDisplay(100);
    enableStepsPerDisplay(true);
    initialize();
  }

  public void stopRunning() {
    control.println("Number of particles = "+numberOfParticles);
    // add code to compute the mass distribution here
  }

  public void doStep() {
    int x = 0, y = 0;
    if(startRadius<halfL) {
      do { // launch walkers from the starting circle until one sticks
        double theta = 2*Math.PI*Math.random();
        x = halfL+(int) (startRadius*Math.cos(theta));
        y = halfL+(int) (startRadius*Math.sin(theta));
      } while(walk(x, y)); // walk returns true if the walk must be restarted
    }
    if(startRadius>=halfL) { // stop the simulation
      control.calculationDone("Done");
      latticeFrame.setMessage("Done");
    }
    latticeFrame.setMessage("n = "+numberOfParticles);
  }

  public boolean walk(int x, int y) {
    do {
      double rSquared = (x-halfL)*(x-halfL)+(y-halfL)*(y-halfL);
      int r = 1+(int) Math.sqrt(rSquared);
      if(r>maxRadius) {
        return true; // start new walker
      }
      if((r<maxRadius)&&(s[x+1][y]+s[x-1][y]+s[x][y+1]+s[x][y-1]>0)) {
        // walker has reached a perimeter site of the cluster and sticks
        numberOfParticles++;
        s[x][y] = 1;
        latticeFrame.setValue(x, y, Byte.MAX_VALUE);
        if(r>=startRadius) {
          startRadius = r+2; // enlarge starting circle as the cluster grows
        }
        maxRadius = startRadius+ringSize;
        return false; // walk is finished
      } else { // take a step
        // select direction randomly
        switch((int) (4*Math.random())) {
        case 0 :
          x++;
          break;
        case 1 :
          x--;
          break;
        case 2 :
          y++;
          break;
        case 3 :
          y--;
        }
      }
    } while(true);
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new DLAApp());
  }
}

Problem 13.9. Diffusion limited aggregation

(a) DLAApp generates diffusion limited aggregation clusters on a square lattice. Each walker begins at a random site on a launching circle of radius r = Rmax + 2, where Rmax is the maximum distance of any particle in the cluster from the origin. To save computer time, we remove a walker that reaches a distance 2Rmax from the seed site and place a new walker at random on the circle of radius r. If the clusters appear to be fractals, make a visual estimate of the fractal dimension. Choose a lattice of linear dimension L ≥ 61. (Experts can make a visual estimate of D to within a few percent.)
Modify DLAApp by color coding the sites in the cluster according to their time of arrival; for example, color the first group of sites white, the next group blue, the next group red, and the last group green. (Your choice of the size of each group depends in part on the total size of your cluster.) Which parts of the cluster grow faster? Do any of the late arriving green particles reach the center?

(b) At t = 0, the four perimeter (growth) sites on the square lattice each have a probability pi = 1/4 of becoming part of the cluster. At t = 1, the cluster has mass two and six perimeter sites. Identify the perimeter sites and convince yourself that their growth probabilities are not the same. Do a Monte Carlo simulation and verify that two perimeter sites have growth probabilities p = 2/9 and the other four have p = 5/36. We discuss a more direct way of determining the growth probabilities in Problem 13.10.

(c) DLAApp generates clusters inefficiently because most of the CPU time is spent while the random walker is wandering far from the perimeter sites of the cluster. There are several ways of making your program more efficient. One way is to let the walker take bigger steps the farther it is from the cluster. For example, if the walker is a distance R > Rmax, a step of length greater than or equal to R − Rmax − 1 may be permitted if this distance is greater than one lattice unit. If the walker is very close to the cluster, the step length is one lattice unit. Make this modification to class DLAApp and estimate the fractal dimension of diffusion limited clusters generated on a square lattice by computing M(r), the number of sites in the cluster within a radius r centered at the seed site (a sketch of such a computation appears after this problem). Because very large clusters are needed to accurately estimate the fractal dimension, you will obtain only approximate results. Other possible modifications to the implementation of the algorithm are discussed in Project 13.17 and by Meakin (see references).

(d)∗ Each time we grow a DLA cluster (and other clusters in which a perimeter site is selected at random), we obtain a slightly different cluster if we use a different random number sequence. One way of reducing this "noise" is to use "noise reduction"; that is, a perimeter site is not occupied until it has been visited m times. Each time the random walker lands on a perimeter site, the number of visits for this site is increased by one until the number of visits equals m and the site is occupied. The idea is that noise reduction accelerates the approach to the asymptotic scaling behavior. Consider m = 2, 3, 4, and 5 and grow DLA clusters on the square lattice. Are there any qualitative differences between the clusters for different values of m?

(e)∗ In Chapter 12 we found that the exponents describing the percolation transition are independent of the symmetry of the lattice; for example, the exponents for the square and triangular lattices are the same. We might expect that the fractal dimension of DLA clusters would also show such universal behavior. However, the presence of a lattice introduces a small anisotropy that becomes apparent only when very large clusters with the order of 10^6 sites are grown. Modify your program so that DLA clusters are generated on a triangular lattice. Do the clusters have the same visual appearance as on the square lattice? Estimate the fractal dimension and compare your estimate to your result for the square lattice. The best estimates of D for the square and triangular lattices are D ≈ 1.5 and D ≈ 1.71, respectively. We are reminded of the difficulty of extrapolating the asymptotic behavior from finite clusters. We consider the growth of diffusion limited aggregation clusters in the continuum in Project 13.16.
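The mass distribution M(r) needed in part (c), and flagged by the comment in stopRunning, can be tabulated directly from the lattice. The following is a minimal sketch, not the text's code; it assumes the cluster is stored as in DLAApp, in a byte array s[x][y] with nonzero entries for occupied sites and the seed at (xc, yc).

public class MassDistributionSketch {
  // M(r): number of occupied sites within distance r of the seed at (xc, yc)
  public static int[] massDistribution(byte[][] s, int xc, int yc, int rmax) {
    int[] mass = new int[rmax+1];
    for(int x = 0; x<s.length; x++) {
      for(int y = 0; y<s[x].length; y++) {
        if(s[x][y]!=0) {
          int r = (int) Math.sqrt((double) (x-xc)*(x-xc)+(double) (y-yc)*(y-yc));
          if(r<=rmax) {
            mass[r]++;          // count sites in the shell between r and r+1
          }
        }
      }
    }
    for(int r = 1; r<=rmax; r++) {
      mass[r] += mass[r-1];     // cumulative count gives M(r)
    }
    return mass;                // slope of log M(r) versus log r estimates D
  }
}

A call to this method could be added to stopRunning, with the resulting M(r) values plotted on a log-log scale to extract D from the slope.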
∗Laplacian growth model. As we discussed in Section 10.6, we can formulate the solution of Laplace's equation in terms of a random walk. We now do the converse and formulate the DLA algorithm in terms of a solution to Laplace's equation. Consider the probability P(r) that a random walker reaches a site r starting from the external boundary. This probability satisfies the relation

P(r) = (1/4) Σ_a P(r + a),  (13.10)

where the sum in (13.10) is over the four nearest neighbor sites a (on a square lattice). If we set P = 1 on the boundary and P = 0 on the cluster, then (13.10) also applies to sites that are neighbors of the external boundary and the cluster. A comparison of the form of (13.10) with the form of (10.12) shows that the former is a discrete version of Laplace's equation ∇²P = 0. Hence, P(r) has the same behavior as the electrical potential between two electrodes connected to the outer boundary and the cluster, and the growth probability at a perimeter site of the cluster is proportional to the value of the potential at that site.

∗Problem 13.10. Laplacian growth models

(a) Solve the discrete Laplace equation (13.10) by hand for the growth probabilities of a DLA cluster of mass 1, 2, and 3. Set P = 1 on the boundary and P = 0 on the cluster. Compare your results to your results in Problem 13.9b for mass 1 and 2.

(b) You are probably familiar with the random nature of electrical discharge patterns that occur in atmospheric lightning. Although this phenomenon, known as dielectric breakdown, is complicated, we will see that a simple model leads to discharge patterns that are similar to those that are observed in nature. Because lightning occurs in an inhomogeneous medium with differences in the density, humidity, and conductivity of air, we will develop a model of electrical discharge in an inhomogeneous insulator. We know that when an electrical discharge occurs, the electrical potential φ satisfies Laplace's equation ∇²φ = 0. One version of the model (see Family et al.) is specified by the following steps:

(i) Consider a large boundary circle of radius R and place a charge source at the origin. Choose the potential φ = 0 at the origin (an occupied site) and φ = 1 for sites on the circumference of the circle. The radius R should be larger than the radius of the growing pattern.

(ii) Use the relaxation method (see Section 10.5) to compute the values of the potential φi for (empty) sites within the circle.

(iii) Assign a random number r to each empty site within the boundary circle. The random number ri at site i represents a breakdown coefficient and the random inhomogeneous nature of the insulator.

(iv) The growth sites are the nearest neighbor sites of the discharge pattern (the occupied sites). Form the product ri φi^a for each growth site i, where a is an adjustable parameter. Because the potential for the discharge pattern is zero, φi for growth site i can be interpreted as the magnitude of the potential gradient at site i.

(v) The perimeter site with the maximum value of the product r φ^a breaks down; that is, set φ for this site equal to zero.

(vi) Use the relaxation method to recompute the values of the potential at the remaining unoccupied sites and repeat steps (iv) and (v). (A minimal sketch of this growth loop follows the list.)
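The following sketch, under stated simplifying assumptions, illustrates steps (ii)-(v). It is not the text's program: it replaces the boundary circle of step (i) with the fixed edge of a square grid, uses a fixed number of Jacobi-style relaxation sweeps rather than iterating to convergence, and all parameter names are illustrative.

import java.util.Random;

public class BreakdownSketch {
  public static void main(String[] args) {
    int L = 41, sitesToGrow = 100, sweeps = 100;
    double a = 0.25;                                // tunable exponent
    double[][] phi = new double[L][L];              // electrical potential
    double[][] r = new double[L][L];                // quenched breakdown coefficients
    boolean[][] cluster = new boolean[L][L];
    Random rng = new Random();
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        r[i][j] = rng.nextDouble();                 // step (iii)
        phi[i][j] = 1.0;                            // phi = 1 on the outer boundary
      }
    }
    cluster[L/2][L/2] = true;                       // step (i): charge source at center
    phi[L/2][L/2] = 0;
    int[] dx = {1, -1, 0, 0}, dy = {0, 0, 1, -1};
    for(int n = 0; n<sitesToGrow; n++) {
      for(int sweep = 0; sweep<sweeps; sweep++) {   // step (ii): relaxation method
        for(int i = 1; i<L-1; i++) {
          for(int j = 1; j<L-1; j++) {
            if(!cluster[i][j]) {
              phi[i][j] = 0.25*(phi[i+1][j]+phi[i-1][j]+phi[i][j+1]+phi[i][j-1]);
            }
          }
        }
      }
      double best = -1;                             // steps (iv)-(v): maximize r*phi^a
      int bx = 0, by = 0;
      for(int i = 1; i<L-1; i++) {
        for(int j = 1; j<L-1; j++) {
          if(!cluster[i][j]) {
            boolean growthSite = false;
            for(int k = 0; k<4; k++) {
              growthSite = growthSite || cluster[i+dx[k]][j+dy[k]];
            }
            if(growthSite && r[i][j]*Math.pow(phi[i][j], a)>best) {
              best = r[i][j]*Math.pow(phi[i][j], a);
              bx = i;
              by = j;
            }
          }
        }
      }
      cluster[bx][by] = true;                       // breakdown: site joins the pattern
      phi[bx][by] = 0;
    }
    System.out.println("pattern contains "+(sitesToGrow+1)+" sites");
  }
}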
Choose a = 1/4 and analyze the structure of the discharge pattern. Does the pattern appear qualitatively similar to lightning? Does the pattern appear to have a fractal geometry? Estimate the fractal dimension by counting M(b), the average number of sites belonging to the discharge pattern that are within a b × b box. Consider other values of a, for example, a = 1/6 and a = 1/3, and show that the patterns have a fractal structure with a tunable fractal dimension that depends on the parameter a. Published results (Family et al.) are for patterns with 800 occupied sites.

(c) Another version of the dielectric breakdown model associates a growth probability pi = φi^a/Σj φj^a with each growth site i, where the sum is over all the growth sites. One of the growth sites is occupied with probability pi. That is, choose a growth site at random and generate a random number r between 0 and 1. If r ≤ pi, the growth site i is occupied. As before, the exponent a is a free parameter. Convince yourself that a = 1 corresponds to diffusion limited aggregation. (The boundary condition used in the latter corresponds to a zero potential at the growth sites.) To what type of cluster does a = 0 correspond? Consider a = 1/2, 1, and 2 and explore the dependence of the visual appearance of the clusters on a. Estimate the fractal dimension of the clusters.

(d) Consider a deterministic growth model for which all growth sites are tested for occupancy at each growth step. Adopt the same geometry and boundary conditions as in part (b) and use the relaxation method to solve Laplace's equation for φi. Then find the perimeter site with the largest value of φ and set φmax equal to this value. Only those perimeter sites for which the ratio φi/φmax is larger than a parameter p become part of the cluster; φi is set equal to unity for these sites. After each growth step, the new growth sites are determined and the relaxation method is used to recompute the values of φi at each unoccupied site. Choose p = 0.35 and determine the nature of the regular fractal pattern. What is the fractal dimension? Consider other values of p and determine the corresponding fractal dimension. These patterns have been termed Laplace fractal carpets (see Family et al.).

Surface growth models. The fractal objects we have discussed so far are self-similar; that is, if we look at a small piece of the object and magnify it isotropically to the size of the original, the original and the magnified object look similar (on the average). In the following, we introduce some simple models that generate a class of fractals that are self-similar only for scale changes in certain directions.

Suppose that we have a flat surface at time t = 0. How does the surface grow as a result of vapor deposition and sedimentation? For example, consider a surface that is initially a line of L occupied sites. Growth is in the vertical direction only (see Figure 13.12). As before, we simply choose a growth site at random and occupy it (the Eden model again). The average height of the surface is given by

h̄ = (1/Ns) Σ(i=1 to Ns) hi,  (13.11)

where hi is the distance of the ith surface site from the substrate, and the sum is over all Ns surface sites. (The precise definition of a surface site is discussed in Problem 13.11.) Each time a particle is deposited, the time t is increased by unity. Our main interest is how the width of the surface changes with t. We define the width of the surface by

w² = (1/Ns) Σ(i=1 to Ns) (hi − h̄)².  (13.12)

In general, the width w, which is a measure of the surface roughness, depends on L and t.
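A minimal sketch of the computation of (13.11) and (13.12) is given below. It assumes the surface is described by an array of column heights h[i]; for growth models with overhangs, such as the Eden model, h[i] would be taken as the height of the surface site in column i.

public class SurfaceWidthSketch {
  // mean height (13.11) and width (13.12) of a surface with column heights h[i]
  public static double[] heightAndWidth(int[] h) {
    double hbar = 0;
    for(int hi : h) {
      hbar += hi;
    }
    hbar /= h.length;                            // (13.11)
    double w2 = 0;
    for(int hi : h) {
      w2 += (hi-hbar)*(hi-hbar);
    }
    w2 /= h.length;                              // (13.12)
    return new double[] {hbar, Math.sqrt(w2)};   // {mean height, width w}
  }
}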
For short times we expect that

w(L,t) ∼ t^β.  (13.13)

The exponent β describes the growth of the correlations with time along the vertical direction. Figure 13.12 illustrates the evolution of the surface generated according to the Eden model. After a characteristic time, the length over which the fluctuations are correlated becomes comparable to L, and the width reaches a steady-state value that depends only on L. We write

w(L, t ≫ 1) ∼ L^α,  (13.14)

where α is known as the roughness exponent.

Figure 13.12: Example of surface growth according to the Eden model. The surface site in column i is the perimeter site with the maximum value of hi. In the figure the average height of the surface is 20.46 and the width is 2.33.

From (13.14) we see that in the steady state, the width of the surface in the direction perpendicular to the substrate grows as L^α. This scaling behavior of the width is characteristic of a self-affine fractal. Such a fractal is invariant (on the average) under anisotropic scale changes; that is, different scaling relations exist along different directions. For example, if we rescale the surface by a factor b in the horizontal direction, then the surface must be rescaled by a factor of b^α in the direction perpendicular to the surface to preserve the similarity between the original and rescaled surfaces. Note that on short length scales, that is, lengths shorter than the width of the interface, the surface is rough and its roughness can be characterized by the exponent α. (Imagine an ant walking on the surface.) For length scales much larger than the width of the surface, the surface appears to be flat and, in our example, it is a one-dimensional object. The properties of the surfaces generated by several growth models are explored in Problem 13.11.

Problem 13.11. Growing surfaces

(a) In this version of the Eden model a perimeter site is chosen at random and occupied. The growth rule is the same as the usual Eden model, but the growth is started from a line of length L rather than a single site. Hence, there can be "overhangs" as shown in Figure 13.12. Use periodic boundary conditions in the horizontal direction to determine the perimeter sites. The height hi corresponds to the height of column i. Consider L = 64. Describe the visual appearance of the surface as the surface grows. Is the surface well defined visually? Where are most of the perimeter sites?

(b) To estimate the exponents α and β, plot the width w(t) as a function of t for L = 32, 64, and 128 on the same graph. What type of plot is most appropriate? Does the width initially grow as a power law? If so, estimate the exponent β. Is there an L-dependent crossover time after which the width of the surface approaches its steady-state value? How can you estimate the exponent α? The best numerical estimates for β and α are consistent with the exact values β = 1/3 and α = 1/2.

(c)∗ The dependence of w(L,t) on t and L can be combined into the scaling form

w(L,t) ≈ L^α f(t/L^(α/β)),  (13.15)

where

f(x) ≈ A x^β for x ≪ 1 and f(x) ≈ constant for x ≫ 1,  (13.16)
where A is a constant. Verify the existence of the scaling form (13.15) by plotting the ratio w(L,t)/L^α versus t/L^(α/β) for the different values of L considered in part (b). If the scaling form holds, the results for w for the different values of L should fall on a universal curve. Use either the estimated values of α and β that you found in part (b) or the exact values.

(d) The Eden model is not really a surface growth model, because any perimeter site can become part of the cluster. In the simplest random deposition model, a column is chosen at random and a particle is deposited at the top of the column of already deposited particles. There is no horizontal correlation between neighboring columns. Do a simulation of this growth model and visually inspect the surface of the interface. Show that the heights of the columns follow a Poisson distribution [see (7.31)] and that h̄ ∼ t and w ∼ t^(1/2). This structure does not depend on L, and hence α = 0.

(e) In the ballistic deposition model, a column is chosen at random and a particle is assumed to fall vertically until it reaches the first perimeter site that is a nearest neighbor of a site that already is part of the surface. This condition allows for growth parallel to the substrate. Only one particle falls at a time. How do the rules for this growth model differ from those of the Eden model? How does the surface compare to that of the Eden model? Suppose that instead of the particle falling vertically, we let it do a random walk as in DLA. Would the resultant surface be the same?

Figure 13.13: Example of the growth of a surface according to the ballistic deposition model. Note that if column one is chosen, the next site that would be occupied (not shaded) would leave an unoccupied site below it.

13.4 Fractals and Chaos

In Chapter 6 we explored dynamical systems that exhibited chaos under certain conditions. We found that after an initial transient, the trajectory of such a dynamical system consists of a set of points in phase space called an attractor. For chaotic motion this attractor is often an object that can be described as a fractal. Such attractors are called strange attractors.

We first consider the familiar logistic map [see (6.1)], xn+1 = 4rxn(1 − xn). For most values of the control parameter r > r∞ = 0.892486417967..., the trajectories are chaotic. Are these trajectories fractals? To calculate the fractal dimension for dynamical systems, we use the box counting method introduced in Section 13.2 in which space is divided into d-dimensional boxes of length ℓ. Let N(ℓ) equal the number of boxes that contain a piece of the trajectory. The fractal dimension is defined by the relation

N(ℓ) ∼ lim(ℓ→0) ℓ^−D   (box dimension).  (13.17)

Equation (13.17) holds only when the number of boxes is much larger than N(ℓ) and the number of points on the trajectory is sufficiently large. If the trajectory moves through many dimensions, that is, the phase space is very large, box counting becomes too memory intensive because we need an array of size ∝ ℓ^−d. This array becomes very large for small ℓ and large d.

A more efficient approach is to compute the correlation dimension. In this approach we store in an array the positions of N points on the trajectory. We compute the number of points Ni(r), and the fraction of points fi(r) = Ni(r)/(N − 1), within a distance r of the point i. The correlation function C(r) is defined by

C(r) ≡ (1/N) Σi fi(r),  (13.18)

and the correlation dimension Dc is defined by

C(r) ∼ lim(r→0) r^Dc   (correlation dimension).  (13.19)

From (13.19) we see that the slope of a log-log plot of C(r) versus r yields an estimate of the correlation dimension. In practice, small values of r must be discarded because we cannot sample all of the points on the trajectory, and hence there is a cutoff value of r below which C(r) = 0. In the large r limit, C(r) saturates to unity if the trajectory is localized, as it is for chaotic trajectories. We expect that for intermediate values of r, there is a scaling regime where (13.19) holds.
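A minimal sketch of this procedure for scalar data follows; it is not the text's program. It generates a chaotic logistic-map trajectory, computes C(r) of (13.18) by counting pairs of points, and prints ln C versus ln r so that the slope in the scaling regime estimates Dc.

public class CorrelationDimensionSketch {
  // C(r) for one-dimensional data: the fraction of pairs of points closer than r
  public static double correlation(double[] x, double r) {
    long count = 0;
    for(int i = 0; i<x.length; i++) {
      for(int j = i+1; j<x.length; j++) {
        if(Math.abs(x[i]-x[j])<r) {
          count++;
        }
      }
    }
    return 2.0*count/((double) x.length*(x.length-1));
  }

  public static void main(String[] args) {
    int N = 5000;
    double[] x = new double[N];
    double r = 0.9, xn = 0.4;                     // logistic map in the chaotic regime
    for(int i = 0; i<1000; i++) {
      xn = 4*r*xn*(1-xn);                         // discard the initial transient
    }
    for(int i = 0; i<N; i++) {
      xn = 4*r*xn*(1-xn);
      x[i] = xn;
    }
    for(double eps = 0.1; eps>1.0e-4; eps /= 2) { // slope of ln C vs ln eps gives Dc
      double c = correlation(x, eps);
      if(c>0) {
        System.out.println(Math.log(eps)+"\t"+Math.log(c));
      }
    }
  }
}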
In Problems 13.12-13.14, we consider the fractal properties of some of the dynamical systems that we considered in Chapter 6.

Problem 13.12. Strange attractor of the logistic map

(a) Write a program that uses box counting to determine the fractal dimension of the attractor for the logistic map. Compute N(ℓ), the number of boxes of length ℓ that have been visited by the trajectory. Test your program for r < r∞. How does the number of boxes containing a piece of the trajectory change with ℓ? What does this dependence tell you about the dimension of the trajectory for r < r∞?

(b) Compute N(ℓ) for r = 0.9 using at least five different values of ℓ, for example, 1/ℓ = 100, 300, 1000, 3000, . . . . Iterate the map at least 10,000 times before determining N(ℓ). What is the fractal dimension of the attractor? Repeat for r ≈ r∞, r = 0.95, and r = 1.

(c) Generate points at random in the unit interval and estimate the fractal dimension using the same method as in part (b). What do you expect to find? Use your results to estimate the accuracy of the fractal dimension that you found in part (b).

(d) Write a program to compute the correlation dimension for the logistic map and repeat the calculations for parts (b) and (c).

Problem 13.13. Strange attractor of the Hénon map

(a) Use two-dimensional boxes of linear dimension ℓ to estimate the fractal dimension of the strange attractor of the Hénon map [see (6.32)] with a = 1.4 and b = 0.3. Iterate the map at least 1000 times before computing N(ℓ). Does it matter what initial condition you choose?

(b) Compute the correlation dimension for the same parameters used in part (a) and compare Dc with the box dimension computed in part (a).

(c) Iterate the Hénon map and view the trajectory on the screen by plotting xn+1 versus xn in one window and yn versus xn in another window. Do the two ways of viewing the trajectory look similar? Estimate the correlation dimension, where the ith data point is defined by (xi, xi+1) and the distance rij between the ith and jth data points is given by rij² = (xi − xj)² + (xi+1 − xj+1)².

(d) Estimate the correlation dimension with the ith data point defined by xi and rij² = (xi − xj)². What do you expect to obtain for Dc? Repeat the calculation for the ith data point given by (xi, xi+1, xi+2) and rij² = (xi − xj)² + (xi+1 − xj+1)² + (xi+2 − xj+2)². What do you find for Dc?

∗Problem 13.14. Strange attractor of the Lorenz model

(a) Use three-dimensional graphics or three two-dimensional plots of x(t) versus y(t), x(t) versus z(t), and y(t) versus z(t) to view the structure of the Lorenz attractor. Use σ = 10, b = 8/3, r = 28, and the time step ∆t = 0.01. Compute the correlation dimension for the Lorenz attractor.

(b) Repeat the calculation of the correlation dimension using x(t), x(t + τ), and x(t + 2τ) instead of x(t), y(t), and z(t). Choose the delay time τ to be at least ten times greater than the time step ∆t.

(c) Compute the correlation dimension in the two-dimensional space of x(t) and x(t + τ).
Do the same calculation in four dimensions using x(t), x(t + τ), x(t + 2τ), and x(t + 3τ). What can you conclude about the results for the correlation dimension using two-, three-, and four-dimensional spaces? What do you expect to see for d > 4?

Problems 13.13 and 13.14 illustrate a practical method for determining the underlying structure of systems when, for example, the data consist only of a single time series, that is, measurements of a single quantity over time. The dimension Dc(d) computed by increasing the dimension of the space d using the delayed coordinate τ eventually saturates when d is approximately equal to the number of variables that actually determine the dynamics. Hence, if we have extensive data for a single variable, for example, the atmospheric pressure or a stock market index, we can use this method to determine the number of independent variables that determine the dynamics of the variable. This information can then be used to help create models of the dynamics.

13.5 Many Dimensions

So far we have discussed three ways of defining the fractal dimension: the mass dimension (13.1), the box dimension (13.17), and the correlation dimension (13.19). These methods do not always give the same results for the fractal dimension. Indeed, there are many other dimensions that we could compute. For example, instead of just counting the boxes that contain a part of an object, we can count the number of points of the object in each box, ni, and compute pi = ni/N, where N is the total number of points. A generalized dimension Dq can be defined as

Dq = [1/(q − 1)] lim(ℓ→0) ln[Σ(i=1 to N(ℓ)) pi^q]/ln ℓ.  (13.20)

The sum in (13.20) is over all the boxes and involves the probabilities raised to the qth power. For q = 0, we have

D0 = −lim(ℓ→0) ln N(ℓ)/ln ℓ.  (13.21)

If we compare the form of (13.21) with (13.17), we can identify D0 with the box dimension. For q = 1, we need to take the limit of (13.20) as q → 1. Let

u(q) = ln Σi pi^q,  (13.22)

and do a Taylor-series expansion of u(q) about q = 1. We have

u(q) = u(1) + (q − 1) du/dq + ··· .  (13.23)

The quantity u(1) = 0 because Σi pi = 1. The first derivative of u(q) is given by

du/dq = Σi pi^q ln pi / Σi pi^q = Σi pi ln pi,  (13.24)

where the last equality follows by setting q = 1. If we use the above relations, we find that D1 is given by

D1 = lim(ℓ→0) Σi pi ln pi / ln ℓ   (information dimension).  (13.25)

D1 is called the information dimension because of the similarity of the p ln p term in the numerator of (13.24) to the information form of the entropy. It is possible to show that D2 as defined by (13.20) is the same as the mass dimension defined in (13.1) and the correlation dimension Dc. That is, box counting gives D0 and correlation functions give D2 (cf. Sander et al.).

There are many objects in nature that differ in appearance but have similar fractal dimension. An example is the different visual appearance in three dimensions of diffusion limited aggregation clusters and percolation clusters at the percolation threshold. (Both objects have a fractal dimension of approximately 2.5.) In some cases this difference can be accounted for by the multifractal properties of an object. For multifractals the various Dq are different, in contrast to monofractals for which the different measures are the same. Percolation clusters are an example of a monofractal because pi ∼ ℓ^D0, the number of boxes N(ℓ) ∼ ℓ^−D0, and from (13.20) Dq = D0 for all q. Multifractals occur when the growth quantities are not the same throughout the object, as frequently happens for the strange attractors produced by chaotic dynamics. Diffusion limited aggregation is an example of a multifractal.
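The sums that enter (13.20)-(13.25) are straightforward to compute once the box probabilities pi are known. The following minimal sketch, not the text's code, assumes the data lie in the unit interval (for example, iterates of the logistic map) and returns the sum needed for Dq, or the Σ pi ln pi sum of (13.24)-(13.25) when q = 1.

public class GeneralizedDimensionSketch {
  // box probabilities p_i for boxes of size ell and the sums in (13.20)-(13.25)
  public static double generalizedSum(double[] x, double ell, double q) {
    int nBoxes = (int) Math.ceil(1.0/ell);
    int[] n = new int[nBoxes];
    for(double xi : x) {
      int box = (int) (xi/ell);
      n[Math.min(box, nBoxes-1)]++;            // count points in each box
    }
    double sum = 0;
    for(int ni : n) {
      if(ni>0) {                               // empty boxes do not contribute
        double p = ((double) ni)/x.length;
        sum += (q==1) ? p*Math.log(p) : Math.pow(p, q);
      }
    }
    return sum; // Dq ≈ ln(sum)/((q-1) ln ell) for q != 1, and D1 ≈ sum/ln ell
  }
}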
13.6 Projects

Although the kinetic growth models we have considered yield beautiful pictures, there is much we do not understand. For example, the fractal dimension of DLA clusters can be calculated only by approximate theories whose accuracy is unknown. Why do the fractal dimensions have the values that we estimated by various simulations? Can we trust our numerical estimates of the various exponents, or is it necessary to consider much larger systems to obtain their true asymptotic values? Can we find unifying features for the many kinetic growth models that presently exist? What is the relation of the various kinetic growth models to physical systems? What are the essential quantities needed to characterize the geometry of an object?

One of the reasons that kinetic growth models are difficult to understand is that the final cluster typically depends on the history of the growth. We say that these models are examples of "nonequilibrium behavior." The combination of simplicity, beauty, complexity, and relevance to many experimental systems suggests that the study of fractal objects will continue to involve a wide range of workers in many disciplines.

Project 13.15. The percolation cluster size distribution

Use the Leath algorithm to determine the critical exponent τ of the cluster size distribution ns for percolation clusters at p = pc:

ns ∼ s^−τ.  (s ≫ 1)  (13.26)

Modify class SingleCluster so that many clusters are generated and ns is computed for a given probability p. Remember that the number of clusters of size s that are grown from a seed is the product s ns, rather than ns itself (see Problem 13.3a). Grow at least 100 clusters on a square lattice with L ≥ 61. If time permits, use bigger lattices and average over more clusters, and also estimate the accuracy of your estimate of τ. See Grassberger for a discussion of an extension of this approach to estimating the value of pc in higher dimensions.

Project 13.16. Continuum DLA

(a) In the continuum (off-lattice) version of diffusion limited aggregation, the diffusing particles are assumed to be disks of radius a. A disk executes a random walk until its center is within a distance 2a of the center of a disk that is already a part of the DLA cluster. At each step the walker changes its position by (r cos θ, r sin θ), where r is the step size and θ is a random variable between 0 and 2π. Modify your DLA program or class DLAApp to simulate continuum DLA.

(b) Compare the appearance of a continuum DLA cluster with a DLA cluster generated on a square lattice. It is necessary to grow very large clusters (approximately 10^6 particles) to see the differences.

(c) Use the mass dimension to estimate the fractal dimension of continuum DLA clusters and compare its value with the value you found for the square lattice.

Project 13.17. More efficient simulation of DLA

To improve the efficiency of the algorithm, the walker in class DLAApp is restarted if it wanders too far from the existing cluster. When the walker is within the distance startRadius of the seed, no optimization is used. Because there can be many unoccupied sites within this distance, it is desirable to use an additional optimization technique (see Ball and Brady). The idea is to
choose a simple geometrical object (a circle or square) centered at the walker such that none of the cluster is within the object. The walker moves in one step to a site on the boundary of the object. For a circle the walker can move with equal probability to any location on the circumference. For the square we need the probability of moving to various locations on the boundary. To find the largest object that does not contain a part of the DLA cluster, consider coarse grained lattices. For example, each 2 × 2 group of sites on the original lattice corresponds to one site on the coarser lattice; each 2 × 2 group of sites on the coarse lattice corresponds to a site on an even coarser lattice, etc. If a site is occupied, then any coarse grained site containing this site is also occupied.

(a) Because we have considered DLA clusters on a square lattice, we use squares centered at the walker. We first find the probability p(∆x,∆y,s) that a walker centered on a square of length l = 2s + 1 will be displaced by (∆x,∆y). This probability can be computed by simulating a random walk starting at the origin and ending at a boundary site of the square. Repeat this simulation for many walkers and then for various values of s. The fraction of walkers that reach the position (∆x,∆y) is p(∆x,∆y,s). Determine p(∆x,∆y,s) for s = 1 to 16. Store your results in a file.

(b) We next determine the arrays such that for a given value of s and a uniform random number r, we can quickly find (∆x,∆y). One way to do so is to create four arrays. The first array lists the cumulative probabilities determined from part (a) such that the values for s = 1 are listed first. Call this array p. For example, p[1] = p(−1,−1,1), p[2] = p[1] + p(−1,0,1), p[3] = p[2] + p(−1,1,1), etc. The array start tells us where to start in the array p for each value of s. The arrays dx[i] and dy[i] give the values of ∆x and ∆y corresponding to p[i]. To see how these arrays are used, consider a walker located at (x,y) centered on a square of linear dimension 2s + 1. Generate a random number r and find i = start[s]. If r < p[i], then the walker moves to (x + dx[i], y + dy[i]). If not, increment i by unity and check again. Repeat until r ≤ p[i]. Write a program to create these four arrays and store them in a file.

(c) Write a method to determine the maximum value of the parameter s such that a square of size 2s + 1 centered at the position of the walker does not contain any part of the DLA cluster. Use coarse grained lattices to do this determination more efficiently. Modify class DLAApp to incorporate this method and the arrays defined in part (b). How much faster is your modified program than the original class DLAApp for clusters of 500 and 5000 particles?

(d) What is the largest cluster you can grow on your computer in a reasonable time? Does the cluster show any evidence for anisotropy? For example, does the cluster tend to extend farther along the axes or along any other direction?

Project 13.18. Cluster-cluster aggregation

In DLA all the particles that stick to a cluster are the same size (the growth occurs by the addition of one particle at a time), and the cluster that is formed is motionless. In the following, we consider a cluster-cluster aggregation (CCA) model in which the clusters do a random walk as they aggregate. Suppose we begin with a dilute collection of N particles. Each of these particles is initially a cluster of unit mass and does a random walk until two particles become nearest neighbors.
They then stick together to form a cluster of two particles. This new cluster now moves as a single random walker with a smaller diffusion coefficient. As this process continues, the clusters become larger and fewer in number. For simplicity, we assume a square lattice with periodic boundary conditions. The CCA algorithm can be summarized as follows:

(i) Place N particles at random positions on the lattice. Do not allow a site to be occupied by more than one particle. Identify the ith particle with the ith cluster.

(ii) Check if any two clusters have particles that are nearest neighbors. If so, join these two clusters to form a single cluster.

(iii) Choose a cluster at random. Decide whether to move the cluster as discussed in the following paragraph. If so, move it at random in one of the four possible directions.

(iv) Repeat steps (ii) and (iii) for the desired number of steps or until there is only a single cluster.

What rule should we use to decide whether to move a cluster? One possibility is to select a cluster at random and simply move it. This possibility corresponds to all clusters having the same diffusion coefficient, regardless of their mass. A more realistic rule is to assume that the diffusion coefficient of a cluster is inversely related to its mass s, for example, Ds ∝ s^−x with x > 0. A common assumption is x = 1. If we assume that Ds is inversely proportional to the linear dimension (radius) of the cluster, an assumption consistent with the Stokes-Einstein relation, then x = 1/d, where d is the spatial dimension. However, because the resultant clusters are fractals, we really should take x = 1/D, where D is the fractal dimension of the cluster.

To implement the cluster-cluster aggregation algorithm, we need to store the position of each particle and the cluster to which each particle belongs. In class CCA, which can be downloaded from the ch13 directory, the position of a particle is given by its x- and y-coordinates, stored in the arrays x and y, respectively. The array element site[x][y] equals zero if there is no particle at (x,y); otherwise, the element equals the label of the cluster to which the particle at (x,y) belongs. The labels of the clusters are found as follows. The array element firstParticle[k] gives the particle label of the first particle in cluster k. To determine all the particles in a given cluster, we use a data structure called a linked list. We implement the linked list using the array nextParticle, so that the value of an element of this array is the index for the next element in the linked list. The array nextParticle contains a series of linked lists, one for each cluster, such that nextParticle[i] equals the particle label of another particle in the same cluster as particle i. If nextParticle[i] = -1, there are no more particles in the cluster. To see how these arrays work, consider three particles 5, 9, and 16, which constitute cluster 4. We have firstParticle[4] = 5, nextParticle[5] = 9, nextParticle[9] = 16, and nextParticle[16] = -1.

As the clusters undergo a random walk, we need to check if any pair of particles in different clusters have become nearest neighbors. If such a situation occurs, their respective clusters have to be merged. The check for nearest neighbors is done in method checkNeighbors. If site[x][y] and site[x+1][y] are both nonzero and are not equal, then the two clusters associated with these sites need to be combined.
To do so, we add the particles of the smaller cluster to those of the larger cluster. We use another array, lastParticle, to keep track of the last particle in a cluster. The merger can be accomplished by the following statements:

// link last particle of larger cluster to first particle of smaller cluster
nextParticle[lastParticle[largerClusterLabel]] = firstParticle[smallerClusterLabel];
// set the last particle of larger cluster to the last particle of smaller cluster
lastParticle[largerClusterLabel] = lastParticle[smallerClusterLabel];
// add mass of smaller cluster to the larger cluster
mass[largerClusterLabel] += mass[smallerClusterLabel];

To complete the merger, all the entries in site[x][y] corresponding to the smaller cluster are relabeled with the label for the larger cluster, and the last cluster in the list is relabeled by the label of the small cluster, so that if there are n clusters they are labeled by 0, 1, ..., n − 1.

(a) Write a target class for class CCA. The class assumes that the diffusion coefficient is independent of the cluster mass. Choose L = 50 and N = 500 and describe the qualitative appearance of the clusters as they form. Do they appear to be fractals? Compare their appearance to DLA clusters.

(b) Compute the fractal dimension of the final cluster. Use the center of mass rcm as the origin of the cluster, where rcm = (1/N)(Σi xi, Σi yi) and (xi, yi) is the position of the ith particle. Average your results over at least ten final clusters. Do the same for other values of L and N. Are the clusters formed by cluster-cluster aggregation more or less space filling than DLA clusters?

(c) Assume that the diffusion coefficient of a cluster of s particles varies as Ds ∝ s^(−1/2) in two dimensions. Let Dmax be the diffusion coefficient of the largest cluster. Choose a random number r between 0 and 1 and move the cluster if r < Ds/Dmax. Repeat the simulations in part (a) and discuss any changes in your results. What effect does the dependence of D on s have on the motion of the clusters?

References and Suggestions for Further Reading

We have considered only a few of the models that lead to self-similar patterns. Use your imagination to design your own model of real-world growth processes. We encourage you to read the research literature and the many books on fractals.

R. C. Ball and R. M. Brady, "Large scale lattice effect in diffusion-limited aggregation," J. Phys. A 18, L809-L813 (1985). The authors discuss the optimization algorithm used in Project 13.17.

Albert-László Barabási and H. Eugene Stanley, Fractal Concepts in Surface Growth, Cambridge University Press (1995).

J. B. Bassingthwaighte, L. S. Liebovitch, and B. J. West, Fractal Physiology, Oxford University Press (1994).

D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems, Cambridge University Press (2005).

K. S. Birdi, Fractals in Chemistry, Geochemistry, and Biophysics, Plenum Press (1993).

Armin Bunde and Shlomo Havlin, editors, Fractals and Disordered Systems, revised edition, Springer-Verlag (1996).

Fereydoon Family and David P. Landau, editors, Kinetics of Aggregation and Gelation, North-Holland (1984). A collection of research papers that give a wealth of information, pictures, and references on a variety of growth models.
Fereydoon Family, Daniel E. Platt, and Tamás Vicsek, “Deterministic growth model of pattern formation in dendritic solidification,” J. Phys. A 20, L1177–L1183 (1987). The authors discuss the nature of Laplace fractal carpets.

Fereydoon Family and Tamás Vicsek, editors, Dynamics of Fractal Surfaces, World Scientific (1991). A collection of reprints.

Fereydoon Family, Y. C. Zhang, and Tamás Vicsek, “Invasion percolation in an external field: Dielectric breakdown in random media,” J. Phys. A 19, L733–L737 (1986).

Jens Feder, Fractals, Plenum Press (1988). This text discusses the applications as well as the mathematics of fractals.

Gary William Flake, The Computational Beauty of Nature, MIT Press (2000).

J.-M. Garcia-Ruiz, E. Louis, P. Meakin, and L. M. Sander, editors, Growth Patterns in Physical Sciences and Biology, NATO ASI Series B304, Plenum (1993).

Peter Grassberger, “Critical percolation in high dimensions,” Phys. Rev. E 67, 036101 (2003). The author uses the Leath algorithm to estimate the value of pc.

Thomas C. Halsey, “Diffusion limited aggregation: A model for pattern formation,” Physics Today 53 (11), 36 (2000).

J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods, Methuen (1964). The chapter on percolation processes discusses a growth algorithm for percolation.

H. J. Herrmann, “Geometrical cluster growth models and kinetic gelation,” Physics Reports 136, 153–224 (1986).

Robert C. Hilborn, Chaos and Nonlinear Dynamics, second edition, Oxford University Press (2000).

Ofer Malcai, Daniel A. Lidar, Ofer Biham, and David Avnir, “Scaling range and cutoffs in empirical fractals,” Phys. Rev. E 56, 2817–2828 (1997). The authors show that experimental reports of fractal behavior are typically based on a scaling range that spans only 0.5–2 decades and discuss the possible implications of this limited scaling range.

Benoit B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman (1983). An influential and beautifully illustrated book on fractals.

Imtiaz Majid, Daniel Ben-Avraham, Shlomo Havlin, and H. Eugene Stanley, “Exact-enumeration approach to random walks on percolation clusters in two dimensions,” Phys. Rev. B 30, 1626 (1984).

Paul Meakin, Fractals, Scaling and Growth Far From Equilibrium, Cambridge University Press (1998). Also see P. Meakin, “The growth of rough surfaces and interfaces,” Physics Reports 235, 189–289 (1993). The author has written many seminal articles on DLA and similar models.

L. Niemeyer, L. Pietronero, and H. J. Wiesmann, “Fractal dimension of dielectric breakdown,” Phys. Rev. Lett. 52, 1033 (1984).

H. O. Peitgen and P. H. Richter, The Beauty of Fractals, Springer-Verlag (1986).

Luciano Pietronero and Erio Tosatti, editors, Fractals in Physics, North-Holland (1986). A collection of research papers, many of which are accessible to the motivated reader.

Raissa M. D’Souza, “Anomalies in simulations of nearest neighbor ballistic deposition,” Int. J. Mod. Phys. C 8 (4), 941–951 (1997). The author finds that ballistic deposition is a sensitive physical test for correlations present in pseudorandom sequences.

F. Reif, Fundamentals of Statistical and Thermal Physics, McGraw-Hill (1965). Einstein’s relation between the diffusion and mobility is discussed in Chapter 15.

John C. Russ, Fractal Surfaces, Plenum Press (1994).

Evelyn Sander, Leonard M. Sander, and Robert M. Ziff, “Fractals and Fractal Correlations,” Computers in Physics 8, 420–425 (1994).
An introduction to fractal growth models and the calculation of their properties.

H. Eugene Stanley and Nicole Ostrowsky, editors, On Growth and Form, Martinus Nijhoff Publishers, Netherlands (1986). A collection of research papers at the same level as the 1984 Family and Landau collection.

Hideki Takayasu, Fractals in the Physical Sciences, John Wiley & Sons (1990).

David D. Thornburg, Discovering Logo, Addison-Wesley (1983). The book is more accurately described by its subtitle, An Invitation to the Art and Pattern of Nature. The nature of recursive procedures and fractals is discussed using many simple examples.

Donald L. Turcotte, Fractals and Chaos in Geology and Geophysics, Cambridge University Press (1992).

Tamás Vicsek, Fractal Growth Phenomena, second edition, World Scientific Publishing (1992). This book contains an accessible introduction to diffusion-limited and cluster-cluster aggregation.

Bruce J. West, Fractal Physiology and Chaos in Medicine, World Scientific Publishing (1990).

David Wilkinson and Jorge F. Willemsen, “Invasion percolation: A new form of percolation theory,” J. Phys. A 16, 3365–3376 (1983).

Yu-Xia Zhang, Jian-Ping Sang, Xian-Wu Zou, and Zhun-Zhi Jin, “Random walk on percolation under an external field,” Physica A 350, 163–172 (2005). The authors consider random walks with a drift.

Chapter 14
Complex Systems

We introduce cellular automata, neural networks, genetic algorithms, and growing networks to explore the concepts of self-organization and complexity. Applications to sandpiles, fluids, earthquakes, and other areas are discussed.

14.1 Cellular Automata

Part of the fascination of physics is that it allows us to reduce natural phenomena to a few simple laws. It is also fascinating to think about how a few simple laws can produce the enormously rich behavior that we see in nature. In this chapter we discuss several models that illustrate some of the new ideas that are emerging from the study of complex systems.

The first class of models we discuss are known as cellular automata. Cellular automata were introduced by von Neumann and Ulam in 1948 and are mathematical idealizations of dynamical systems in which space and time are discrete and the quantities of interest have a finite set of discrete values that are updated according to a local rule. A cellular automaton can be thought of as a lattice of sites or a checkerboard with colored squares (the cells). Each cell changes its state at the tick of an external clock according to a rule based on the present configuration of the cells in its neighborhood. Cellular automata are examples of discrete dynamical systems that can be simulated exactly on a digital computer. Because the original motivation for studying cellular automata was their biological aspects, the discrete locations in space are frequently referred to as cells. More recently, cellular automata have been applied to a wide variety of physical systems ranging from fluids to galaxies. We will usually refer to sites rather than cells, except when we are explicitly discussing biological systems. The important characteristics of cellular automata include the following:

1. Space is discrete and consists of a regular array of sites. Each site has a finite set of values.

2. The rule for the new value of a site depends only on the values of a local neighborhood of sites near it.

3. Time is discrete. The variables at each site are updated simultaneously based on the values of the variables at the previous time step.
Hence, the state of the entire lattice advances in discrete time steps.

Figure 14.1: Example of a local rule for the evolution of a one-dimensional cellular automaton. The variable at each site can have the value 0 or 1. The 2^3 = 8 possible configurations of a site and its two neighbors at time t, together with the value of the central site at time t + 1, are

  neighborhood at t:       111  110  101  100  011  010  001  000
  central site at t + 1:    0    1    0    1    1    0    1    0

For example, if the value of a site is 0, its left neighbor is 1, and its right neighbor is 0, the central site will have the value 1 at the next time step. This rule is termed 01011010 in binary notation (see the second row), the modulo-two rule, or rule 90. Note that 90 is the base ten (decimal) equivalent of the binary number 01011010; that is, 90 = 2^1 + 2^3 + 2^4 + 2^6.

We first consider one-dimensional cellular automata and assume that the neighborhood of a given site is the site itself and the sites immediately to its left and right. Each site is assumed to have two states (a Boolean automaton). An example of such a rule is illustrated in Figure 14.1, where we see that a rule can be labeled by the binary representation of the update rule for each of the eight possible neighborhoods and by the base ten equivalent of the binary representation. Because any eight-digit binary number specifies a one-dimensional cellular automaton, there are 2^8 = 256 possible rules.

Class OneDimensionalAutomatonApp takes the decimal representation of the rule as input and produces the rule array update, which is used to update each lattice site using periodic boundary conditions. The OneDimensionalAutomatonApp class manipulates numbers using their binary representation. Note the use of the bit manipulation operators >>> (unsigned right shift) and & (AND) in method setRule. To understand how the right shift operator >>> works, consider the expression 13 >>> 1. The result is to shift the bits of the binary representation of the integer 13 to the right by one. Because the binary representation of 13 is 1101, the result of the shift is 0110. (The left-hand bits are filled with 0s as needed.) To understand the nature of the & operator, consider the expression 0110 & 1, which we can write as 0110 & 0001. In this case the result is 0000 because the & operator sets each of the resulting bits to 1 if the corresponding bit in both operands is 1; otherwise, the bit is zero.

We use the LatticeFrame class to represent the sites and their evolution. At a given time, the sites are drawn in the horizontal direction; time increases in the vertical direction. In method iterate, the % operator is used to determine the left and right neighbors of a site using periodic boundary conditions. Also note the use of the left shift operator << in method iterate. A more complete discussion of bit manipulation is given in Section 14.6.
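The following standalone fragment, a minimal demonstration of our own rather than part of the Open Source Physics library, reproduces the 13 >>> 1 example and decodes rule 90 into its eight binary digits:

  public class BitDemo {
    public static void main(String[] args) {
      System.out.println(13 >>> 1); // 1101 shifted right is 0110, so this prints 6
      System.out.println(6 & 1);    // 0110 & 0001 = 0000, so this prints 0
      int ruleNumber = 90;          // binary 01011010
      for(int i = 7; i >= 0; i--) {
        // bit i of ruleNumber gives the new central-site value for neighborhood i
        System.out.print(((ruleNumber >>> i) & 1)+" "); // prints 0 1 0 1 1 0 1 0
      }
      System.out.println();
    }
  }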
Listing 14.1: One-dimensional cellular automaton class.

package org.opensourcephysics.sip.ch14.ca;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class OneDimensionalAutomatonApp extends AbstractCalculation {
  LatticeFrame automaton = new LatticeFrame("");
  // update[] maps neighborhood configurations to 0 or 1
  int[] update = new int[8];

  public void calculate() {
    control.clearMessages();
    int L = control.getInt("Linear dimension");
    int tmax = control.getInt("Maximum time");
    // default is lattice sites all zero
    automaton.resizeLattice(L, tmax);
    // seed lattice by putting 1 in middle of first row
    automaton.setValue(L/2, 0, 1);
    // choose color of empty and occupied sites
    automaton.setIndexedColor(0, java.awt.Color.YELLOW); // empty
    automaton.setIndexedColor(1, java.awt.Color.BLUE);   // occupied
    setRule(control.getInt("Rule number"));
    for(int t = 1; t<tmax; t++) {
      iterate(t, L);
    }
  }

  public void iterate(int t, int L) {
    for(int i = 0; i<L; i++) {
      // encode the three-site neighborhood as a number between 0 and 7,
      // using the % operator for periodic boundary conditions
      int left = automaton.getValue((i-1+L)%L, t-1);
      int center = automaton.getValue(i, t-1);
      int right = automaton.getValue((i+1)%L, t-1);
      int neighborhood = (left<<2)+(center<<1)+right;
      automaton.setValue(i, t, update[neighborhood]);
    }
  }

  public void setRule(int ruleNumber) {
    for(int i = 7; i>=0; i--) {
      // (ruleNumber >>> i) shifts the contents of ruleNumber to the right by i
      // bits. In particular, the ith bit of ruleNumber resides in the rightmost
      // position of this expression. After "and"ing with the number 1, we are
      // left with either the number 0 or 1, depending on whether the ith
      // bit of ruleNumber was cleared or set.
      update[i] = ((ruleNumber>>>i)&1);
      control.print(" "+update[i]+" ");
    }
    control.println();
  }

  public void reset() {
    control.setValue("Rule number", 90);
    control.setValue("Maximum time", 100);
    control.setValue("Linear dimension", 500);
  }

  public static void main(String args[]) {
    CalculationControl.createApp(new OneDimensionalAutomatonApp());
  }
}

The properties of all 256 one-dimensional cellular automata have been cataloged (see Wolfram, 1984). We explore some of the properties of one-dimensional cellular automata in Problems 14.1 and 14.3.

Problem 14.1. One-dimensional cellular automata

(a) What is the result of 13 & 12 and 33 >>> 1 (decimal representation), and of 1101 & 0111 (binary representation)? Consider rule 90 and work out by hand the values of update[] according to method setRule.

(b) Use OneDimensionalAutomatonApp and consider rule 90 shown in Figure 14.1. This rule is also known as the modulo-two rule because the value of a site at step t + 1 is the sum modulo 2 of its two neighbors at step t. Choose the initial configuration to be a single nonzero site (the seed) at the midpoint of the lattice. It is sufficient to consider the evolution for approximately twenty iterations. Is the resulting pattern of nonzero sites self-similar? If so, characterize the pattern by a fractal dimension.

(c) Determine the properties of a rule for which the value of a site at step t + 1 is the sum modulo 2 of the values of its neighbors plus its own value at step t. This rule is equivalent to 10010110 or rule 150 = 2^1 + 2^2 + 2^4 + 2^7. Start with a single seed site.

(d) Choose a random initial configuration for which the independent probability for each site to have the value 1 is p = 1/2; otherwise, the value of the site is 0. Determine the evolution of rule 90, rule 150, rule 18 = 2^1 + 2^4 (00010010), rule 73 = 2^0 + 2^3 + 2^6 (01001001), and rule 136 (10001000). How sensitive are the patterns that are formed to the initial conditions? Does the nature of the patterns depend on the use or nonuse of periodic boundary conditions?
Listing 14.2: A more efficient implementation of method iterate in OneDimensionalAutomatonApp.

public void iterate(int t, int L) {
  // encodes state(L-1) and state(0) in the second and first bits
  // of the neighborhood variable
  int neighborhood = (automaton.getValue(L-1, t-1)<<1)+automaton.getValue(0, t-1);
  for(int i = 0; i<L; i++) {
    // clear third bit of neighborhood, but keep second and first bits
    neighborhood = neighborhood&3;
    // shift second and first bits of neighborhood to third and second bits
    neighborhood = neighborhood<<1;
    // encode state(i+1) into first bit of neighborhood using
    // periodic boundary conditions
    neighborhood += automaton.getValue((i+1)%L, t-1);
    // neighborhood now encodes the three bits of state surrounding
    // index i at time t-1. With neighborhood as an index, the
    // update[] table gives us the state at index i and time t.
    automaton.setValue(i, t, update[neighborhood]);
  }
}

Method iterate in class OneDimensionalAutomatonApp is not as efficient as possible because it does not use information about the neighborhood at site i to determine the neighborhood at site i + 1. A more efficient implementation is given in Listing 14.2. To understand how this version of method iterate works, suppose that the lattice at t = 0 is 1011, and we want to determine the neighborhood of the site at i = 0. The answer is 6 in decimal, corresponding to 110 in binary. Because of periodic boundary conditions, the index to the left of i = 0 is L − 1. The expression (automaton.getValue(L-1, t-1)<<1) yields 001 << 1 = 010 because << shifts all bits to the left. (Only 3 bits are needed to describe the neighborhood.) The statement

  int neighborhood = (automaton.getValue(L-1, t-1)<<1)+automaton.getValue(0, t-1);

yields 010 + 001 = 011. The effect of the statement

  neighborhood = neighborhood & 3;

is to clear the third bit of the neighborhood but to keep the second and first bits: 011 & 011 = 011. In this case nothing is changed. We then shift the second and first bits of the neighborhood to the third and second bits:

  neighborhood = neighborhood << 1;

and obtain neighborhood = 110. Finally, the statement

  neighborhood += automaton.getValue((i+1)%L, t-1);

gives neighborhood = 110 + 000 = 110, which is 6 in decimal, in agreement with the answer found above.

∗Problem 14.2. Whose time is more important?

(a) Work out another example to make sure that you understand the nature of the bit manipulations that are used in Listing 14.1 and in the more efficient version of method iterate.

(b) Which version of method iterate would you use: the more efficient but more difficult to understand (and debug) version, or the less efficient but easier to understand version? Which is more important, computer time or programmer time? In general, the answer depends on the context.

The dynamical behavior of many of the 256 one-dimensional Boolean cellular automata is uninteresting, and hence we also consider one-dimensional Boolean cellular automata with larger neighborhoods (including the site itself). Because a larger neighborhood implies that there are many more possible update rules, we place some reasonable restrictions on the rules. First, we assume that the rules are symmetric; for example, the neighborhood 100 produces the same value for the central site as 001. We also require that the zero neighborhood 000 yields 0 for the central site, and that the value of the central site depends only on the sum of the values of the sites in the neighborhood; for example, 011 produces the same value for the central site as 101 (see Wolfram, 1984). A simple way of coding the rules that is consistent with these requirements is as follows.
Each rule is labeled by a sequence of 0s and 1s such that the sequence indicates which values of the sum set the central site equal to 1. If the lowest order digit is 1, then the central site is set to 1 if the sum is 0. If the next digit is 1, then the central site is set to 1 if the sum is 1, etc. For example, the rule 10110 indicates that the central site will be set to 1 if the number of neighbors equal to 1 is 1, 2, or 4.
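This encoding is straightforward to decode programmatically. The following fragment is a minimal sketch (a helper of our own, not a method of OneDimensionalAutomatonApp) that converts such a rule code into a lookup table indexed by the sum of the neighborhood values:

  // Decode a sum-based rule code into a lookup table. For a neighborhood
  // of 2z+1 sites, the possible sums are 0 through 2z+1. For example, the
  // binary code 10110 (decimal 22) gives update[] = {0,1,1,0,1,0}, so the
  // central site becomes 1 when the sum is 1, 2, or 4.
  int[] decodeSumRule(int ruleCode, int z) {
    int[] update = new int[2*z+2];
    for(int sum = 0; sum<update.length; sum++) {
      update[sum] = (ruleCode>>>sum)&1; // bit 'sum' of ruleCode
    }
    return update;
  }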
If it is possible to implement this procedure in general, we would be better able to develop theories of complex macroscopic systems without needing to know the details of the dynamics of the microscopic constituents that make up these systems. We explore two examples in Problem 14.4. CHAPTER 14. COMPLEX SYSTEMS 528 ∗Problem 14.4. Coarse graining one-dimensional cellular automata (a) Add methods to OneDimensionalAutomatonApp that create a coarse grained lattice such that groups of three cells are coarse grained to 1 if all three cells are 1 and coarse grained to 0 otherwise. Allow the coarse grained lattice to evolve separately using a different update rule than the original lattice. The coarse grained lattice should be updated after every three updates of the original lattice. Draw the coarse grained lattice as a space-time diagram similar to what we have done for the original lattice, such that each cell in the coarse grained lattice is three times the size of a cell on the original lattice in both the space and time directions. Use rule 146 (10010010) for the original lattice and rule 128 (10000000) for the coarse grained lattice. Choose a lattice size L that is a multiple of 3 and run for a time that is a multiple of 3. You should see similar patterns in the two lattices, although the original lattice contains some details that are washed out by the coarse grained lattice. If you coarse grain the original lattice cells at each time step, you will obtain the same pattern as the coarse grained lattice. (b) Modify your program such that each pair of cells is coarse grained to 1 if two original cells are both 0 or both 1 and coarse grained to 0 otherwise. Use rule 105 (01101001) on the original cells with L = 120 for 60 iterations and run the coarse grained system using rule 150 (10100110). You should obtain results similar to those found in part (a). Traffic models. Physicists have been at the forefront of the development of a more systematic approach to the characterization and control of traffic. Much of this work was initiated at General Motors by Robert Herman in the late 1950s. The car-following theory of traffic flow that he and Elliott Montroll and others developed during this time is still used today. What has changed is the way we can implement these theories. The continuum approach used by Herman and Montroll is based on partial differential equations. An alternative that is more flexible and easier to understand is based on cellular automata. We first consider a simple one lane highway where cars enter at one end and exit at the other end. To implement the Nagel–Schreckenberg cellular automaton model, we use integer arrays for the position xi and velocity vi, where i indexes a car and not a lattice site. The important input parameters of the simulation are the maximum velocity vmax, the density of cars ρ, and the probability p of a car slowing down. This probability adds some randomization to the drivers. The algorithm implemented in class Freeway for the motion of each car at each iteration is as follows: 1. If vi < vmax, increase the velocity vi of car i by one unit; that is, vi → vi + 1. This change models the process of acceleration to the maximum velocity. 2. Compute the distance to the next car d. If vi ≥ d, then reduce the velocity to vi = d − 1 to prevent crashes. 3. With probability p, reduce the velocity of a moving car by one unit. Thus, vi → vi − 1. 4. Update the position xi of car i so that xi(t + 1) = xi(t) + vi. 
This ordering of the steps ensures that cars do not overlap.

Listing 14.3: One-lane freeway class.

package org.opensourcephysics.sip.ch14.traffic;
import java.awt.Graphics;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.controls.*;

public class Freeway implements Drawable {
  public int[] v, x, xtemp;
  public LatticeFrame spaceTime;
  public double[] distribution;
  public int roadLength;
  public int numberOfCars;
  public int maximumVelocity;
  public double p; // probability of reducing velocity
  private CellLattice road;
  public double flow;
  public int steps, t;
  // number of iterations before scrolling space-time diagram
  public int scrollTime = 100;

  public void initialize(LatticeFrame spaceTime) {
    this.spaceTime = spaceTime;
    x = new int[numberOfCars];
    xtemp = new int[numberOfCars]; // used to allow parallel updating
    v = new int[numberOfCars];
    spaceTime.resizeLattice(roadLength, 100);
    road = new CellLattice(roadLength, 1);
    road.setIndexedColor(0, java.awt.Color.RED);
    road.setIndexedColor(1, java.awt.Color.GREEN);
    spaceTime.setIndexedColor(0, java.awt.Color.RED);
    spaceTime.setIndexedColor(1, java.awt.Color.GREEN);
    int d = roadLength/numberOfCars;
    x[0] = 0;
    v[0] = maximumVelocity;
    for(int i = 1; i<numberOfCars; i++) {
      x[i] = x[i-1]+d; // cars are spaced evenly along the road
      v[i] = maximumVelocity;
    }
  }

  // ... (the per-car update contains the statements)
  if(v[i]>=d) {
    v[i] = d-1; // slow down due to cars in front
  }
  if((v[i]>0)&&(Math.random()<p)) {
    v[i]--; // reduce the velocity of a moving car with probability p
  }
  // ... (remainder of the class omitted)

      }
    }
    latticeFrame.setAll(newCells);
  }

  public static void main(String[] args) {
    OSPControl control = SimulationControl.createApp(new LifeApp());
    control.addButton("clear", "Clear"); // optional custom action
  }
}

Problem 14.6. The Game of Life

(a) LifeApp allows the user to determine the initial configuration interactively by clicking on a cell to change its value before hitting the Start button. Choose several initial configurations with a small number of live cells and determine the different types of patterns that emerge. Some suggested initial configurations are shown in Figure 14.2b. Does it matter whether you use fixed or periodic boundary conditions? Use a 16 × 16 lattice.

(b) Modify LifeApp so that each cell is initially alive with a 50% probability. Use a 32 × 32 lattice. What types of patterns typically result after a long time? What happens for 20% live cells? What happens for 70% live cells?

(c) Assume that each cell is initially alive with probability p. Given that the density of live cells at time t is ρ(t), what is ρ(t + 1), the expected density at time t + 1? Do the simulation and plot ρ(t + 1) versus ρ(t). If p = 0.5, what is the steady-state density of live cells?

(d)∗ LifeApp has not been optimized for the Game of Life and is written so that other rules can be implemented easily. Rewrite LifeApp so that it uses bit manipulation (see Section 14.6).

The Game of Life is an example of a universal computing machine. That is, we can choose an initial configuration of live cells to represent any possible program and any set of input data, run the Game of Life, and the output data will appear in some region of the lattice. The proof of this result (see Berlekamp et al.) involves showing how various configurations of cells represent the components of a computer, including wires, storage, and the fundamental components of a CPU: the digital logic gates that perform and, or, and other logical and arithmetic operations. Other cellular automata can also be shown to be universal computing machines.

14.2 Self-Organized Critical Phenomena

Very large events such as a magnitude eight earthquake, an avalanche on a snow-covered mountain, the sudden collapse of an empire (for example, the Soviet Union), or a crash of the stock market are rare. When such events occur, are they due to some special set of circumstances, or are they part of a more general pattern of events that would occur without any specific external intervention? The idea of self-organized criticality is that in many cases the occurrence of very large events does not depend on special conditions or external forces but is due to the intrinsic dynamics of the system. If s represents the magnitude of an event, such as the energy released in an earthquake or the amount of snow in an avalanche, then a system is said to be critical if the number of events, N(s), follows a power law:

N(s) ∼ s^−α   (no characteristic scale). (14.1)

If α ≈ 1, the form (14.1) implies that there would be one large event of size 1000 for every 1000 events of size one. One implication of the power law form (14.1) is that there is no characteristic scale, and the system is said to be scale invariant. This terminology reflects the fact that power laws look the same on all scales. For example, the replacement s → bs in the function N(s) = As^−α yields a function that is indistinguishable from N(s), except for a change in the amplitude A by the factor b^−α.
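Written out, the rescaling argument takes one line:

\[ N(bs) = A\,(bs)^{-\alpha} = b^{-\alpha}\,A s^{-\alpha} = b^{-\alpha}\,N(s), \]

so rescaling s changes only the overall amplitude, not the shape of the distribution.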
Contrast the nature of the power law dependence of N(s) in (14.1) with the result of combining a large number of independently acting random events. In this case we know that the distribution of the sum is a Gaussian (see Problem 7.15), and N(s) has the form

N(s) ∼ e^−(s/s0)^2   (characteristic scale). (14.2)

Scale invariance does not hold for functions that decay as in (14.2), because the replacement s → bs in the function e^−(s/s0)^2 changes s0 (the characteristic scale or size of s) by the factor b. Note that for a power law distribution there are events of all sizes, but for a Gaussian distribution there are, practically speaking, no events much larger than the characteristic scale s0. For example, if we take s0 = 100, there would be one large event of size 1000 for every 2.7 × 10^43 events of size one!

A simple example of self-organized critical phenomena is an idealized sandpile. Suppose that we construct a sandpile by randomly adding one grain at a time onto a flat surface with open edges. Initially, the grains remain where they land, but after we add more grains, there will be small avalanches during which the grains move so that the local slope of the pile does not become too big. Eventually, the pile will reach a statistically stationary (time-independent) state, and the amount of sand added will balance the sand that falls off the edge (on the average). When a single grain of sand is added to such a configuration, a rearrangement might occur that triggers an avalanche of any size (up to the size of the system), so that the mean slope again equals the critical value. We say that the statistically stationary state is critical because there are avalanches of all sizes. The stationary state is self-organized because no external parameter (such as the temperature) needs to be tuned to force the system to this state. In contrast, the concentration of fissionable material in a nuclear chain reaction has to be carefully controlled for the chain reaction to become critical.

We consider a two-dimensional model of a sandpile and represent the height at site i by the array element height[i]. One grain of sand is added to a random site j, height[j]++, at each iteration. If height[j] = 4, then we remove the four grains from site j and distribute them equally to its nearest neighbors. A site whose height equals four is said to topple. If any of the neighbors now has four grains of sand, it topples as well. This process continues until all sites have fewer than four grains of sand. Grains that fall outside the lattice are lost forever. Class Sandpile implements this idealized model. The lattice is stored in a LatticeFrame, and the arrays toppleSiteX and toppleSiteY store the coordinates of the sites with four grains of sand. The array distribution accumulates the data for the number of sites that topple at each addition of a grain of sand to the pile. It is possible, though rare, that a site will topple more than once in one step. Hence, the number of toppled sites may be greater than the number of sites in the lattice. Physically, it is not the actual height that determines toppling but the mean local slope between a site and its nearest neighbors.
Thus, what we call the “height” really should be called the “slope.” However, in the literature many authors use the term “height,” and we follow that usage.

Listing 14.6: Implementation of the two-dimensional sandpile model.

package org.opensourcephysics.sip.ch14.sandpile;
import java.awt.Graphics;
import org.opensourcephysics.frames.*;

public class Sandpile {
  int[] distribution; // distribution of number of sites toppling
  int[] toppleSiteX, toppleSiteY;
  LatticeFrame height;
  int L, numberToppledMax;
  int numberToppled, numberOfSitesToTopple, numberOfGrains;

  public void initialize(LatticeFrame height) {
    this.height = height;
    height.resizeLattice(L, L); // create new lattice
    // size of distribution array
    numberToppledMax = 2*L*L+1;
    // could use HistogramFrame instead
    distribution = new int[numberToppledMax];
    toppleSiteX = new int[L*L];
    toppleSiteY = new int[L*L];
    numberOfGrains = 0;
    resetAverages();
  }

  public void step() {
    numberOfGrains++;
    numberToppled = 0;
    int x = (int) (Math.random()*L);
    int y = (int) (Math.random()*L);
    int h = height.getValue(x, y)+1;
    height.setValue(x, y, h); // add grain to random site
    height.render();
    if(h==4) { // topple grain
      numberOfSitesToTopple = 1;
      boolean unstable = true;
      int[] siteToTopple = {x, y};
      while(unstable) {
        unstable = toppleSite(siteToTopple);
      }
    }
    distribution[numberToppled]++;
  }

  public boolean toppleSite(int[] siteToTopple) { // topple site
    numberToppled++;
    int x = siteToTopple[0];
    int y = siteToTopple[1];
    numberOfSitesToTopple--;
    // remove grains from site
    height.setValue(x, y, height.getValue(x, y)-4);
    height.render();
    // add grains to neighbors; if (x,y) is on the border of the
    // lattice, then some grains will be lost
    if(x+1<L) {
      addGrain(x+1, y);
    }
    if(x-1>=0) {
      addGrain(x-1, y);
    }
    if(y+1<L) {
      addGrain(x, y+1);
    }
    if(y-1>=0) {
      addGrain(x, y-1);
    }
    if(numberOfSitesToTopple>0) { // next site to topple
      siteToTopple[0] = toppleSiteX[numberOfSitesToTopple-1];
      siteToTopple[1] = toppleSiteY[numberOfSitesToTopple-1];
      return true;
    } else {
      return false;
    }
  }

  public void addGrain(int x, int y) {
    int h = height.getValue(x, y)+1;
    height.setValue(x, y, h); // add grain to site
    height.render();
    if(h==4) { // new site to topple
      toppleSiteX[numberOfSitesToTopple] = x;
      toppleSiteY[numberOfSitesToTopple] = y;
      numberOfSitesToTopple++;
    }
  }

  public void resetAverages() {
    distribution = new int[numberToppledMax];
    numberOfGrains = 0;
  }
}

Listing 14.7: The target class for the two-dimensional sandpile model.

package org.opensourcephysics.sip.ch14.sandpile;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class SandpileApp extends AbstractSimulation {
  Sandpile sandpile = new Sandpile();
  LatticeFrame height = new LatticeFrame("x", "y", "Sandpile");
  PlotFrame plotFrame = new PlotFrame("ln s", "ln N", "Distribution of toppled sites");

  public SandpileApp() {
    height.setIndexedColor(0, java.awt.Color.WHITE);
    height.setIndexedColor(1, java.awt.Color.BLUE);
    height.setIndexedColor(2, java.awt.Color.GREEN);
    height.setIndexedColor(3, java.awt.Color.RED);
    height.setIndexedColor(4, java.awt.Color.BLACK);
  }
  public void initialize() {
    sandpile.L = control.getInt("L");
    height.setPreferredMinMax(0, sandpile.L, 0, sandpile.L);
    sandpile.initialize(height);
  }

  public void doStep() {
    sandpile.step();
  }

  public void stop() {
    plotFrame.clearData();
    double N = sandpile.numberOfGrains; // number of grains added so far
    for(int s = 1; s<sandpile.numberToppledMax; s++) {
      int f = sandpile.distribution[s];
      if(f>0) {
        plotFrame.append(0, Math.log(s), Math.log(f/N));
      }
    }
    plotFrame.render();
  }

  public void reset() {
    control.setValue("L", 10);
    enableStepsPerDisplay(true);
  }

  public void resetAverages() {
    sandpile.resetAverages();
  }

  public static void main(String[] args) {
    SimulationControl control = SimulationControl.createApp(new SandpileApp());
    control.addButton("resetAverages", "resetAverages");
  }
}

Problem 14.7. A two-dimensional sandpile model

(a) Use the classes Sandpile and SandpileApp to simulate a two-dimensional sandpile with linear dimension L. Run the simulation with L = 10 and stop it once toppling starts to occur. When this behavior occurs, black cells (with four grains) will momentarily appear. Use the Step button to watch individual toppling events and obtain a qualitative sense of the dynamics of the sandpile model.

(b) Comment out the height.render() statements in Sandpile and add a statement to SandpileApp so that the number of grains added to the system is displayed. (The number of grains added is a measure of the number of configurations that are included in the various averages.) Now you will not be able to see individual toppling events, but you can more quickly collect data on the toppling distribution, the frequency of the number of sites that topple when a grain is added. The program outputs a log-log plot of the distribution. Estimate the slope of the log-log distribution from the part of the plot that is linear and thus determine the power law exponent α. Reset the averages and repeat your calculation to obtain another estimate of α. If your two estimates of α are within a few percent of each other, you have added enough grains of sand. Compute α for L = 10, 20, 40, and 80. As you make the lattice size larger, the range over which the log-log plot is linear should increase. Explain why the plot is not linear for large values of the number of toppled sites.

Of course, the model of a sandpile in Problem 14.7 is oversimplified. Laboratory experiments indicate that real sandpiles show power law behavior if the piles are small, but that larger sandpiles do not (see Jaeger et al.).

Earthquakes. The empirical Gutenberg–Richter law for N(E), the number of earthquakes with energy release E, is consistent with power law behavior:

N(E) ∼ E^−b, (14.3)

with b ≈ 1. The magnitude of earthquakes on the Richter scale is approximately the logarithm of the energy release. This power law behavior does not necessarily hold for individual fault systems, but it holds reasonably accurately when all fault systems are considered. One implication of the power law dependence in (14.3) is that there is nothing special about large earthquakes. In Problems 14.8 and 14.9 and Project 14.26 we explore some earthquake models.

Given the long time scales between earthquakes, there is considerable interest in simulating models of earthquakes. The Burridge–Knopoff model considered in Project 14.26 consists of a system of coupled masses in contact with a rough surface.
The masses are subjected to static and dynamic friction forces due to the surface and are also pulled by an external force corresponding to slow tectonic plate motion. The major difficulty with this model is that the numerical solution of the corresponding equations of motion is computationally intensive. For this reason we consider several cellular automaton models that retain some of the basic physics of the Burridge–Knopoff model.

Problem 14.8. A simple earthquake model

Define the real variable F(i,j) on a square lattice, where F represents the force or stress on the block at position (i,j). The initial state of the lattice at time t = 0 is found by assigning small random values to F(i,j). The lattice is updated according to the following rules (a minimal sketch of one drive-and-relax cycle appears after this problem):

(i) Increase F at every site by a small amount ∆F, for example, ∆F = 10^−3, and increase the time t by 1. This increase represents the effect of the driving force due to the slow motion of the tectonic plate.

(ii) Check if F(i,j) is greater than Fc, the threshold value of the force. If not, the system is stable and step (i) is repeated. If the system is unstable, go to step (iii). Choose Fc = 4 for convenience.

(iii) The release of stress due to the slippage of a block is represented by letting F(i,j) = F(i,j) − Fc. The transfer of stress is represented by updating the stress at the sites of the four neighbors at (i, j ± 1) and (i ± 1, j): F → F + 1. Periodic boundary conditions are not used.

These rules are equivalent to the Bak–Tang–Wiesenfeld model. What is the relation of this model to the sandpile model considered in Problem 14.7? As an example, choose L = 10. Do the simulation and show that the system eventually comes to a statistically stationary state, where the average value of the stress at each site stops growing. Monitor N(s), the number of earthquakes of size s, where s is the total number of sites (blocks) that are affected by the instability. Then consider L = 30 and repeat your simulations. Are your results for N(s) consistent with scaling?
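The following helper method is a minimal sketch of our own (not code from the text) of one drive-and-relax cycle; it returns the number of slip events, one convenient measure of the earthquake size:

  // One cycle of the Bak-Tang-Wiesenfeld earthquake model on an L x L lattice
  // with open boundaries. F is the stress array, dF the drive increment, and
  // Fc the threshold; each slipping site passes one unit (Fc/4) to each neighbor.
  int driveAndRelax(double[][] F, double dF, double Fc) {
    int L = F.length;
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        F[i][j] += dF; // step (i): slow uniform drive
      }
    }
    int slips = 0;
    boolean unstable = true;
    while(unstable) { // steps (ii) and (iii): relax until all sites are below threshold
      unstable = false;
      for(int i = 0; i<L; i++) {
        for(int j = 0; j<L; j++) {
          if(F[i][j]>Fc) {
            F[i][j] -= Fc;             // the block slips and releases stress Fc
            if(i+1<L) F[i+1][j] += 1;  // stress crossing an open
            if(i-1>=0) F[i-1][j] += 1; // boundary is lost
            if(j+1<L) F[i][j+1] += 1;
            if(j-1>=0) F[i][j-1] += 1;
            slips++;
            unstable = true;
          }
        }
      }
    }
    return slips;
  }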
Problem 14.9. A dissipative earthquake model

The Bak–Tang–Wiesenfeld earthquake model discussed in Problem 14.8 displays power law scaling due to the inherent conservation of the dynamical variable, the stress. It is easy to modify the model so that the stress is not conserved and the model is more realistic. The Rundle–Jackson–Brown/Olami–Feder–Christensen model of an earthquake fault is a simple example of such a nonconservative system.

(a) Modify the toppling rule in Problem 14.8 so that when the stress on site (i,j) exceeds Fc, not all the excess stress is given to the neighbors. In particular, assume that when site (i,j) topples, F(i,j) is reduced to the residual stress Fr(i,j). The amount α(Fij − Fr) is dissipated, leaving (Fij − Fr)(1 − α) to be distributed equally to the neighbors. If α = 0, the model is equivalent to the model considered in Problem 14.8. Choose α = 0.2 and determine if N(s) exhibits power law scaling. For simplicity, choose Fc = 4 and Fr = 1 (see Grassberger).

(b) Make the model more realistic by adding a small amount of noise to Fr so that Fr is uniformly distributed between 1 − δ and 1 + δ with δ = 0.05. Also run the model in what is called the “zero-velocity limit” by finding the site with the maximum stress Fmax and then increasing the stress on all sites by Fc − Fmax so that only one site initially becomes unstable. Determine N(s) and see if your results differ from what you found in part (a). Do you still observe power law scaling?

(c) The model can be made more realistic still by assuming that the interaction between the blocks is long range due to the existence of elastic forces. Distribute the excess stress equally to all z neighbors that are within a distance of radius R of an unstable site. Each of the z neighbors receives a stress equal to (Fij − Fr)(1 − α)/z. First choose R = 3 and see if the qualitative behavior of N(s) changes as R becomes larger. Lattices with L ≥ 256 are typically considered with R ≈ 30 (see the papers by W. Klein and J. B. Rundle and collaborators).

The behavior of other simple models of natural phenomena is explored in the following problems.

Problem 14.10. Forest fire model

(a) Consider the following model of the spread of a forest fire. Suppose that at t = 0 the L × L sites of a square lattice either have a tree or are empty with probability p and 1 − p, respectively. The sites that have a tree are on fire with probability f. At each iteration an empty site grows a tree with probability g, a tree with a nearest neighbor site on fire catches fire, and a site that is already on fire dies and becomes empty. This model is an example of a probabilistic cellular automaton. Write a program to simulate this model and color code the three types of sites (one synchronous update of the lattice is sketched after this problem). Use periodic boundary conditions.

(b) Choose L ≥ 30 and determine the values of g for which the forest maintains fires indefinitely. Note that as long as g > 0, new trees will always grow.

(c) Use the value of g that you found in part (b) and compute the distribution of the number of sites sf on fire. If the distribution is critical, determine the exponent α that characterizes this distribution. Also compute the distribution of the number of trees st. Is there any relation between these two distributions?

(d)∗ To obtain reliable results it is frequently necessary to average over many initial configurations. However, the behavior of many systems is independent of the initial configuration, and averaging over many initial configurations is unnecessary. This latter possibility is called self-averaging. Repeat parts (b) and (c), but average your results over ten initial configurations. Is this forest fire model self-averaging?
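The following method is a minimal sketch of one synchronous update under these rules; the state codes and the method name are our own, not from the text:

  static final int EMPTY = 0, TREE = 1, FIRE = 2;

  // One synchronous update of the probabilistic forest fire model with
  // periodic boundary conditions; returns the new lattice.
  static int[][] step(int[][] s, double g) {
    int L = s.length;
    int[][] next = new int[L][L];
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        if(s[i][j]==EMPTY) {
          next[i][j] = (Math.random()<g) ? TREE : EMPTY; // grow a tree with probability g
        } else if(s[i][j]==FIRE) {
          next[i][j] = EMPTY;                            // a burning site dies
        } else { // a tree catches fire if any nearest neighbor is burning
          boolean neighborOnFire =
            s[(i+1)%L][j]==FIRE || s[(i-1+L)%L][j]==FIRE ||
            s[i][(j+1)%L]==FIRE || s[i][(j-1+L)%L]==FIRE;
          next[i][j] = neighborOnFire ? FIRE : TREE;
        }
      }
    }
    return next;
  }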
Problem 14.11. Another forest fire model

Consider a simple variation of the model discussed in Problem 14.10. At t = 0 each site is occupied by a tree with probability p; otherwise, it is empty. The system is updated in successive iterations as follows:

(i) New trees grow at time t with a small probability g from sites that are empty at time t − 1.

(ii) A tree that is not on fire at t − 1 catches fire due to lightning with probability f.

(iii) Trees on fire ignite neighboring trees, which in turn ignite their neighboring trees, etc. The spreading of the fire occurs instantaneously.

(iv) Trees on fire at time t − 1 die (become empty sites) and are removed at time t (after they have set their neighbors on fire).

As in Problem 14.10, the changes in each site occur synchronously.

(a) Determine N(s), the number of clusters of trees of size s that catch fire in each iteration. Two trees are in the same cluster if they are nearest neighbors. Is the behavior of N(s) consistent with N(s) ∼ s^−α? If so, estimate the exponent α for several values of g and f.

(b)∗ The balance between the mean rate of birth and burning of trees in the steady state suggests a value of the ratio f/g at which this model is likely to be scale invariant. If the average steady state density of trees is ρ, then at each iteration the mean number of new trees appearing is gN(1 − ρ), where N = L^2 is the total number of sites. In the same spirit, we can say that for small f, the mean number of trees destroyed by lightning is fρN⟨s⟩, where ⟨s⟩ is the mean number of trees in a cluster. Is this reasoning consistent with the results of your simulation? If we equate these two rates, we find that ⟨s⟩ ∼ [(1 − ρ)/ρ](g/f). Because 0 < ρ < 1, it follows that ⟨s⟩ → ∞ in the limit f/g → 0. Given the relation ⟨s⟩ = ∑_{s=1}^∞ sN(s)/∑_s N(s) and the divergent behavior of ⟨s⟩, why does it follow that N(s) must decay more slowly than exponentially with s? This reasoning suggests that N(s) ∼ s^−α with α < 2. Is this expectation consistent with the results that you obtained in part (a)?

In this model there are three well-separated time scales: the time for lightning to strike (∝ f^−1), the time for trees to grow (∝ g^−1), and the instantaneous spreading of fire through a connected cluster. This separation of time scales seems to be an essential ingredient for self-organized criticality (see Grinstein and Jayaprakash).

Problem 14.12. Model of punctuated equilibrium

(a) The idea of punctuated equilibrium is that biological evolution occurs episodically rather than as a steady, gradual process. That is, most of the major changes in life forms occur in relatively short periods of time. Bak and Sneppen have proposed a simple model that exhibits some of the behavior of punctuated equilibrium. The model consists of a one-dimensional cellular automaton of linear dimension L, where cell i represents the biological fitness of species i. Initially, all cells receive a random fitness f_i between 0 and 1. Then the cell with the lowest fitness and its two nearest neighbors are randomly given new fitness values. This update rule is repeated indefinitely. Write a program to simulate the behavior of this model. Use periodic boundary conditions and display the fitness of each cell as a column of height f_i. Begin with L = 64 and describe what happens to the distribution of fitness values after a long time.

(b) We can crudely think of the update process as replacing a species and its neighbors by three new species. In this sense the fitness represents a barrier to creating a new species. If the barrier is low, it is easier to create a new species. Do the low fitness species die out? What is the average fitness of the species after the model has run for a long time (10^4 or more iterations)? Compute the distribution of fitness values N(f) averaged over all cells and over many iterations. Allow the system to come to a fluctuating steady state before computing N(f). Plot N(f) versus f. Is there a critical value fc below which N(f) is much less than the values above fc? Is the update rule reasonable from an evolutionary point of view?

(c) Modify your program to compute the distance x between successive fitness changes and the distribution P(x) of these distances. Make a log-log plot of P(x) versus x. Is there any evidence of self-organized criticality (power law scaling)?

(d) Another way to visualize the results is to plot the time at which a cell is changed versus the position of the cell. Is the distribution of the plotted points approximately uniform?
We might expect that the survival time of a species depends exponentially on its fitness, and hence each update corresponds to an elapsed time of e^{cf_i}, where the constant c sets the time scale and f_i is the fitness of the cell that has been changed. Choose c = 100 and make a similar plot with the time axis replaced by the logarithm of the time, that is, by the quantity 100f_i. Is this plot more meaningful?

(e) Another way of visualizing punctuated equilibrium is to plot the number of times groups of cells change as a function of time. Divide the time into units of 100 updates and compute the number of fitness changes for cells i = 1 to 10 as a function of time. Do you see any evidence of punctuated equilibrium?

14.3 The Hopfield Model and Neural Networks

Neural network models have been motivated in part by how neurons in the brain collectively store and recall memories. Usually, a neuron is in one of two states: a resting potential (not firing) or firing at the maximum rate. A neuron “fires” once it receives electrical inputs from other neurons whose combined strength reaches a certain threshold. An important characteristic of a neuron is that its output is a nonlinear function of the sum of its inputs. The assumption is that when memories are stored in the brain, the strengths of the connections between neurons change.

One of the uses of neural network models is pattern recognition. If we see someone more than once, the person’s face provides input that helps us to recall the person’s name. In the same spirit, a neural network can be given a pattern, for example, a string of ±1s, that partially reflects a previously memorized pattern. The idea is to store memories so that a computer can recall them when the inputs are close to a particular memory.

We now consider an example of a neural network due to Hopfield. The network consists of N neurons, and the state of the network is defined by the state of each neuron S_i, which in the Hopfield model takes on the values −1 (not firing) and +1 (firing). The strength of the connection between neuron i and neuron j is denoted by w_ij and is determined by the M stored memories:

w_ij = ∑_{α=1}^{M} S_i^α S_j^α, (14.4)

where S_i^α represents the state of neuron i in stored memory α. Given the initial state of all the neurons, the dynamics of the network is simple. We choose a neuron i at random and change its state according to its input ∑_{j≠i} w_ij S_j, where S_j represents the current state of neuron j. That is, we set

S_i = +1 if ∑_{j≠i} w_ij S_j > 0, and S_i = −1 if ∑_{j≠i} w_ij S_j ≤ 0. (14.5)

The threshold value of the input has been set equal to zero, but other values could be used as well.

The HopfieldApp class in Listing 14.8 implements this model of a neural network and stores memories based on user input. The state of the network is stored in the array S[i], and the connections between the neurons are stored in the array w[i][j]. The user initially clicks on various cells to toggle their values between −1 and +1 and presses the Remember button to store a pattern. After the memories are stored, the user presses the Randomize button to set each S_i to ±1 at random and then presses the Start button to update the neurons using the Hopfield algorithm to try to recall one of the stored memories.
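Equation (14.4) translates into a few lines of code. The following fragment is a minimal sketch of our own (the method name remember is hypothetical) showing how one pattern can be added to the connection matrix:

  // Add one stored pattern to the connection matrix according to (14.4):
  // each memory alpha contributes S_i^alpha * S_j^alpha to w[i][j].
  void remember(double[][] w, int[] pattern) {
    int N = pattern.length;
    for(int i = 0; i<N; i++) {
      for(int j = 0; j<N; j++) {
        if(i!=j) {
          w[i][j] += pattern[i]*pattern[j]; // pattern values are +1 or -1
        }
      }
    }
  }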
Listing 14.8: HopfieldApp class.

package org.opensourcephysics.sip.ch14;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

// Hopfield model of a neural network
public class HopfieldApp extends AbstractSimulation {
  LatticeFrame lattice;
  int N; // total number of neurons
  double[][] w; // connection array (N by N elements)
  int numberOfStoredMemories;

  public HopfieldApp() {
    lattice = new LatticeFrame("Hopfield state");
    lattice.setToggleOnClick(true, -1, 1);
    lattice.setIndexedColor(-1, java.awt.Color.blue);
    lattice.setIndexedColor(0, java.awt.Color.blue);
    lattice.setIndexedColor(1, java.awt.Color.green);
    lattice.setSize(600, 120);
  }

  public void doStep() {
    int[] S = lattice.getAll();
    for(int counter = 0; counter<N; counter++) {
      int i = (int) (Math.random()*N); // choose a neuron at random
      double input = 0;
      for(int j = 0; j<N; j++) {
        if(j!=i) {
          input += w[i][j]*S[j];
        }
      }
      S[i] = (input>0) ? 1 : -1; // update rule (14.5)
    }
    lattice.setAll(S);
  }

  public void initialize() {
    N = control.getInt("Lattice size");
    w = new double[N][N];
    lattice.resizeLattice(N, 1);
    for(int i = 0; i<N; i++) {
      lattice.setValue(i, 0, -1); // start with all neurons not firing
    }
    numberOfStoredMemories = 0;
  }

The remaining methods of HopfieldApp, including those invoked by the Remember and Randomize buttons, implement the bookkeeping described above.

The Hopfield model is closely related to a class of disordered magnetic systems known as spin glasses, in which the spins S_i = ±1 interact with one another through couplings J_ij of random sign, so that the energy of a configuration of spins can be written as

E = −(1/2) ∑_{i≠j} J_ij S_i S_j. (14.6)

If J_ij > 0, the spins i and j lower their energy by lining up in the same direction. If J_ij < 0, the spins lower their energy by lining up in opposite directions (see Figure 15.1). We are interested in finding the ground state when the coupling constant J_ij randomly takes on the values ±J_0/N, where N is the number of spins and J_0 is an arbitrary constant. To find the ground state, we need to find the configurations of spins that give the lowest value of the energy. Finding the ground state of a spin glass is particularly difficult because there are many configurations that correspond to local minima of the energy. In fact, the problem of finding the exact ground state is an example of a computationally difficult problem called NP-complete. (Another example of such a problem is considered in Problem 15.31.) In Problem 14.14 we explore whether the Hopfield algorithm can find a good approximation to the global minimum.

Problem 14.14. Minimum energy of an Ising spin glass

(a) Choose J_0 = 4 in (14.6) and modify the HopfieldApp class so that it applies to a model spin glass. Display the output string and the energy after every N attempts to change a spin. Begin with N = 20.

(b) What happens to the energy after a long time? For different initial states, but the same set of the J_ij, is the value of the energy the same after the system has evolved for a long time? Explain your results in terms of the number of local energy minima.

(c) What is the behavior of the system? Do you find periodic behavior or random behavior, or does the system evolve to a state that does not change?

14.4 Growing Networks

A network is a collection of points called nodes that are connected by lines called links. Mathematicians refer to networks as graphs, and graph theory has been an active field of mathematics for many years. A mathematical network can represent an actual network by defining what a node represents and the kind of relationship represented by a link. For example, in an airline network the nodes represent airports and the links represent flights between airports. In an acquaintance network the nodes represent individuals, and the links represent the state of two people knowing each other. In a biochemical network the nodes represent various molecular types, and the links represent reactions between molecules.

One reason for the recent interest in networks is that data on existing networks are now more readily available due to the widespread use of computers. Indeed, one of the networks of current interest is the network of websites.
14.4 Growing Networks

A network is a collection of points called nodes that are connected by lines called links. Mathematicians refer to networks as graphs, and graph theory has been an active field of mathematics for many years. A mathematical network can represent an actual network by defining what a node represents and the kind of relationship represented by a link. For example, in an airline network the nodes represent airports, and the links represent flights between airports. In an acquaintance network the nodes represent individuals, and the links represent the state of two people knowing each other. In a biochemical network the nodes represent various molecular types, and the links represent a reaction between molecules. One reason for the recent interest in networks is that data on existing networks is now more readily available due to the widespread use of computers. Indeed, one of the networks of current interest is the network of websites.

Another reason for the interest in networks is that some new models of networks have been developed. We first discuss one of the original network models, the Erdős–Rényi model. In this model we start with N nodes and then form n links between pairs of nodes such that each pair has either one link or no links. The probability of a link between any pair of nodes is p = n/(N(N − 1)/2). One quantity of interest is the degree distribution D(ℓ), which is the fraction of nodes that have ℓ links. An example of the determination of D(ℓ) is shown in Figure 14.3. In the Erdős–Rényi model this distribution is a Poisson distribution for large N. Thus, there is a peak in D(ℓ), and for large ℓ, D(ℓ) decreases exponentially.

In some network models there is a path between any pair of nodes. In other models, such as the Erdős–Rényi model, there are some nodes that cannot be reached from other nodes (see Figure 14.3). In these networks there are other quantities of interest that are analogous to those in percolation theory. The main difference is that in network models the position of the nodes is irrelevant, and only their connectivity is relevant. In particular, there is no spanning cluster as can exist in percolation models. Instead, there can be a cluster that is significantly larger than the other clusters. In the Erdős–Rényi model, the transition at which such a "giant" cluster appears depends on the probability p that any pair of nodes is connected. In the large N limit this transition occurs at p = 1/N.

Figure 14.3: Example of a disconnected network with 10 nodes and 9 links. The degree distribution for this network is D(1) = 5/10 = 0.5, D(2) = 3/10 = 0.3, D(3) = 1/10 = 0.1, and D(4) = 1/10 = 0.1. The clustering coefficient or transitivity is defined as 3 times the number of triangles divided by the number of possible triples of connected nodes. In this case we have 1 triangle and 12 triples, so the clustering coefficient equals 3 × 1/12 = 0.25. If a node has ℓ links, then the number of triples centered at that node is ℓ!/(2!(ℓ − 2)!).

Problem 14.15. The Erdős–Rényi model

(a) Write a program to create networks based on the Erdős–Rényi model. Choose N = 100 and p ≈ 0.01 and compute D(ℓ); average over at least 10 networks. Show that D(ℓ) follows a Poisson distribution.

(b) Define a giant cluster as one that has over three times as many nodes as any other cluster and at least 10% of the nodes. Find the value of p at which the giant cluster first appears for N = 64, 128, and 256. Average over 10 networks for each value of N. The cluster distribution should be updated after every link is added, using the labeling procedure of Chapter 12. The bookkeeping is easier here because every time we add a link, we either combine two clusters or make no change in the cluster distribution.

Some of the networks that we will consider are connected by definition. In these cases one of the important quantities of interest is the mean path length between two nodes, where the path length between two nodes is the smallest number of links needed to go from one node to the other. If the mean path length is small and depends only weakly on the total number of nodes, the network is said to have the "small world" property. A well-known example of the small world property is "six degrees of separation," which refers to the observation that almost any person is connected to almost any other person through a sequence of about six acquaintances. We wish to understand the structure of different networks.
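Before turning to those structural properties, here is a minimal sketch of the Erdős–Rényi construction of Problem 14.15(a); the class and variable names are ours, not from the text, and the averaging over several networks asked for in the problem is left to the reader:

import java.util.Random;

public class ErdosRenyiApp {
  public static void main(String[] args) {
    int N = 100;       // number of nodes
    double p = 0.01;   // probability that a given pair of nodes is linked
    Random random = new Random();
    int[] degree = new int[N];
    for(int i = 0; i<N; i++) {
      for(int j = i+1; j<N; j++) { // consider each pair of nodes once
        if(random.nextDouble()<p) {
          degree[i]++;
          degree[j]++;
        }
      }
    }
    int[] histogram = new int[N]; // number of nodes with a given number of links
    for(int i = 0; i<N; i++) {
      histogram[degree[i]]++;
    }
    for(int ell = 0; ell<10; ell++) {
      System.out.println("D("+ell+") = "+histogram[ell]/(double) N);
    }
  }
}

Averaging D(ℓ) over many networks amounts to repeating the double loop and accumulating the histogram before normalizing.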
One structural property is the clustering coefficient or transitivity. If node A is linked to B and B to C, the clustering coefficient is the probability that A is also linked to C (see Figure 14.3 for a precise definition). If this coefficient is large, then there will be many small loops of nodes in the network. If we think of the nodes as people and the links as friendship connections, then the clustering coefficient is a measure of the tendency of people to form cliques. It is also of interest to see to what extent the network is hierarchically organized. Can we find groups of nodes that are linked together at different levels of organization? Can we produce an organizational chart for the network similar to what is used by many businesses? Algorithms for computing the hierarchical or community structure of a network are discussed in the references.

Two popular network models are the Watts–Strogatz small world model and the Barabási–Albert preferential attachment model. In the Watts–Strogatz model, a regular lattice of nodes connected by nearest neighbor links is "rewired" so that a link between two neighboring nodes is broken with probability p, and a link is randomly added between one of the nodes and any other node in the system. The small world property shows up as a logarithmic dependence of the mean path length on the system size N for large p. The degree distribution is similar to that of the Erdős–Rényi model.

In the preferential attachment model, we begin with a few connected nodes and then add one node at a time. Each new node is linked to m existing nodes, with preference given to those nodes that already have many links. The probability that a node with ℓ links receives a link from the new node is proportional to ℓ. For example, if we have ten nodes in the network with 1, 1, 3, 2, 7, 3, 4, 7, 10, and 2 links, respectively, then the degrees sum to 40, and the probability of getting the next link from a new node is 1/40, 1/40, 3/40, 2/40, 7/40, 3/40, 4/40, 7/40, 10/40, and 2/40, respectively. The result of this growth rule is that some nodes accumulate many links. The key result is that the degree distribution is a power law with D(ℓ) ∼ ℓ^(−α). This scale-free behavior is very important because it says that, in the limit of an infinite network, there is a non-negligible probability that a node exists with any particular number of links. Examples of real networks that show this behavior are actor networks where the links correspond to two actors appearing in the same movie, airport networks, the internet, and the links between various websites. In addition to the scale-free degree distribution, the preferential attachment model also has the small world property that the mean path length grows only logarithmically with the number of nodes.

The PreferentialAttachment class implements the preferential attachment model. Method setPosition is not relevant to the actual growth model. It places the nodes at random positions, chosen so that they are not too close to one another, so that the network can be drawn. This drawing method is useful only for networks with fewer than about 100 nodes.

Listing 14.9: PreferentialAttachment class: Preferential attachment network model.

package org.opensourcephysics.sip.ch14.networks;
import java.awt.Color;
import java.awt.Graphics;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display.Drawable;
import org.opensourcephysics.display.DrawingPanel;
public class PreferentialAttachment implements Drawable {
  int[] node, linkFrom, degree;
  double[] x, y;                // positions of nodes, only meaningful for display purposes
  int N;                        // maximum number of nodes
  int m = 2;                    // number of attempted links per node
  int linkNumber = 0;           // twice current number of links
  int n = 0;                    // current number of nodes
  boolean drawPositions = true; // only draw network if true
  int numberOfCompletedNetworks = 0;

  public void initialize() {
    // degree distribution to be averaged over many networks
    degree = new int[N];
    numberOfCompletedNetworks = 0; // will draw many networks
    startNetwork();
  }

  public void addLink(int i, int j, int s) {
    linkFrom[i*m+s] = j; // record the node at the other end of link s of node i
    node[i]++;
    node[j]++;
    linkNumber += 2;     // twice current number of links
  }

  public void startNetwork() {
    n = 0;
    linkFrom = new int[m*N];
    node = new int[N];
    x = new double[N];
    y = new double[N];
    linkNumber = 0;
    for(int i = 0; i<=m; i++) { // place the first m+1 nodes
      n++;
      setPosition(i);
    }
    // ... (the statements that link the first m+1 nodes to each other, the
    // setPosition method, and the loop that adds the remaining nodes with
    // degree-proportional attachment are not reproduced here)
  }

  public void appendDegreeDistribution(PlotFrame plot) { // (method signature reconstructed)
    for(int i = 1; i<N; i++) {
      if(degree[i]>0) { // log-log plot of the averaged degree distribution
        plot.append(0, Math.log(i), Math.log(degree[i]*1.0/(N*numberOfCompletedNetworks)));
      }
    }
  }

  public void draw(DrawingPanel panel, Graphics g) {
    if(node!=null&&drawPositions) {
      int pxRadius = Math.abs(panel.xToPix(1.0)-panel.xToPix(0));
      int pyRadius = Math.abs(panel.yToPix(1.0)-panel.yToPix(0));
      g.setColor(Color.green);
      // ... (the loops that draw the links and the nodes are not reproduced here)
    }
  }
}

14.5 Genetic Algorithms

The genetic algorithm is implemented by two classes: GenePool, which stores the population of genotypes as strings of bits (boolean arrays), and Phenotype, which converts a genotype into a configuration of Ising spins on an L × L lattice and computes its fitness from the energy of that configuration. The following fragments show the essential methods.

// From the GenePool class:
public void initialize() {
  for(int i = 0; i<numberOfGenotypes; i++) {
    for(int j = 0; j<genotypeSize; j++) {
      if(Math.random()>0.5) {
        genotype[i][j] = true; // sets genes randomly
      }
    }
  }
}

public void copyGenotype(boolean a[], boolean b[]) {
  for(int i = 0; i<a.length; i++) { // copy a to b
    b[i] = a[i];
  }
}

// From the Phenotype class:
public void setCouplings() { // (method name reconstructed) set Jij = ±1 at random
  for(int i = 0; i<L; i++) {
    for(int j = 0; j<L; j++) {
      for(int bond = 0; bond<2; bond++) {
        if(Math.random()>0.5) {
          J[i][j][bond] = 1;
        } else {
          J[i][j][bond] = -1;
        }
      }
    }
  }
}

public void determineFitness(GenePool genePool) {
  totalFitness = 0;
  int state[][] = new int[L][L];
  populationFitness = new int[genePool.numberOfGenotypes];
  for(int n = 0; n<genePool.numberOfGenotypes; n++) {
    // ... (convert genotype n into the spin configuration state and
    // accumulate its energy in populationFitness[n])
    // shift so that fitness > 0; low energy implies high fitness
    populationFitness[n] = highestEnergy-populationFitness[n];
    totalFitness += populationFitness[n];
  }
}

public void select(GenePool genePool) {
  selectedPopulationFitness = new int[genePool.numberOfGenotypes];
  boolean savedGenotype[][] = new boolean[genePool.numberOfGenotypes][genePool.genotypeSize];
  for(int n = 0; n<genePool.numberOfGenotypes; n++) {
    // ... (save the current genotypes and choose the index choice with
    // probability proportional to its fitness)
    if(selectedPopulationFitness[n]>bestFitness) {
      bestFitness = selectedPopulationFitness[n];
    }
    genePool.copyGenotype(savedGenotype[choice], genePool.genotype[n]);
  }
}

Problem 14.19. Ground state of Ising-like models

(a) Use the genetic algorithm we have discussed to find the ground state of the ferromagnetic Ising model for which Jij = 1. In this case the ground state energy is E = −2L² (all spins up or all spins down). It will be necessary to modify method initialize in class Phenotype. Choose L = 4 and consider a population of 20 strings, with 10 recombinations and 4 mutations per generation. How long does it take to find the ground state energy? You might wish to modify the program so that each new generation is shown on the screen.

(b) Find the mean number of generations needed to find the ground state for L = 4, 6, and 8. Repeat each run several times. Use a population of 100, a recombination rate of 50, and a mutation rate of 20. Are there any general trends as L is increased?
How do your results change if you double the population size? What happens if you double the recombination rate or the mutation rate? Use larger lattices if you have sufficient computer resources.

(c) Repeat part (b) for the antiferromagnetic model for which Jij = −1.

(d) Repeat part (b) for a spin glass for which Jij = ±1 at random. In this case we do not know the ground state energy in advance. What criterion can you use to terminate a run?

One of the important features of the genetic algorithm is that the change in the genetic code is selected not in the genotype directly, but in the phenotype. Note that the way we change the strings (particularly with recombination) is not closely related to the two-dimensional lattice of spins. We could have used some other prescription for converting a string of 0s and 1s to a configuration of spins on a two-dimensional lattice. If the phenotype were a three-dimensional lattice, we could use the same procedure for modifying the genotype, but a different prescription for converting the genetic sequence (the string of 0s and 1s) to the phenotype (the three-dimensional lattice of spins). The point is that it is not necessary for the genetic coding to mimic the phenotypic expression. This point becomes distorted in the popular press when a gene is tied to a particular trait, because specific pieces of DNA rarely correspond directly to any explicitly expressed trait in the phenotype.

14.6 Lattice Gas Models of Fluid Flow

We now return to cellular automaton models and discuss one of their more interesting applications: simulations of fluid flow. In general, fluid flow is very difficult to simulate because the partial differential equation describing the flow of incompressible fluids, the Navier–Stokes equation, is nonlinear, and this nonlinearity can lead to the failure of standard numerical algorithms. In addition, there are typically many length scales that must be considered simultaneously. These length scales include the microscopic motion of the fluid particles, the length scales associated with fluid structures such as vortices, and the length scales of macroscopic objects such as pipes or obstacles. Because of these considerations, simulations of fluid flow based on the direct numerical solution of the Navier–Stokes equation typically require very sophisticated numerical methods (cf. Oran and Boris).

Cellular automaton models of fluids are known as lattice gas models. In a lattice gas model the positions of the particles are restricted to the sites of a lattice, and the velocities are restricted to a small number of vectors corresponding to neighbor sites. A time step is divided into two substeps. In the first substep the particles move freely to their corresponding nearest neighbor lattice sites. Then the velocities of the particles at each lattice site are changed according to a collision rule that conserves mass (particle number), momentum, and kinetic energy. The purpose of the collision rules is not to model microscopic collisions accurately, but rather to achieve the correct macroscopic behavior.

Velocity   Vector Direction   Symbol       Abbreviation   Decimal   Binary
v0         (1, 0)             RIGHT        RI             1         00000001
v1         (1, −√3)/2         RIGHT_DOWN   RD             2         00000010
v2         −(1, √3)/2         LEFT_DOWN    LD             4         00000100
v3         (−1, 0)            LEFT         LE             8         00001000
v4         (−1, √3)/2         LEFT_UP      LU             16        00010000
v5         (1, √3)/2          RIGHT_UP     RU             32        00100000
v6         (0, 0)             STATIONARY   S              64        01000000
                              BARRIER                     128       10000000

Table 14.1: Summary of the possible velocities and their representations.
The idea is that if we satisfy the conservation laws associated with microscopic collisions, then we can obtain the correct physics at the macroscopic level, including translational and rotational invariance, by averaging over many particles. We assume a triangular lattice because it can be shown that this symmetry is sufficient to yield the macroscopic Navier–Stokes equations in the continuum limit. In contrast, the more limited symmetry of a square lattice is not sufficient. Three-dimensional models are much more difficult to implement and justify theoretically.

All the moving particles are assumed to have the same speed and mass. The possible velocity vectors lie only in the directions of the nearest neighbor sites, and hence there are six possible velocities, as summarized in Table 14.1. A rest particle is also allowed. The number of particles at each site moving in a particular direction (channel) is restricted to be zero or one. In the first substep all particles move in the direction of their velocity to a neighboring site. In the second substep the velocity vectors at each lattice site are changed according to the appropriate collision rule. Examples of the collision rules are illustrated in Figures 14.4–14.6. The rules are deterministic, with only one possible set of velocities after a collision for each possible set of velocities before a collision. It is easy to check that momentum conservation in collisions between the particles is enforced by these rules.

Figure 14.4: Examples of collision rules for three particles, with one particle unchanged and no stationary particles. Each direction or channel is represented by 32 bits, but we need only the first 8 bits. The various channels are summarized in Table 14.1.

Figure 14.5: (a) Example of a collision rule for three particles with zero net momentum. (b) Example of a two-particle collision rule. (c) Example of a four-particle collision rule. For states that are not shown, the velocities do not change in a collision. An open circle represents a lattice site and the absence of a stationary particle.

As in Section 14.1, we use bit manipulation to represent a lattice site and the collision rules efficiently. Each lattice site is represented by one element of the integer array lattice. In Java each int stores 32 bits, but we will use only the first 8 bits. We use the first six bits, 0 through 5, to represent particles moving in the six possible directions, with bit 0 corresponding to a particle moving with velocity v0 (see Table 14.1). If there are three particles with velocities v0, v2, and v4 at a site and no barrier, then the value of the lattice array element at this site is 00010101 in binary notation. Bit 6 represents a possible rest (stationary) particle. If we want a site to act as a barrier that blocks incoming particles, we set bit 7. For example, a barrier site containing a particle with velocity v1 is represented by 10000010.

The rules for the collisions are given in the declaration of the class variables in class LatticeGas. Because rule is declared static final, we cannot normally overwrite its values. However, an exception is made for static initializers, which are run when the class is first loaded. To construct the rules, we use the bitwise or operator | and named constants for each of the possible states.
As an example, the state corresponding to one particle moving to the right, one moving to the left and down, and one moving to the left and up is given by LU+LD+RI, which we write as LU|LD|RI, or 00010101. The collision rule in Figure 14.5(a) is that this state transforms to one particle moving to the right and down, one moving to the left, and one moving to the right and up. Hence, this collision rule is given by rule[LU|LD|RI] = RU|LE|RD. The other rules are given in a similar way. Stationary particles can also be created or destroyed. For example, what are the states before and after the collision for rule[LU|RI] = RU|S?

To every rule there corresponds a dual rule that flips the bits corresponding to the presence and absence of each particle. This duality means that we need to specify only half of the rules. The dual rules can be constructed by flipping all the bits of the input and of the output. Our convention is to list the rules that start without a stationary particle; the corresponding dual rules are then those that start with a stationary particle. The dual rules are implemented by the statement

rule[i^(RU|LU|LE|LD|RD|RI|S)] = rule[i]^(RU|LU|LE|LD|RD|RI|S);

where ^ is the bitwise exclusive or operator, which equals 1 if the two bits differ and 0 otherwise. Two examples of dual rules are given in Figure 14.6.

Figure 14.6: (a) and (c), and (b) and (d), are duals of each other. An open circle represents the absence of a stationary particle, and a filled circle represents the presence of a stationary particle. Note that the collision rule in (c) is similar to (b), and the collision rule in (d) is similar to (a), but in the opposite direction.

The rules in Figures 14.5(b) and 14.5(c) cycle through the states in a particular direction. Although these rules are straightforward, they are not invariant under reflection. To help eliminate this bias, we cycle in the opposite direction when a stationary particle is present (see Figure 14.6).

We adopt the rule that when a particle moves onto a barrier site, we set the velocity v of this particle equal to −v (see Figure 14.7). Because of our ordering of the velocities, the rule for updating a barrier can be expressed compactly using bit manipulation. Reflection off a barrier is accomplished by shifting the three higher-order velocity bits to the right by three bits (>>>3) and shifting the three lower-order velocity bits to the left by three bits (<<3). Check the rules given in Listing 14.13. Other possibilities are to set the angle of incidence equal to the angle of reflection or to set the velocity to an arbitrary direction. The latter case would correspond to a collision with a rough surface.

Figure 14.7: Example of a collision with a barrier at times t = 0, 1, and 2. At t = 1 the particle moves to the barrier site and then reverses its velocity. The symbol ⊗ denotes a barrier site.
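Before turning to the step method, the bit operations just described can be checked in isolation. The following self-contained snippet (ours, using the abbreviations of Table 14.1 as constant names) verifies the dual-rule transformation and the barrier reflection:

public class LatticeGasBitsDemo {
  static final int RI = 1, RD = 2, LD = 4, LE = 8, LU = 16, RU = 32, S = 64;

  public static void main(String[] args) {
    int state = LU|LD|RI;                              // 00010101: particles with v0, v2, v4
    System.out.println(Integer.toBinaryString(state)); // prints 10101

    // dual state: flip the presence/absence of all seven particles
    int dual = state^(RU|LU|LE|LD|RD|RI|S);
    System.out.println(Integer.toBinaryString(dual));  // prints 1101010 (RD|LE|RU|S)

    // reflection off a barrier: v -> -v for every moving particle
    int lowBits = state&(RI|RD|LD);                    // bits 0-2
    int highBits = state&(LE|LU|RU);                   // bits 3-5
    int reflected = (highBits>>>3)|(lowBits<<3);
    System.out.println(Integer.toBinaryString(reflected)); // prints 101010 (RD|LE|RU)
  }
}

Note that reflecting LU|LD|RI yields RD|RU|LE; that is, each velocity is reversed, as required.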
The step method runs through the entire lattice and moves all the particles. The updated values of the sites are placed in the array newLattice. We then go through the newLattice array, implement the relevant collision rule at each site, and write the results into the array lattice. The movement of the particles is accomplished as follows. Because the even rows are horizontally displaced one-half a lattice spacing from the odd rows, we need to treat odd and even rows separately. In the step method we loop through every other row and update site1 and site2 at the same time. An example will show how this update works. The statement

rght[j-1] |= site1 & RIGHT_DOWN;

means that if there is a particle moving to the right and down at site1, then the bit corresponding to RIGHT_DOWN is added to the site rght (see Figure 14.8). The statement

cent[j] |= site1 & (STATIONARY|BARRIER) | site2 & RIGHT_DOWN;

means that a stationary particle at site1 remains there, and if site1 is a barrier, it remains so. If site2 has a particle moving in the direction RD, then site1 will receive this particle.

Figure 14.8: We update site1 and site2 at the same time. The rows are indexed by j. The dotted line connects sites in the same column.

To maintain a steady flow rate, we add the necessary horizontal momentum to the lattice uniformly after each time step. The procedure is to choose a site at random and determine if it is possible to change the site's horizontal momentum. If so, we remove the left bit and add the right bit, or vice versa. This procedure is accomplished by the statements at the end of the step method.

Listing 14.13: Listing of the LatticeGas class.

package org.opensourcephysics.sip.ch14.latticegas;
import org.opensourcephysics.display.*;
import java.awt.*;
import java.awt.geom.AffineTransform;
import java.awt.geom.Line2D;

public class LatticeGas implements Drawable {
  // input parameters from user
  public double flowSpeed;           // controls pressure
  public double arrowSize;           // size of velocity arrows displayed
  public int spatialAveragingLength; // spatial averaging of velocity
  public int Lx, Ly;                 // linear dimensions of lattice
  public int[][] lattice, newLattice;
  private double numParticles;
  static final double SQRT3_OVER2 = Math.sqrt(3)/2;
  static final double SQRT2 = Math.sqrt(2);
  static final int RIGHT = 1, RIGHT_DOWN = 2, LEFT_DOWN = 4;
  static final int LEFT = 8, LEFT_UP = 16, RIGHT_UP = 32;
  static final int STATIONARY = 64, BARRIER = 128;
  static final int NUM_CHANNELS = 7; // maximum number of particles per site
  static final int NUM_BITS = 8;     // 7 channel bits plus 1 barrier bit per site
  // total number of possible site configurations = 2^8;
  // 1 << 8 means move the zeroth bit over 8 places to the left to the eighth bit
  static final int NUM_RULES = 1<<8;
  static final double ux[] = {1.0, 0.5, -0.5, -1.0, -0.5, 0.5, 0};
  static final double uy[] = {0.0, -SQRT3_OVER2, -SQRT3_OVER2, 0.0, SQRT3_OVER2, SQRT3_OVER2, 0};
  // averaged velocities for every site configuration
  static final double[] vx, vy;
  static final int[] rule;

  static { // set rule table
    // default rule is the identity rule
    rule = new int[NUM_RULES];
    for(int i = 0; i<NUM_RULES; i++) {
      rule[i] = i;
    }
    // ... (the entries for the collisions of Figures 14.4-14.6 and their
    // duals are set here)
    for(int i = BARRIER; i<NUM_RULES; i++) { // barrier sites reverse all velocities
      int highBits = i&(LEFT|LEFT_UP|RIGHT_UP);
      int lowBits = i&(RIGHT|RIGHT_DOWN|LEFT_DOWN);
      rule[i] = (i&(BARRIER|STATIONARY))|(highBits>>>3)|(lowBits<<3);
    }
  }

  static { // set average site velocities
    // for every particle site configuration i, calculate total
    // net velocity and place in vx[i], vy[i]
    vx = new double[NUM_RULES];
    vy = new double[NUM_RULES];
    for(int i = 0; i<NUM_RULES; i++) {
      for(int dir = 0; dir<NUM_CHANNELS; dir++) {
        if((i&(1<<dir))!=0) { // channel dir is occupied
          vx[i] += ux[dir];
          vy[i] += uy[dir];
        }
      }
    }
  }

  public void step() {
    // ... (the loops that move the particles into newLattice and apply the
    // collision rule at every site are not reproduced here)
    // maintain the flow by reversing particles that move against it:
    for(int k = 0; k<Math.abs(flowSpeed)*Ly; k++) { // (loop bound reconstructed)
      int i = (int) (Math.random()*Lx);
      int j = (int) (Math.random()*Ly);
      if((lattice[i][j]&(RIGHT|LEFT))==((flowSpeed>0) ? LEFT : RIGHT)) {
        lattice[i][j] ^= RIGHT|LEFT; // exchange the left and right channel bits
      }
    }
  }
  public void draw(DrawingPanel panel, Graphics g) {
    if(lattice==null) {
      return;
    }
    // if s = 1, draw lattice and particle details explicitly;
    // otherwise average the velocity over an s by s square
    int s = spatialAveragingLength;
    Graphics2D g2 = (Graphics2D) g;
    AffineTransform toPixels = panel.getPixelTransform();
    Line2D.Double line = new Line2D.Double();
    // ... (the loops that draw the barriers and the averaged velocity
    // arrows are not reproduced here)
  }
}

A block in motion is subject to a velocity-dependent friction force F(v) for v > 0, given by (14.19), where the parameter σ represents the drop of the friction force at the onset of the slip. If a block is stuck, the calculation of the static friction force is a bit more involved. If the total force on a block due to the springs is to the right, then the static friction force is set equal and opposite to the total spring force, up to a maximum value of F0. However, if the total spring force is to the left, the static friction is chosen so that the acceleration of the block is zero. Typical values of the parameters are F0 = 1, ℓ = 10, σ = 0.01, α = 2.5, and v0 = 10⁻⁵.

Initially we set u̇_j = 0 for all j and assign small random displacements to all the blocks. The blocks will then move according to (14.18). For simplicity, we set the substrate velocity v = 0, and when all the blocks have become stuck, we move all the blocks to the left by an equal amount such that the total force due to the springs on one block equals unity (F0). This procedure then causes one block to move or slip. As this block moves, other neighboring blocks may move, leading to an earthquake. Eventually, all the blocks will again become stuck. The main quantities of interest are P(s), the distribution of the number of blocks that have moved during an earthquake, and P(M), the distribution of the net displacement of the blocks during an earthquake, where

M = Σ_i ∆u_i. (14.20)

The sum over i in (14.20) is over the blocks involved in an earthquake, and ∆u_i is the net displacement of block i during the earthquake. Do P(s) and P(M) exhibit scaling consistent with the Gutenberg–Richter law? The movement of the blocks represents the slip of the two surfaces of a fault past one another during an earthquake. The stick-slip behavior of this model is similar to that of a real earthquake fault. Other interesting questions are posed in the references (see Klein et al., Ferguson et al., and Mori and Kawamura).

References and Suggestions for Further Reading

Réka Albert and Albert-László Barabási, "Statistical mechanics of complex networks," Rev. Mod. Phys. 74, 47–97 (2002).

Per Bak, How Nature Works (Copernicus Books, 1999). A good read about self-organized critical phenomena from earthquakes to stock markets. Nature is not as simple as Bak believed, but his interest in complex systems spurred many others to become interested.

P. Bak, "Catastrophes and self-organized criticality," Computers in Physics 5 (4), 430 (1991). A good introduction to self-organized critical phenomena.

Per Bak and Michael Creutz, "Fractals and self-organized criticality," in Fractals in Science, Armin Bunde and Shlomo Havlin, editors (Springer–Verlag, 1994).

Per Bak and Kim Sneppen, "Punctuated equilibrium and criticality in a simple model of evolution," Phys. Rev. Lett. 71, 4083 (1993); Henrik Flyvbjerg, Kim Sneppen, and Per Bak, "Mean field theory for a simple model of evolution," Phys. Rev. Lett. 71, 4087 (1993).

P. Bak, C. Tang, and K. Wiesenfeld, "Self-organized criticality," Phys. Rev. A 38, 364–374 (1988).
E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning Ways for Your Mathematical Plays, Vol. 2 (Academic Press, 1982). A discussion of how the Game of Life simulates a universal computer.

Bruce M. Boghosian and C. David Levermore, "A cellular automaton for Burgers' equation," Complex Systems 1, 17–30 (1987). Reprinted in Doolen et al.

D. Challet and Y.-C. Zhang, "Emergence of cooperation and organization in an evolutionary game," Physica A 246, 407–418 (1997), or adap-org/9708006. The authors give the first description of the minority game.

Debashish Chowdhury, Ludger Santen, and Andreas Schadschneider, "Simulation of vehicular traffic: A statistical physics perspective," Computing in Science and Engineering 2 (5), 80–87 (2000).

John W. Clark, Johann Rafelski, and Jeffrey V. Winston, "Brain without mind: Computer simulation of neural networks with modifiable neuronal interactions," Physics Reports 123, 215–273 (1985).

Aaron Clauset, M. E. J. Newman, and Cristopher Moore, "Finding community structure in very large networks," Phys. Rev. E 70, 066111-1–6 (2004). This paper describes a faster algorithm than that discussed in Newman and Girvan.

J. P. Crutchfield and M. Mitchell, "The evolution of emergent computation," Proc. Natl. Acad. Sci. 92, 10742–10746 (1995). The authors use genetic algorithms to evolve a cellular automaton model.

Guillaume Deffuant, Frédéric Amblard, Gérard Weisbuch, and Thierry Faure, "How can extremism prevail? A study based on the relative agreement interaction model," J. Artificial Societies and Social Simulation 5 (4), paper 1 (2002). This paper and others can be found at .

Gary D. Doolen, Uriel Frisch, Brosl Hasslacher, Steven Orszag, and Stephen Wolfram, editors, Lattice Gas Methods for Partial Differential Equations (Addison–Wesley, 1990). A collection of reprints and original articles by many of the leading workers in lattice gas methods.

Stephanie Forrest, editor, Emergent Computation: Self-Organizing, Collective, and Cooperative Phenomena in Natural and Artificial Computing Networks (MIT Press, 1991).

Stephen I. Gallant, Neural Network Learning and Expert Systems (MIT Press, 1993).

M. Gardner, Wheels, Life and Other Mathematical Amusements (W. H. Freeman, 1983).

Peter Grassberger, "Efficient large-scale simulations of a uniformly driven system," Phys. Rev. E 49, 2436–2444 (1994). Grassberger considered substantially larger lattices and longer simulation times than those used by Olami et al. and found that the Olami, Feder, Christensen model does not exhibit power law scaling.

G. Grinstein and C. Jayaprakash, "Simple models of self-organized criticality," Computers in Physics 9, 164 (1995).

G. Grinstein, Terence Hwa, and Henrik Jeldtoft Jensen, "1/f^α noise in dissipative transport," Phys. Rev. A 45, R559–R562 (1992).

B. Hayes, "Computer recreations," Sci. Am. 250 (3), 12–21 (1984). An introduction to cellular automata.

J. E. Hanson and J. P. Crutchfield, "Computational mechanics of cellular automata: An example," Physica D 103, 169–189 (1997). The authors discuss emergence in cellular automata.

Robert Herman, editor, The Theory of Traffic Flow (Elsevier, 1961).

John Hertz, Anders Krogh, and Richard G. Palmer, Introduction to the Theory of Neural Computation (Addison–Wesley, 1991).

J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proc. Natl. Acad. Sci. USA 79, 2554–2558 (1982).
Navot Israeli and Nigel Goldenfeld, "Computational irreducibility and the predictability of complex physical systems," Phys. Rev. Lett. 92, 074105 (2004).

H. M. Jaeger, Chu-heng Liu, and Sidney R. Nagel, "Relaxation at the angle of repose," Phys. Rev. Lett. 62, 40 (1989). These authors discuss experiments on real sandpiles.

W. Klein, C. Ferguson, and J. B. Rundle, "Spinodals and scaling in slider block models," in Reduction and Predictability of Natural Disasters, J. B. Rundle, D. L. Turcotte, and W. Klein, editors (Addison–Wesley, 1995). Also see C. D. Ferguson, W. Klein, and John B. Rundle, "Spinodals, scaling, and ergodicity in a threshold model with long-range stress transfer," Phys. Rev. E 60, 1359–1373 (1999).

J. A. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, 1992).

Chris Langton, "Studying artificial life with cellular automata," Physica D 22, 120–149 (1986). See also Christopher G. Langton, editor, Artificial Life (Addison–Wesley, 1989); Christopher G. Langton, Charles Taylor, J. Doyne Farmer, and Steen Rasmussen, editors, Artificial Life II (Addison–Wesley, 1989); and Christopher G. Langton, editor, Artificial Life III (Addison–Wesley, 1994).

Roger Lewin, Complexity: Life at the Edge of Chaos (University of Chicago Press, 2000). A popular exposition of complexity theory.

Sergei Maslov, Maya Paczuski, and Per Bak, "Avalanches and 1/f noise in evolution and growth models," Phys. Rev. Lett. 73, 2162 (1994).

Stephan Mertens, "Computational complexity for physicists," Computing in Science and Engineering 4 (3), 31–47 (2002).

Takahiro Mori and Hikaru Kawamura, "Simulation study of the one-dimensional Burridge–Knopoff model of earthquakes," J. Geophysical Res. 111, B07302 (2006).

K. Nagel and M. Schreckenberg, "A cellular automaton model for freeway traffic," J. Phys. I France 2, 2221–2229 (1992). Also see .

Kai Nagel, Dietrich E. Wolf, Peter Wagner, and Patrice Simon, "Two-lane traffic rules for cellular automata: A systematic approach," Phys. Rev. E 58, 1425–1437 (1998).

M. E. J. Newman, "The structure and function of complex networks," SIAM Rev. 45, 167–256 (2003).

M. E. J. Newman, "Detecting community structure in networks," Eur. Phys. J. B 38, 321–330 (2004); M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E 69, 026113-1–15 (2004). These papers describe an algorithm for detecting the hierarchical structure of networks.

J. A. Niesse, R. P. White, and H. R. Mayne, "Genetic algorithm approaches to minimum energy geometry of aromatic hydrocarbon clusters," J. Chem. Phys. 108, 2208–2218 (1998).

Z. Olami, H. J. S. Feder, and K. Christensen, "Self-organized criticality in a continuous, nonconservative cellular automaton modeling earthquakes," Phys. Rev. Lett. 68, 1244 (1992).

Suzana Moss de Oliveira, Jorge S. Sá Martins, Paulo Murilo C. de Oliveira, Karen Luz-Burgoa, Armando Ticona, and Thadeu J. P. Penna, "The Penna model for biological aging and speciation," Computing in Science and Engineering 6 (3), 74–81 (2004). Also see Dietrich Stauffer, "The complexity of biological ageing," cond-mat/0310038.

Elaine S. Oran and Jay P. Boris, Numerical Simulation of Reactive Flow, 2nd ed. (Cambridge University Press, 2002). Although much of this book assumes an understanding of fluid dynamics, the discussion of simulation methods and the numerical solution of the differential equations of fluid flow does not require much background.
Michel Peyrard, "Nonlinear dynamics and statistical physics of DNA," Nonlinearity 17, R1–R40 (2004). The author describes a simple mechanical model of DNA [see Figure 10 and Eq. (1)] that is in the same spirit as the Burridge–Knopoff model of earthquakes.

William Poundstone, The Recursive Universe (Contemporary Books, 1985). A book on the Game of Life that attempts to draw analogies between the patterns of Life and ideas of information theory and cosmology.

Derek de Solla Price, "Networks of scientific papers," Science 149, 510–515 (1965); "A general theory of bibliometric and other cumulative advantage processes," J. Amer. Soc. Inform. Sci. 27, 292–306 (1976). Possibly the first description of a scale-free network and the explanation for power law distributions.

Daniel H. Rothman and Stéphane Zaleski, Lattice-Gas Cellular Automata (Cambridge University Press, 1997). This text includes a discussion of fluid flow through porous media as well as the lattice Boltzmann method for simulating fluids. Also see Daniel H. Rothman and Stéphane Zaleski, "Lattice-gas models of phase separation: Interfaces, phase transitions, and multiphase flow," Rev. Mod. Phys. 66, 1417–1479 (1994).

David E. Rumelhart and James L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations (MIT Press, 1986). See also Vol. 2 on applications.

Robert Savit, Radu Manuca, and Rick Riolo, "Adaptive competition, market efficiency, and phase transitions," Phys. Rev. Lett. 82, 2203 (1999). An analysis of the scaling behavior of the minority game.

Herbert A. Simon, "On a class of skew distribution functions," Biometrika 42, 425–440 (1955). An early paper that shows power laws arising from preferential attachment.

V. Sood and S. Redner, "Voter model on heterogeneous graphs," Phys. Rev. Lett. 94, 178701 (2005).

Dietrich Stauffer, "Monte Carlo simulations of Sznajd models," J. Artificial Societies and Social Simulation 5 (1), paper 4 (2002). This paper and other relevant papers can be found at .

Dietrich Stauffer, "Cellular automata," Chapter 9 in Fractals and Disordered Systems, Armin Bunde and Shlomo Havlin, editors (Springer–Verlag, 1991). Also see Dietrich Stauffer, "Programming cellular automata," Computers in Physics 5 (1), 62 (1991).

Daniel L. Stein, editor, Lectures in the Sciences of Complexity, Vol. 1 (Addison–Wesley, 1989); Erica Jen, editor, Lectures in Complex Systems, Vol. 2 (Addison–Wesley, 1990); Daniel L. Stein and Lynn Nadel, editors, Lectures in Complex Systems, Vol. 3 (Addison–Wesley, 1991).

Patrick Sutton and Sheri Boyden, "Genetic algorithms: A general search procedure," Am. J. Phys. 62, 549–552 (1994). This readable paper discusses the application of genetic algorithms to Ising models and function optimization.

K. Sznajd-Weron and J. Sznajd, "Opinion evolution in closed community," Int. J. Mod. Phys. C 11 (6), 1157–1165 (2000).

Tommaso Toffoli and Norman Margolus, Cellular Automata Machines: A New Environment for Modeling (MIT Press, 1987). See also Norman Margolus and Tommaso Toffoli, "Cellular automata machines," in the volume edited by Doolen et al.

D. J. Tritton, Physical Fluid Dynamics, 2nd ed. (Oxford Science Publications, 1988). An excellent introductory text that integrates theory and experiment. Although there is only a brief discussion of numerical work, the text provides the background useful for simulating fluids.
M. Mitchell Waldrop, Complexity: The Emerging Science at the Edge of Order and Chaos (Simon and Schuster, 1992). A popular exposition of complexity theory.

Stephen Wolfram, editor, Theory and Applications of Cellular Automata (World Scientific, 1986). A collection of research papers on cellular automata that range in difficulty from straightforward to specialists only. An extensive annotated bibliography also is given. Two papers in this collection that discuss the classification of one-dimensional cellular automata are S. Wolfram, "Statistical mechanics of cellular automata," Rev. Mod. Phys. 55, 601–644 (1983), and S. Wolfram, "Universality and complexity in cellular automata," Physica D 10, 1–35 (1984).

Stephen Wolfram, A New Kind of Science (Wolfram Media, 2002). This book discusses many important ideas and computer experiments on cellular automata. More information can be found at . An interesting review of this book is given by L. Kadanoff, "Wolfram on cellular automata," Phys. Today 55 (7), 55–56 (2002).

László Zalányi, Gábor Csárdi, Tamás Kiss, Máté Lengyel, Rebecca Warner, Jan Tobochnik, and Péter Érdi, "Properties of a random attachment growing network," Phys. Rev. E 68, 066104-1–9 (2003).

Chapter 15

Monte Carlo Simulations of Thermal Systems

We discuss how to simulate thermal systems using a variety of Monte Carlo methods, including the traditional Metropolis algorithm. Applications to the Ising model and various particle systems are discussed, and more efficient Monte Carlo algorithms are introduced.

15.1 Introduction

The Monte Carlo simulation of the particles-in-a-box problem discussed in Chapter 7 and the molecular dynamics simulations discussed in Chapter 8 exhibited some of the important qualitative features of macroscopic systems, such as the irreversible approach to equilibrium and the existence of equilibrium fluctuations in macroscopic quantities. In this chapter we apply various Monte Carlo methods to simulate the equilibrium properties of thermal systems. These applications will allow us to explore some of the important concepts of statistical mechanics. Due in part to the impact of computer simulations, the applications of statistical mechanics have expanded from the traditional areas of dense gases, liquids, crystals, and simple models of magnetism to the study of complex materials, particle physics, and theories of the early universe. For example, the demon algorithm introduced in Section 15.3 was developed by a physicist interested in lattice gauge theories, which are used to describe the interactions of fundamental particles.

15.2 The Microcanonical Ensemble

We first discuss an isolated system for which the number of particles N, the volume V, and the total energy E are fixed, and external influences such as gravitational and magnetic fields can be ignored. The macrostate of the system is specified by the values of E, V, and N. At the microscopic level, there are many different ways or configurations in which the macrostate (E, V, N) can be realized. A particular configuration or microstate is accessible if its properties are consistent with the specified macrostate.

All we know about the accessible microstates is that their properties are consistent with the known physical quantities of the system. Because we have no reason to prefer one microstate
over another when the system is in equilibrium, it is reasonable to postulate that the system is equally likely to be in any one of its accessible microstates. To make this postulate of equal a priori probabilities more precise, imagine an isolated system with Ω accessible states. The probability P_s of finding the system in microstate s is

P_s = 1/Ω if s is accessible, and P_s = 0 otherwise. (15.1)

The sum of P_s over all Ω states is equal to unity. Equation (15.1) is applicable only when the system is in equilibrium.

The averages of physical quantities can be determined in two ways. In the usual laboratory experiment, the physical quantities of interest are measured over a time interval sufficiently long to allow the system to sample a large number of its accessible microstates. We computed such time averages in Chapter 8, where we used the method of molecular dynamics to compute the time-averaged values of quantities such as the temperature and pressure. An interpretation of the probabilities in (15.1) that is consistent with such a time average is that, during a sequence of observations, P_s yields the fraction of times that a single system is found in a given microstate.

Although time averages are conceptually simple, it is convenient to imagine a collection or ensemble of systems that are identical mental copies characterized by the same macrostate but, in general, by different microstates. In this interpretation, the probabilities in (15.1) describe an ensemble of identical systems, and P_s is the probability that a system in the ensemble is in microstate s. An ensemble of systems specified by E, N, and V is called a microcanonical ensemble. An advantage of ensembles is that statistical averages can be determined by sampling the states according to the desired probability distribution. Much of the power of Monte Carlo methods is that we can devise sampling methods based on a fictitious dynamics that is more efficient than the real dynamics.

Suppose that a physical quantity A has the value A_s when the system is in microstate s. Then the ensemble average of A is given by

⟨A⟩ = Σ_{s=1}^{Ω} A_s P_s, (15.2)

where P_s is given by (15.1).

To illustrate these ideas, consider a one-dimensional system of N noninteracting spins on a lattice. The spins can be in one of two possible directions, which we take to be up or down. The total energy of the system is E = −µB Σ_i s_i, where each lattice site has associated with it a number s_i = ±1, with s_i = +1 for an up spin and s_i = −1 for a down spin; B is the magnetic field, and µ is the magnetic moment of a spin. A particular microstate of the system of spins is specified by the set of variables {s_1, s_2, ..., s_N}. In this case the macrostate of the system is specified by E and N.

E = 4µB:   ↓↓↓↓
E = 2µB:   ↑↓↓↓  ↓↑↓↓  ↓↓↑↓  ↓↓↓↑
E = 0:     ↑↑↓↓  ↑↓↑↓  ↑↓↓↑  ↓↑↑↓  ↓↑↓↑  ↓↓↑↑
E = −2µB:  ↑↑↑↓  ↑↑↓↑  ↑↓↑↑  ↓↑↑↑
E = −4µB:  ↑↑↑↑

Table 15.1: The sixteen microstates for a one-dimensional system of N = 4 noninteracting spins, grouped by the total energy E of each microstate. If the total energy of the system is E = −2µB, then there are four accessible microstates (the fourth row). Hence, in this case the ensemble consists of four systems, each in a different microstate with equal probability.

In Table 15.1 we show the 16 microstates with N = 4. If the total energy E = −2µB, we see that there are four accessible microstates.
Hence, in this case there are four systems in the ensemble, each with equal probability. The enumeration of the systems in the ensemble and their probabilities allows us to calculate ensemble averages for the physical quantities of interest.

Problem 15.1. A simple ensemble average
Consider a one-dimensional system of N = 4 noninteracting spins with total energy E = −2µB. What is the probability P_i that the ith spin is up? Does your answer depend on which spin you choose?

15.3 The Demon Algorithm

We found in Chapter 8 that we can do a time average for a system of many particles with E, V, and N fixed by integrating Newton's equations of motion for each particle and computing the time-averaged values of the physical quantities of interest. How can we do an ensemble average at fixed E, V, and N? And what can we do if there is no equation of motion available? One way would be to enumerate all the accessible microstates and calculate the ensemble average of the desired physical quantities as we did in Table 15.1. This approach is usually not practical because the number of microstates for even a small system is far too large to enumerate. In the spirit of Monte Carlo, we wish to develop a practical method of obtaining a representative sample of the total number of microstates. One possible procedure is to fix N, choose each spin to be up or down at random, and retain the configuration if it has the desired total energy. However, this procedure is very inefficient because most configurations would not have the desired total energy and would have to be discarded.

An efficient Monte Carlo procedure for simulating systems at a given energy was developed by Creutz in the context of lattice gauge theory. Suppose that we add an extra degree of freedom to the original macroscopic system of interest. For historical reasons, this extra degree of freedom is called a demon. The demon transfers energy as it attempts to change the dynamical variables of the system. If the desired change lowers the energy of the system, the excess energy is given to the demon. If the desired change raises the energy of the system, the demon gives the required energy to the system if the demon has sufficient energy. The only constraint is that the demon cannot have negative energy.

We first apply the demon algorithm to a one-dimensional classical system of N noninteracting particles of mass m (an ideal gas). The total energy of the system is E = Σ_i mv_i²/2, where v_i is the velocity of particle i. In general, the demon algorithm is summarized by the following steps:

1. Choose a particle at random and make a trial change in its coordinates.

2. Compute ∆E, the change in the energy of the system due to the change.

3. If ∆E ≤ 0, the system gives the amount |∆E| to the demon, that is, E_d = E_d − ∆E, and the trial configuration is accepted.

4. If ∆E > 0 and the demon has sufficient energy for this change (E_d ≥ ∆E), then the demon gives the necessary energy to the system, that is, E_d = E_d − ∆E, and the trial configuration is accepted. Otherwise, the trial configuration is rejected and the configuration is not changed.

The above steps are repeated until a representative sample of states is obtained. After a sufficient number of steps, the demon and the system will agree on an average energy for each.
The total energy of the system plus the demon remains constant, and because the demon is only one degree of freedom in comparison to the many degrees of freedom of the system, the energy fluctuations of the system will be of order 1/N, which is very small for N ≫ 1.

The ideal gas has a trivial dynamics. That is, because the particles do not interact, their velocities do not change. (The positions of the particles change, but the positions are irrelevant because the energy depends only on the velocities of the particles.) So the use of the demon algorithm is equivalent to a fictitious dynamics that lets us sample the microstates of the system. Of course, we do not need to apply the demon algorithm to an ideal gas because all its properties can be calculated analytically. However, it is a good idea to consider a simple example first.

How do we know that the Monte Carlo simulation of the microcanonical ensemble will yield results equivalent to the time-averaged results of molecular dynamics? The assumption that these two types of averages yield equivalent results is called the quasi-ergodic hypothesis. Although these two averages have not been proven to be identical in general, they have been found to yield equivalent results in all cases of interest.

IdealDemon and IdealDemonApp implement the microcanonical Monte Carlo simulation of the ideal classical gas in one dimension. To change a configuration, we choose a particle at random and change its velocity by a random amount. The parameter mcs, the number of Monte Carlo steps per particle, plays an important role in Monte Carlo simulations. On the average, the demon attempts to change the velocity of each particle once per Monte Carlo step per particle. We frequently refer to the number of Monte Carlo steps per particle as the "time," even though this time has no obvious direct relation to a physical time.

Listing 15.1: The demon algorithm for the one-dimensional ideal gas.

package org.opensourcephysics.sip.ch15;
public class IdealDemon {
  public double v[];
  public int N;
  public double systemEnergy;
  public double demonEnergy;
  public int mcs = 0; // number of MC moves per particle
  public double systemEnergyAccumulator = 0;
  public double demonEnergyAccumulator = 0;
  public int acceptedMoves = 0;
  public double delta; // maximum change in velocity

  public void initialize() {
    v = new double[N]; // array to hold particle velocities
    double v0 = Math.sqrt(2.0*systemEnergy/N);
    for(int i = 0; i<N; i++) {
      v[i] = v0; // same initial velocity for all particles
    }
    demonEnergy = 0;
    resetData();
  }

  public void doOneMCStep() { // (body reconstructed from steps 1-4 in the text)
    for(int j = 0; j<N; ++j) {
      int particleIndex = (int) (Math.random()*N);  // choose a particle at random
      double dv = (2.0*Math.random()-1.0)*delta;    // trial change in its velocity
      double trialVelocity = v[particleIndex]+dv;
      double dE = 0.5*(trialVelocity*trialVelocity-v[particleIndex]*v[particleIndex]);
      if(dE<=demonEnergy) { // accept if the demon can supply (or absorb) dE
        v[particleIndex] = trialVelocity;
        demonEnergy -= dE;
        systemEnergy += dE;
        acceptedMoves++;
      }
      systemEnergyAccumulator += systemEnergy;
      demonEnergyAccumulator += demonEnergy;
    }
    mcs++;
  }

  public void resetData() {
    mcs = 0;
    systemEnergyAccumulator = 0;
    demonEnergyAccumulator = 0;
    acceptedMoves = 0;
  }
}

The target class IdealDemonApp reads the input parameters and reports the averages; its principal methods are shown below (the methods that create the simulation and run the Monte Carlo steps are not reproduced here):

public void stopRunning() { // (enclosing method name reconstructed)
  double norm = 1.0/(idealGas.mcs*idealGas.N); // one attempted move per particle per mcs
  control.println("<Ed> = "+idealGas.demonEnergyAccumulator*norm);
  control.println("<E> = "+idealGas.systemEnergyAccumulator*norm);
  control.println("acceptance ratio = "+idealGas.acceptedMoves*norm);
}

public void reset() {
  control.setValue("Number of particles N", 40);
  control.setValue("desired total energy", 40);
  control.setValue("maximum velocity change", 2.0);
}

public void resetData() {
  idealGas.resetData();
  idealGas.delta = control.getDouble("maximum velocity change");
  control.clearMessages();
}

public static void main(String[] args) {
  SimulationControl control = SimulationControl.createApp(new IdealDemonApp());
  control.addButton("resetData", "Reset Data");
}

Problem 15.2. Monte Carlo simulation of an ideal gas

(a) Use the classes IdealDemon and IdealDemonApp to investigate the equilibrium properties of an ideal gas. Note that the mass of the particles has been set equal to unity and the initial demon energy is zero.
For simplicity, the same initial velocity has been assigned to all the particles. Begin by using the default values given in the listing of IdealDemonApp. What is the mean value of the particle velocities after equilibrium has been reached?

(b) The configuration corresponding to all particles having the same velocity is not very likely, and it would be better to choose an initial configuration that is more likely to occur when the system is in equilibrium. In any case, we should let the system evolve until it has reached equilibrium before we accumulate data for the various averages. We call this time the equilibration or relaxation time. We can estimate the equilibration time from a plot of the demon energy versus the time. Alternatively, we can reset the data until the computed averages stop changing systematically. Clicking the Reset Data button sets the accumulated sums to zero without changing the configuration. Determine the mean demon energy ⟨E_d⟩ and the mean system energy per particle using the default values of the parameters.

(c) Compute the mean energy of the demon and the mean system energy per particle for N = 100 with E = 10 and E = 20, where E is the total energy of the system. Use your result from part (b) to obtain an approximate relation between the mean demon energy and the mean system energy per particle.

(d) In the microcanonical ensemble the total energy is fixed with no reference to the temperature. Define the kinetic temperature by the relation (1/2)m⟨v²⟩ = (1/2)kT_kinetic, where (1/2)m⟨v²⟩ is the mean kinetic energy per particle of the system. Use this relation to obtain T_kinetic. Choose units such that m and Boltzmann's constant k are unity. How is T_kinetic related to the mean demon energy? How do your results compare to the relation given in introductory physics textbooks that the total energy of an ideal gas of N particles in three dimensions is E = (3/2)NkT? (In one dimension the analogous relation is E = (1/2)NkT.)

(e) A limitation of most simulations is the finite number of particles. Is the relation between the mean demon energy and the mean kinetic energy per particle the same for N = 2 and N = 10 as it is for N = 40? If there is no statistically significant difference between your results for the three values of N, explain why finite N might not be an important limitation for the ideal gas in this simulation.

Problem 15.3. Demon energy distribution

(a) Add a method to class IdealDemon to compute the probability P(E_d)∆E_d that the demon has energy between E_d and E_d + ∆E_d. Choose the same parameters as in Problem 15.2, and be sure to determine P(E_d) only after equilibrium has been reached.

(b) Plot the natural logarithm of P(E_d) and verify that ln P(E_d) depends linearly on E_d with a negative slope. What is the absolute value of the slope? How does the inverse of this value correspond to the mean energy of the demon and to T_kinetic as determined in Problem 15.2?

(c) Generalize the IdealDemon class and determine the relation between the mean demon energy, the mean energy per particle of the system, and the inverse of the slope of ln P(E_d) for an ideal gas in two and three dimensions. It is straightforward to write the class so that it is valid for any spatial dimension.

15.4 The Demon as a Thermometer

We found in Problem 15.3 that the form of P(E_d) is given by

P(E_d) ∝ e^(−E_d/kT). (15.3)

We also found that the parameter T in (15.3) is related to the kinetic temperature of an ideal gas.
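One way to carry out the measurement in Problem 15.3(a) is to accumulate a histogram of the demon energy after each attempted move. The following sketch shows what could be added to IdealDemon; the method names, bin width, and histogram size are our choices, not part of the text's listing:

public int[] demonEnergyHistogram = new int[1000]; // histogram of the demon energy
public double binWidth = 0.1;                      // arbitrary bin width ∆Ed

public void accumulateDemonEnergy() {
  int bin = (int) (demonEnergy/binWidth);
  if(bin<demonEnergyHistogram.length) {
    demonEnergyHistogram[bin]++;
  }
}

// P(Ed) is the histogram normalized by the number of entries and the bin width
public double probability(int bin, int totalEntries) {
  return demonEnergyHistogram[bin]/(binWidth*totalEntries);
}

Calling accumulateDemonEnergy once per attempted move after equilibration, and then plotting the logarithm of probability against binWidth times the bin index, yields the linear behavior asked for in part (b).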
In Problem 15.4 we will do some further simulations to determine the generality of the form (15.3).

Problem 15.4. The Boltzmann probability distribution
Modify your simulation of an ideal gas so that the kinetic energy of a particle is proportional to the absolute value of its momentum instead of the square of its momentum. Such a dependence would hold for a relativistic gas in which the particles move at velocities close to the speed of light. Choose various values of the total energy E and the number of particles N. Is the form of P(E_d) the same as in (15.3)? How does the inverse slope of ln P(E_d) versus E_d compare to the mean energy per particle of the system in this case?

According to the equipartition theorem of statistical mechanics, each quadratic degree of freedom contributes (1/2)kT to the mean energy per particle. Problem 15.4 shows that the equipartition theorem is not applicable when the particle energy depends differently on the momentum.

Although the microcanonical ensemble is conceptually simple, it does not represent the situation usually found in nature. Most systems are not isolated, but are in thermal contact with their environment. This thermal contact allows energy to be exchanged between the laboratory system and its environment. The laboratory system is usually small relative to its environment. The larger system with many more degrees of freedom is commonly referred to as the heat reservoir or heat bath. The term heat refers to energy transferred from one body to another due to a difference in temperature. A heat bath is a system for which such an energy transfer causes a negligible change in its temperature. A system that is in equilibrium with a heat bath is characterized by the temperature of the latter.

If we are interested in the equilibrium properties of such a system, we need to know the probability P_s of finding the system in microstate s with energy E_s. The ensemble that describes the probability distribution of a system in thermal equilibrium with a heat bath is known as the canonical ensemble. In general, the canonical ensemble is characterized by the temperature T, the number of particles N, and the volume V, in contrast to the microcanonical ensemble, which is characterized by the energy E, N, and V.

We have already discussed an example of a system in equilibrium with a heat bath: the demon! In Problems 15.2–15.4, the system of interest was an ideal gas, and the demon was an auxiliary (special) particle that facilitated the exchange of energy between the particles of the system. If we take the demon to be the system of interest, we see that the demon exchanges energy with a much bigger system (the ideal gas), which we can take to be the heat bath. We conclude that the probability distribution of the microstates of a system in equilibrium with a heat bath has the same form as the probability distribution of the energy of the demon. (Note that the microstate of the demon is characterized by its energy.) Hence, the probability that a system in equilibrium with a heat bath at temperature T is in microstate s with energy E_s has the form given by (15.3):

P_s = (1/Z) e^(−βE_s) (canonical distribution), (15.4)

where β = 1/kT and Z is a normalization constant. Because Σ_s P_s = 1, Z is given by

Z = Σ_s e^(−E_s/kT). (15.5)

The sum in (15.5) is over all the microstates of the system for a given N and V. The quantity Z is the partition function of the system.
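As a simple illustration of (15.4) and (15.5) (our example, not the text's), consider a single spin with energy ∓µB for the up and down states in a magnetic field B. The partition function has only two terms:

Z = e^(βµB) + e^(−βµB) = 2 cosh(βµB),

so the probability that the spin is up is P↑ = e^(βµB)/(2 cosh(βµB)), which approaches 1 as T → 0 and 1/2 as T → ∞, as expected.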
The ensemble defined by (15.4) is known as the canonical ensemble, and the probability distribution (15.4) is the Boltzmann or canonical distribution. The derivation of the Boltzmann distribution is given in textbooks on statistical mechanics. We will simulate systems in equilibrium with a heat bath in Section 15.6.

The partition function plays a key role in statistical mechanics because the (Helmholtz) free energy F of a system is defined as

$F = -kT \ln Z$.   (15.6)

All thermodynamic quantities can be found from various derivatives of F. In equilibrium the system will be in the state of minimum F for given values of T, V, and N. (This result follows from the second law of thermodynamics, which says that a system with fixed E, V, and N will be in the state of maximum entropy.) We will use the free energy concept in a number of the following sections.

The form (15.4) of P(E_d) provides a simple way of computing the temperature T from the mean demon energy ⟨E_d⟩. The latter is given by

$\langle E_d \rangle = \frac{\int_0^\infty E_d\, e^{-E_d/kT}\, dE_d}{\int_0^\infty e^{-E_d/kT}\, dE_d} = kT$.   (15.7)

We see that T is proportional to the mean demon energy. Note that the result ⟨E_d⟩ = kT in (15.7) holds only if the energy of the demon can take on a continuum of values and if the upper limit of integration can be taken to be ∞.

Figure 15.1: The interaction energy between nearest neighbor spins in the absence of an external magnetic field: E = −J for parallel spins and E = +J for antiparallel spins.

The demon is an excellent example of a thermometer. It has a measurable property, namely its energy, which is proportional to the temperature. Because the demon has only one degree of freedom in comparison to the many degrees of freedom of the system with which it exchanges energy, it disturbs the system as little as possible. For example, the demon could be added to a molecular dynamics simulation and provide an independent measure of the temperature.

15.5 The Ising Model

A popular model of a system of interacting variables is the Ising model. The model was proposed by Lenz and investigated by Ising, his graduate student, to study the phase transition from a paramagnet to a ferromagnet (cf. Brush). Ising calculated the thermodynamic properties of the model in one dimension and found that the model does not have a phase transition. However, for two and three dimensions the Ising model does exhibit a transition. The nature of the phase transition in two dimensions and some of the diverse applications of the Ising model are discussed in Section 15.7.

To introduce the Ising model, consider a lattice containing N sites and assume that each lattice site i has associated with it a number s_i, where s_i = ±1. The s_i are usually referred to as spins. The macroscopic properties of a system are determined by the nature of the accessible microstates. Hence, it is necessary to know the dependence of the energy on the configuration of spins. The total energy E of the Ising model is given by

$E = -J \sum_{i=1}^{N} \sum_{j=\mathrm{nn}(i)} s_i s_j - B \sum_{i=1}^{N} s_i$,   (15.8)

where B is proportional to the uniform external magnetic field. We will refer to B as the magnetic field, even though it includes a factor of µ. The first sum in (15.8) represents the energy of interaction of the spins and is over all nearest neighbor pairs. The exchange constant J is a measure of the strength of the interaction between nearest neighbor spins (see Figure 15.1). The second sum in (15.8) represents the energy of interaction between the magnetic moments of the spins and the external magnetic field.
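As a check of one's understanding of (15.8), the following sketch computes the total energy of a one-dimensional chain of spins with periodic boundary conditions (with J = 1). The method name and the use of a plain int array rather than the LatticeFrame class are our own choices; the method can be dropped into any class as a static helper.

    // Sketch of the energy sum (15.8) for a 1D chain with periodic boundary conditions.
    public static double totalEnergy(int[] spin, double B) {
        int N = spin.length;
        double E = 0;
        for(int i = 0; i < N; i++) {
            E -= spin[i]*spin[(i+1)%N]; // each nearest neighbor pair is counted once
            E -= B*spin[i];             // interaction with the external field
        }
        return E;
    }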
If J > 0, then the states ↑↑ and ↓↓ are energetically favored in comparison to the states ↑↓ and ↓↑. Hence, for J > 0, we expect that the state of lowest total energy is ferromagnetic; that is, the spins all point in the same direction. If J < 0, the states ↑↓ and ↓↑ are favored, and the state of lowest energy is expected to be antiferromagnetic; that is, neighboring spins point in opposite directions. If we subject the spins to an external magnetic field directed upward, the spins ↑ and ↓ possess an additional energy given by −B and +B, respectively.

An important virtue of the Ising model is its simplicity. Some of its simplifying features are that the kinetic energy of the atoms associated with the lattice sites has been neglected, only nearest neighbor contributions to the interaction energy are included, and the spins are allowed to have only two discrete values. In spite of the simplicity of the model, we will find that the Ising model exhibits very interesting behavior.

Because we are interested in the properties of an infinite system, we have to choose appropriate boundary conditions. The simplest boundary condition in one dimension is to choose a free surface so that the spins at sites 1 and N each have only one nearest neighbor interaction. Usually a better choice is periodic boundary conditions. For this choice a one-dimensional lattice becomes a ring, and the spins at sites 1 and N interact with one another and, hence, have the same number of interactions as do the other spins.

What are some of the physical quantities whose averages we wish to compute? An obvious physical quantity is the magnetization M given by

$M = \sum_{i=1}^{N} s_i$,   (15.9)

and the magnetization per spin m = M/N. Usually we are interested in the average values ⟨M⟩ and the fluctuations ⟨M²⟩ − ⟨M⟩².

For the familiar case of classical particles with continuously varying position and velocity coordinates, the dynamics is given by Newton's laws. For the Ising model the dependence (15.8) of the energy on the spin configuration is not sufficient to determine the time-dependent properties of the system. That is, the relation (15.8) does not tell us how the system changes from one configuration to another, and we have to introduce the dynamics separately. This dynamics will take the form of various Monte Carlo algorithms.

We first use the demon algorithm to sample configurations of the Ising model. The implementation of the demon algorithm is straightforward. We first choose a spin at random. The trial change corresponds to a flip of the spin from ↑ to ↓ or from ↓ to ↑. We then compute the change in energy of the system and decide whether to accept or reject the trial change.

We can determine the temperature T as a function of the energy of the system in two ways. One way is to measure the probability that the demon has energy E_d. Because we know that this probability is proportional to exp(−E_d/kT), we can determine T from a plot of the logarithm of the probability as a function of E_d. Another way to determine T is to measure the mean demon energy. However, because the possible values of E_d are not continuous for the Ising model, T is not simply proportional to ⟨E_d⟩ as it is for the ideal gas. We show in Appendix 15A that for B = 0 and in the limit of an infinite system, the temperature is related to ⟨E_d⟩ by

$\frac{kT}{J} = \frac{4}{\ln\left(1 + 4J/\langle E_d \rangle\right)}$.   (15.10)

The result (15.10) comes from replacing the integrals in (15.7) by sums over the possible demon energies.
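The idea behind (15.10) can be sketched as follows (a condensed version of the argument given in Appendix 15A). For the one-dimensional Ising model a single spin flip changes the energy by 0 or ±4J, so the demon energy takes the values E_d = 4nJ with n = 0, 1, 2, … If the demon energy is Boltzmann distributed, then with x = e^{−4βJ},

$\langle E_d \rangle = \frac{\sum_{n=0}^{\infty} 4nJ\, x^n}{\sum_{n=0}^{\infty} x^n} = 4J\,\frac{x}{1-x} = \frac{4J}{e^{4\beta J} - 1},$

where we used the geometric series and its derivative. Solving for β gives $4\beta J = \ln(1 + 4J/\langle E_d\rangle)$, which is (15.10).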
Note that in the limit |J/⟨E_d⟩| ≪ 1, (15.10) reduces to kT = ⟨E_d⟩ as expected.

The IsingDemon class implements the Ising model in one dimension using periodic boundary conditions and the demon algorithm. Once the initial configuration is chosen, the demon algorithm is similar to that described in Section 15.3. However, the spins in the one-dimensional Ising model must be chosen at random. As usual, we will choose units such that J = 1.

Listing 15.3: The implementation of the demon algorithm for the one-dimensional Ising model.

    package org.opensourcephysics.sip.ch15;
    import java.awt.*;
    import org.opensourcephysics.frames.*;

    public class IsingDemon {
        public int[] demonEnergyDistribution;
        int N;                       // number of spins
        public int systemEnergy;
        public int demonEnergy = 0;
        public int mcs = 0;          // number of MC steps per spin
        public double systemEnergyAccumulator = 0;
        public double demonEnergyAccumulator = 0;
        public int magnetization = 0;
        public double mAccumulator = 0, m2Accumulator = 0;
        public int acceptedMoves = 0;
        private LatticeFrame lattice;

        public IsingDemon(LatticeFrame displayFrame) {
            lattice = displayFrame;
        }

        public void initialize(int N) {
            this.N = N;
            lattice.resizeLattice(N, 1); // set lattice size
            lattice.setIndexedColor(1, Color.red);
            lattice.setIndexedColor(-1, Color.green);
            demonEnergyDistribution = new int[N];
            for(int i = 0; i < N; i++) {
                lattice.setValue(i, 0, 1); // all spins up
            }
            magnetization = N;
            int tries = 0;
            int E = -N; // energy of the all spins up configuration
            // flip spins at random until the system has the desired energy
            while((E < systemEnergy) && (tries < 10*N)) {
                int k = (int) (N*Math.random());
                int dE = 2*lattice.getValue(k, 0)
                        *(lattice.getValue((k+1)%N, 0)+lattice.getValue((k-1+N)%N, 0));
                if(dE > 0) {
                    E += dE;
                    int newSpin = -lattice.getValue(k, 0);
                    lattice.setValue(k, 0, newSpin);
                    magnetization += 2*newSpin;
                }
                tries++;
            }
            systemEnergy = E;
            resetData();
        }

        public double temperature() {
            return 4.0/Math.log(1.0+4.0/(demonEnergyAccumulator/(mcs*N))); // relation (15.10)
        }

        public void resetData() {
            mcs = 0;
            systemEnergyAccumulator = 0;
            demonEnergyAccumulator = 0;
            mAccumulator = 0;
            m2Accumulator = 0;
            acceptedMoves = 0;
        }

        public void doOneMCStep() {
            for(int j = 0; j < N; j++) {
                int i = (int) (N*Math.random()); // the spins must be chosen at random
                int dE = 2*lattice.getValue(i, 0)
                        *(lattice.getValue((i+1)%N, 0)+lattice.getValue((i-1+N)%N, 0));
                if(dE <= demonEnergy) {          // accept the flip if the demon can pay
                    int newSpin = -lattice.getValue(i, 0);
                    lattice.setValue(i, 0, newSpin);
                    demonEnergy -= dE;
                    systemEnergy += dE;
                    acceptedMoves++;
                    magnetization += 2*newSpin;
                }
                systemEnergyAccumulator += systemEnergy;
                demonEnergyAccumulator += demonEnergy;
                mAccumulator += magnetization;
                m2Accumulator += magnetization*magnetization;
                demonEnergyDistribution[demonEnergy]++;
            }
            mcs++;
        }
    }

15.6 The Metropolis Algorithm

The method doOneMCStep of class BoltzmannApp implements the Metropolis algorithm for a single particle in one dimension in equilibrium with a heat bath at temperature T: a trial change in the particle's velocity is accepted with probability one if it lowers the energy, and with probability e^{−βΔE} otherwise.

Listing 15.4: The doOneMCStep, reset, and main methods of class BoltzmannApp.

    public void doOneMCStep() {
        double vTrial = velocity+delta*(2.0*Math.random()-1.0); // trial velocity
        double keTrial = 0.5*vTrial*vTrial;
        double dE = keTrial-ke;                                 // trial change in energy
        mcs++;
        if((dE <= 0) || (Math.exp(-dE/temperature) > Math.random())) {
            accepted++;
            ke = keTrial;
            velocity = vTrial;
        }
        velocityDistribution.append(velocity);
        control.clearMessages();
        control.println("mcs = "+mcs);
        control.println("acceptance probability = "+(double) (accepted)/mcs);
    }

    public void reset() {
        control.setValue("Maximum velocity change", 10.0);
        control.setValue("Temperature", 10.0);
        control.setValue("Initial velocity", 0.0);
        enableStepsPerDisplay(true);
    }

    public static void main(String[] args) {
        SimulationControl.createApp(new BoltzmannApp());
    }

Problem 15.8. Simulation of a particle in equilibrium with a heat bath

(a) Choose the temperature T = 10, the initial velocity equal to zero, and the maximum change in the particle's velocity to be δ = 10.0. Run for a number of Monte Carlo steps until a plot of ln P(v) versus v is reasonably smooth. Describe the qualitative form of P(v). (Remember that the velocity v can be either positive or negative.)

(b) Because the velocity of the particle characterizes the microstate of this single particle system, we need to plot ln P(E_s) versus E_s = mv_s²/2 to test whether the Metropolis algorithm yields the Boltzmann distribution in this case. (The two values of v, one positive and one negative, for each value of E correspond to different microstates.)
Add code to BoltzmannApp to compute P(E_s) and determine the slope of ln P(E_s) versus E_s. The code for extracting information from the HistogramFrame class is given on page 206. Is this slope equal to −β = −1/T, where T is the temperature of the heat bath?

(c) Add code to compute the mean energy and velocity. How do your results for the mean energy compare to the exact value? Explain why the computed mean particle velocity is approximately zero regardless of the value of the initial particle velocity. To ensure that your results do not depend on the initial conditions, repeat the computation for a different initial velocity. Do your equilibrium results differ from what you found previously?

(d) Add another HistogramFrame object to compute the probability P(E)ΔE, where E is the energy of the configuration. Does P(E) have the form of a Boltzmann distribution? If not, what is the functional form of P(E)?

(e) The acceptance probability is the fraction of trial moves that are accepted. What is the effect of changing the value of δ on the acceptance probability?

Problem 15.9. Planar spin in an external magnetic field

(a) Consider a classical planar magnet with magnetic moment µ₀. The magnet can be oriented in any direction in the x-y plane, and the energy of interaction of the magnet with an external magnetic field B is −µ₀B cos φ, where φ is the angle between the moment and B. Write a Monte Carlo program to sample the microstates of this system in thermal equilibrium with a heat bath at temperature T. Compute the mean energy as a function of the ratio βµ₀B.

(b) Compute the probability density P(φ) and analyze its dependence on the energy.

In Problem 15.10 we consider the Monte Carlo simulation of a classical ideal gas of N particles in equilibrium with a heat bath. It is convenient to say that one time unit or one Monte Carlo step per particle (mcs) has elapsed after N particles have had a chance to change their coordinates. If the particles are chosen at random, then during one Monte Carlo step per particle, some particles might not be chosen, but all particles will be chosen equally on the average. The advantage of this definition is that the time is independent of the number of particles. However, this definition of time has no obvious relation to a physical time.

Problem 15.10. Simulation of an ideal gas in one dimension

(a) Modify class BoltzmannApp to simulate an ideal gas of N particles in one dimension. For simplicity, assume that all particles have the same initial velocity of 10. Let N = 20 and T = 10 and consider at least 2000 Monte Carlo steps per particle. Choose the value of δ so that the acceptance probability is approximately 40%. What are the mean kinetic energy and mean velocity of the particles?

(b) We might expect the total energy of an ideal gas to remain constant because the particles do not interact with one another and, hence, cannot exchange energy directly. What is the value of the initial total energy of the system in part (a)? Does the total energy remain constant? If not, explain how the energy changes.

(c) What is the nature of the time dependence of the total energy starting from the initial condition in part (a)? Estimate the number of Monte Carlo steps per particle necessary for the system to reach thermal equilibrium by computing a moving average of the total energy over a fixed time interval. Does this average change with time after a sufficient time has elapsed?
What choice of the initial velocities allows the system to reach thermal equilibrium at temperature T as quickly as possible?

(d) Compute the probability P(E)ΔE for the system of N particles to have a total energy between E and E + ΔE. Plot P(E) as a function of E and describe its qualitative behavior. Does P(E) have the form of the Boltzmann distribution? If not, describe the qualitative features of P(E) and determine its functional form.

(e) Compute the mean energy for T = 10, 20, 40, 80, and 120 and estimate the heat capacity from its definition C = ∂⟨E⟩/∂T.

(f) Compute the mean square energy fluctuations ⟨(ΔE)²⟩ = ⟨E²⟩ − ⟨E⟩² for T = 10 and T = 40. Compare the magnitude of the ratio ⟨(ΔE)²⟩/T² with the heat capacity determined in part (e).

You might have been surprised to find in Problem 15.10(d) that the form of P(E) is a Gaussian centered about the mean energy of the system. What is the relation of this form of P(E) to the central limit theorem (see Problem 7.15)? If the microstates are distributed according to the Boltzmann probability, why is the total energy distributed according to the Gaussian distribution?

15.7 Simulation of the Ising Model

You are probably familiar with ferromagnetic materials, such as iron and nickel, which exhibit a spontaneous magnetization in the absence of an applied magnetic field. This nonzero magnetization occurs only if the temperature is less than a well-defined temperature known as the Curie or critical temperature T_c. For temperatures T > T_c, the magnetization vanishes. Hence, T_c separates the disordered phase for T > T_c from the ferromagnetic phase for T < T_c.

The origin of magnetism is quantum mechanical in nature, and its study is of much experimental and theoretical interest. However, the study of simple classical models of magnetism has provided much insight. The two- and three-dimensional Ising model is the most commonly studied classical model and is particularly useful in the neighborhood of the magnetic phase transition.

The thermal quantities of interest for the Ising model include the mean energy ⟨E⟩ and the heat capacity C. One way to determine C at constant external magnetic field is from its definition C = ∂⟨E⟩/∂T. An alternative way is to relate C to the statistical fluctuations of the total energy in the canonical ensemble (see Appendix 15B):

$C = \frac{1}{kT^2}\left[\langle E^2\rangle - \langle E\rangle^2\right]$   (canonical ensemble).   (15.19)

Another quantity of interest is the mean magnetization ⟨M⟩ and the corresponding zero field magnetic susceptibility:

$\chi = \left.\frac{\partial \langle M\rangle}{\partial B}\right|_{B=0}$.   (15.20)

The zero field magnetic susceptibility χ is an example of a linear response function, because it measures the ability of a spin to respond to a change in the external magnetic field. In analogy to the heat capacity, χ is related to the fluctuations of the magnetization (see Appendix 15C):

$\chi = \frac{1}{kT}\left[\langle M^2\rangle - \langle M\rangle^2\right]$,   (15.21)

where ⟨M⟩ and ⟨M²⟩ are evaluated in zero external magnetic field. The relations (15.19) and (15.21) are examples of the general relation between linear response functions and equilibrium fluctuations.

The Metropolis algorithm was stated in Section 15.6 as a method for generating states with the desired Boltzmann probability, but the flipping of single spins can also be interpreted as a reasonable approximation to the real dynamics of an anisotropic magnet whose spins are coupled to the vibrations of the lattice.
The coupling leads to random spin flips, and we expect that one Monte Carlo step per spin is proportional to the average time between single spin flips observed in the laboratory. Hence, we can regard single spin flips as a time dependent process and observe the relaxation to equilibrium. In the following, we will frequently refer to the application of the Metropolis algorithm to the Ising model as single spin flip dynamics.

In Problem 15.11 we use the Metropolis algorithm to simulate the one-dimensional Ising model. Note that the parameters J and kT do not appear separately but appear together in the dimensionless ratio J/kT. Unless otherwise stated, we measure temperature in units of J/k and set B = 0.

Problem 15.11. One-dimensional Ising model

(a) Write a program to simulate the one-dimensional Ising model in equilibrium with a heat bath. Modify method doOneMCStep in IsingDemon (see class IsingDemon on page 591 or class Ising on page 602). Use periodic boundary conditions. Assume that the external magnetic field is zero. Draw the microscopic state (configuration) of the system after each Monte Carlo step per spin.

(b) Choose N = 20 and T = 1 and start with all spins up. What is the initial effective temperature of the system? Run for at least 1000 mcs, where mcs is the number of Monte Carlo steps per spin. Visually inspect the configuration of the system after each Monte Carlo step per spin and estimate the time it takes for the system to reach equilibrium. Does the sign of the magnetization change during the simulation? Increase N and estimate the time for the system to reach equilibrium and for the magnetization to change sign.

(c) Change the initial condition so that the orientation of each spin is chosen at random. What is the initial effective temperature of the system in this case? Estimate the time it takes for the system to reach equilibrium.

(d) Choose N = 50 and determine ⟨E⟩, ⟨E²⟩, and ⟨M²⟩ as a function of T in the range 0.1 ≤ T ≤ 5. Plot ⟨E⟩ as a function of T and discuss its qualitative features. Compare your computed results for ⟨E⟩ to the exact result (for B = 0)

$E(T) = -N \tanh \beta J$.   (15.22)

Use the relation (15.19) to determine the T-dependence of C.

(e) As you probably noticed in part (b), the system can overturn completely during a long run, and thus the value of ⟨M⟩ can vary widely from run to run. Because ⟨M⟩ = 0 for T > 0 for the one-dimensional Ising model, it is better to assume ⟨M⟩ = 0 and compute χ from the relation χ = ⟨M²⟩/kT. Use this form of (15.21) to estimate the T-dependence of χ.

(f) One of the best laboratory realizations of a one-dimensional Ising ferromagnet is a chain of bichloride-bridged Fe²⁺ ions known as FeTAC (see Greeney et al.). Measurements of χ yield a value of the exchange interaction J given by J/k = 17.4 K. (Note that experimental values of J are typically given in temperature units.) Use this value of J to plot your Monte Carlo results for χ versus T with T given in kelvin. At what temperature is χ a maximum for FeTAC?

(g) Is the acceptance probability an increasing or decreasing function of T? Does the Metropolis algorithm become more or less efficient as the temperature is lowered?

(h) Compute the probability P(E) for a system of N = 50 spins at T = 1. Run for at least 1000 mcs. Plot ln P(E) versus (E − ⟨E⟩)² and discuss its qualitative features.

We next apply the Metropolis algorithm to the Ising model on the square lattice. The Ising class is listed in the following.
Listing 15.5: The Ising class.

    package org.opensourcephysics.sip.ch15;
    import java.awt.*;
    import org.opensourcephysics.frames.*;

    public class Ising {
        public static final double criticalTemperature = 2.0/Math.log(1.0+Math.sqrt(2.0));
        public int L = 32;
        public int N = L*L;                  // number of spins
        public double temperature = criticalTemperature;
        public int mcs = 0;                  // number of MC moves per spin
        public int energy;
        public double energyAccumulator = 0;
        public double energySquaredAccumulator = 0;
        public int magnetization = 0;
        public double magnetizationAccumulator = 0;
        public double magnetizationSquaredAccumulator = 0;
        public int acceptedMoves = 0;
        private double[] w = new double[9];  // array to hold Boltzmann factors
        public LatticeFrame lattice;

        public void initialize(int L, LatticeFrame displayFrame) {
            lattice = displayFrame;
            this.L = L;
            N = L*L;
            lattice.resizeLattice(L, L);     // set lattice size
            lattice.setIndexedColor(1, Color.red);
            lattice.setIndexedColor(-1, Color.green);
            for(int i = 0; i < L; ++i) {
                for(int j = 0; j < L; ++j) {
                    lattice.setValue(i, j, 1); // all spins up
                }
            }
            magnetization = N;
            energy = -2*N;                   // energy of the all spins up configuration
            resetData();
            w[8] = Math.exp(-8.0/temperature); // other positive values of dE do not occur
            w[4] = Math.exp(-4.0/temperature);
        }

        public void resetData() {
            mcs = 0;
            energyAccumulator = 0;
            energySquaredAccumulator = 0;
            magnetizationAccumulator = 0;
            magnetizationSquaredAccumulator = 0;
            acceptedMoves = 0;
        }

        public void doOneMCStep() {
            for(int k = 0; k < N; ++k) {
                int i = (int) (L*Math.random());
                int j = (int) (L*Math.random());
                int dE = 2*lattice.getValue(i, j)
                        *(lattice.getValue((i+1)%L, j)+lattice.getValue((i-1+L)%L, j)
                         +lattice.getValue(i, (j+1)%L)+lattice.getValue(i, (j-1+L)%L));
                if((dE <= 0) || (w[dE] > Math.random())) {
                    int newSpin = -lattice.getValue(i, j);
                    lattice.setValue(i, j, newSpin);
                    acceptedMoves++;
                    energy += dE;
                    magnetization += 2*newSpin;
                }
            }
            energyAccumulator += energy;
            energySquaredAccumulator += energy*energy;
            magnetizationAccumulator += magnetization;
            magnetizationSquaredAccumulator += magnetization*magnetization;
            mcs++;
        }
    }

One of the most time consuming parts of the Metropolis algorithm is the calculation of the exponential function e^{−βΔE}. Because there are only a small number of possible values of βΔE for the Ising model (see Figure 15.1), we store the small number of different probabilities for the spin flips in the array w. The values of this array are computed in method initialize.

To implement the Metropolis algorithm, we determine the change in the energy ΔE and then accept the trial flip if ΔE ≤ 0. If this condition is not satisfied, we generate a random number in the unit interval and compare it to e^{−βΔE}. We can use a single if statement for these two conditions because in Java (and C/C++) the second condition of an || (or) statement is evaluated only if the first is false. This feature is very useful because we do not want to generate random numbers when they are not needed, as is the case for ΔE ≤ 0. (The same feature holds for the compound && (and) statement, for which the second condition is evaluated only if the first is true.)

A typical laboratory system has at least 10¹⁸ spins. In contrast, the number of spins that can be simulated typically ranges from 10³ to 10⁹. As we have discussed in other contexts, the use of periodic boundary conditions minimizes finite size effects. However, more sophisticated boundary conditions are sometimes convenient. For example, we can give the surface spins extra neighbors whose direction is related to the mean magnetization of the microstate (see Saslow).

In class Ising, data for the values of the physical observables are accumulated after each Monte Carlo step per spin. The optimum time for sampling various physical quantities is explored in Problem 15.13. Note that if a flip is rejected, the old configuration is retained. Thermal equilibrium is not described properly unless the old configuration is again included in computing the averages.
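For the target class of Problem 15.12 below, the thermodynamic averages can be obtained from the accumulators of class Ising. The following methods are a sketch of one way to do so, based on (15.19) and (15.21) with k = 1; the method names are our own and are not required by class Ising.

    // Sketch of per-spin averages computed from the accumulators of class Ising.
    public double specificHeat() {
        double energySquaredAverage = energySquaredAccumulator/mcs;
        double energyAverage = energyAccumulator/mcs;
        double heatCapacity = energySquaredAverage - energyAverage*energyAverage;
        heatCapacity /= (temperature*temperature); // relation (15.19) with k = 1
        return heatCapacity/N;                     // per spin
    }

    public double susceptibility() {
        double magnetizationSquaredAverage = magnetizationSquaredAccumulator/mcs;
        double magnetizationAverage = magnetizationAccumulator/mcs;
        // relation (15.21) with k = 1, per spin
        return (magnetizationSquaredAverage - magnetizationAverage*magnetizationAverage)
               /(temperature*N);
    }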
Achieving thermal equilibrium can account for a substantial fraction of the total run time for very large systems. The most practical choice of initial conditions in these cases is a configuration from a previous run that is at a temperature close to the desired temperature. The code for reading and saving configurations can be found in Appendix 8A.

Problem 15.12. Equilibration of the two-dimensional Ising model

(a) Write a target class that uses class Ising and plots the magnetization and energy as a function of the number of Monte Carlo steps. Your program should also display the mean magnetization, the mean energy, the specific heat, the susceptibility, and the acceptance probability when the simulation is stopped. Averages such as the mean energy and the susceptibility should be normalized by the number of spins so that it is easy to compare systems with different values of N. Choose the linear dimension L = 32 and the heat bath temperature T = 2. Estimate the time needed to equilibrate the system given that all the spins are initially up.

(b) Visually determine whether the spin configurations are "ordered" or "disordered" at T = 2 after equilibrium has been established.

(c) Repeat part (a) with the initial direction of each spin chosen at random. Make sure you explicitly compute the initial energy and magnetization in initialize. Does the equilibration time increase or decrease?

(d) Repeat parts (a)–(c) for T = 2.5.

Problem 15.13. Comparison with exact results

In general, a Monte Carlo simulation yields exact answers only after an infinite number of configurations have been sampled. How then can we be sure that our program works correctly and that our results are statistically meaningful? One way is to reproduce exact results in known limits. In the following, we test class Ising by considering a small system for which the mean energy and magnetization can be calculated analytically.

(a) Calculate analytically the T-dependence of E, M, C, and χ for the Ising model on the square lattice with L = 2. (A summary of the calculation is given in Appendix 15C. For simplicity, we have omitted the brackets denoting the thermal averages.) A brute-force numerical check by direct enumeration is sketched after this problem.

(b) Simulate the Ising model with L = 2 and estimate E, M, C, and χ for T = 0.5 and 0.25. Use the relation (15.19) to compute C. Compare your estimated values to the exact results found in part (a). Approximately how many Monte Carlo steps per spin are necessary to obtain E and M to within 1%? How many Monte Carlo steps per spin are necessary to obtain C to within 1%?

(c) Choose L = 4 and the direction of each spin at random and equilibrate the system at T = 3. Look at the time series of M and E after every Monte Carlo step per spin and estimate how often M changes sign. Does E change sign when M changes sign? How often does M change sign for L = 8 and L = 32 (and T = 3)? Although the direction of the spins is initially chosen at random, it is likely that the number of up spins will not exactly cancel the number of down spins. Is that statement consistent with your observations? If the net number of spins is up, how long does the net magnetization remain positive for a given value of L?

(d) The calculation of χ is more complicated because the sign of M can change during the simulation for smaller values of L. Compare your results for χ from using (15.21) as written and from using (15.21) with ⟨M⟩ replaced by ⟨|M|⟩. Which way of computing χ gives more accurate results?
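The following self-contained sketch (our own, not part of the text's classes) checks the analytic calculation of Problem 15.13(a) by enumerating all 2⁴ = 16 microstates of the L = 2 lattice with periodic boundary conditions (J = 1, B = 0, k = 1). On a 2 × 2 lattice with periodic boundary conditions each nearest neighbor bond is counted twice.

    // Exact enumeration of the 2 x 2 Ising model with periodic boundary conditions.
    public class ExactIsing2x2 {
        public static void main(String[] args) {
            double T = 2.0; // illustrative temperature
            double Z = 0, Esum = 0, E2sum = 0;
            for(int state = 0; state < 16; state++) {
                int[] s = new int[4];
                for(int k = 0; k < 4; k++) {
                    s[k] = (((state >> k) & 1) == 1) ? 1 : -1; // decode the microstate
                }
                // sites indexed 0..3; periodic boundary conditions double each bond
                double E = -2.0*(s[0]*s[1] + s[2]*s[3] + s[0]*s[2] + s[1]*s[3]);
                double boltz = Math.exp(-E/T);
                Z += boltz;
                Esum += E*boltz;
                E2sum += E*E*boltz;
            }
            double meanE = Esum/Z, meanE2 = E2sum/Z;
            double C = (meanE2 - meanE*meanE)/(T*T); // relation (15.19)
            System.out.println("<E> = " + meanE + ", C = " + C);
        }
    }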
Now that you have checked your program and obtained typical equilibrium configurations, we consider in more detail the calculation of the mean values of the physical quantities of interest. Suppose we wish to compute the mean value of the physical quantity A. In some cases, the calculation of A for a given configuration is time consuming, and we do not want to compute its value more often than necessary. For example, we would not compute A after the flip of only one spin because the values of A in the two configurations would be almost the same. Ideally, we wish to compute A for configurations that are statistically independent. Because we do not know a priori the mean number of spin flips needed to obtain configurations that are statistically independent, it is a good idea to estimate this time in your preliminary calculations.

One way to estimate the time interval over which configurations are correlated is to compute the time displaced autocorrelation function C_A(t) defined as

$C_A(t) = \frac{\langle A(t+t_0)A(t_0)\rangle - \langle A\rangle^2}{\langle A^2\rangle - \langle A\rangle^2}$,   (15.23)

where A(t) is the value of the quantity A at time t. The averages in (15.23) are over all possible time origins t₀. Because the choice of the time origin is arbitrary for an equilibrium system, C_A depends only on the time difference t rather than on t and t₀ separately. For sufficiently large t, A(t) and A(0) become uncorrelated, and hence ⟨A(t+t₀)A(t₀)⟩ → ⟨A(t+t₀)⟩⟨A(t₀)⟩ = ⟨A⟩². Hence C_A(t) → 0 as t → ∞. Also, C_A(t = 0) is normalized to unity. In general, C_A(t) decays exponentially with t with a decay or correlation time τ_A whose magnitude depends on the choice of the physical quantity A as well as on the physical parameters of the system, for example, the temperature. The time dependence of the two most common correlation functions, C_M(t) and C_E(t), is investigated in Problem 15.14.

As an example of the calculation of C_E(t), consider the equilibrium time series for E for the L = 4 Ising model on the square lattice at T = 4:

−4, −8, 0, −8, −20, −4, 0, 0, −24, −32, −24, −24, −8, −8, −16, −12.

The averages of E and E² over these sixteen values are ⟨E⟩ = −12, ⟨E²⟩ = 240, and ⟨E²⟩ − ⟨E⟩² = 96. We wish to compute ⟨E(t)E(0)⟩ for all possible choices of the time origin. For example, ⟨E(4)E(0)⟩ is given by

$\langle E(4)E(0)\rangle = \frac{1}{12}\big[(-20)(-4) + (-4)(-8) + (0)(0) + (0)(-8) + (-24)(-20) + (-32)(-4) + (-24)(0) + (-24)(0) + (-8)(-24) + (-8)(-32) + (-16)(-24) + (-12)(-24)\big]$.   (15.24)

We averaged over the twelve possible choices of the origin for the time difference t = 4. Verify that ⟨E(4)E(0)⟩ = 460/3 and C_E(4) = 7/72.

To implement this procedure on a computer, we could store the time series in memory, if it is not too long, or save it in a data file. You can save the data for M(t) and E(t) by pressing the Save XML menu item under the File menu on the frame containing the plots for M(t) and E(t). The class IsingAutoCorrelatorApp in Listing 15.6 reads in data created by the IsingApp class. Method computeCorrelation computes the mean and mean square of the magnetization and the energy, which are needed to compute C_M and C_E as defined in (15.23). It then computes the time displaced autocorrelation function for all possible choices of t₀.
Listing 15.6: Listing of the class for computing the autocorrelation functions of M and E.

    package org.opensourcephysics.sip.ch15;
    import java.util.*;
    import javax.swing.*;
    import org.opensourcephysics.controls.*;
    import org.opensourcephysics.display.*;
    import org.opensourcephysics.frames.*;

    public class IsingAutoCorrelatorApp extends AbstractCalculation {
        PlotFrame plotFrame = new PlotFrame("tau", "C_M and C_E", "Time correlations");
        double[] energy = new double[0], magnetization = new double[0];
        int numberOfPoints;

        public void calculate() {
            computeCorrelation(control.getInt("Maximum time interval, tau"));
        }

        public void readXMLData() {
            energy = new double[0];
            magnetization = new double[0];
            numberOfPoints = 0;
            String filename = "ising_data.xml";
            JFileChooser chooser = OSPFrame.getChooser();
            int result = chooser.showOpenDialog(null);
            if(result==JFileChooser.APPROVE_OPTION) {
                filename = chooser.getSelectedFile().getAbsolutePath();
            } else {
                return;
            }
            XMLControlElement xmlControl = new XMLControlElement(filename);
            if(xmlControl.failedToRead()) {
                control.println("failed to read: "+filename);
            } else {
                // gets the datasets in the xml file
                Iterator it = xmlControl.getObjects(Dataset.class, false).iterator();
                while(it.hasNext()) {
                    Dataset dataset = (Dataset) it.next();
                    if(dataset.getName().equals("magnetization")) {
                        magnetization = dataset.getYPoints();
                    }
                    if(dataset.getName().equals("energy")) {
                        energy = dataset.getYPoints();
                    }
                }
                numberOfPoints = magnetization.length;
                control.println("Reading: "+filename);
                control.println("Number of points = "+numberOfPoints);
            }
            calculate();
            plotFrame.repaint();
        }

        public void computeCorrelation(int tauMax) {
            plotFrame.clearData();
            double energyAccumulator = 0, magnetizationAccumulator = 0;
            double energySquaredAccumulator = 0, magnetizationSquaredAccumulator = 0;
            for(int t = 0; t < numberOfPoints; t++) {
                energyAccumulator += energy[t];
                magnetizationAccumulator += magnetization[t];
                energySquaredAccumulator += energy[t]*energy[t];
                magnetizationSquaredAccumulator += magnetization[t]*magnetization[t];
            }
            double averageEnergySquared = Math.pow(energyAccumulator/numberOfPoints, 2);
            double averageMagnetizationSquared = Math.pow(magnetizationAccumulator/numberOfPoints, 2);
            // normalization so that C(t = 0) = 1
            double normE = energySquaredAccumulator/numberOfPoints - averageEnergySquared;
            double normM = magnetizationSquaredAccumulator/numberOfPoints - averageMagnetizationSquared;
            for(int tau = 1; tau <= tauMax; tau++) {
                double c_MAccumulator = 0;
                double c_EAccumulator = 0;
                int counter = 0;
                for(int t = 0; t < numberOfPoints-tau; t++) { // average over time origins
                    c_MAccumulator += magnetization[t]*magnetization[t+tau];
                    c_EAccumulator += energy[t]*energy[t+tau];
                    counter++;
                }
                plotFrame.append(0, tau, ((c_MAccumulator/counter)-averageMagnetizationSquared)/normM);
                plotFrame.append(1, tau, ((c_EAccumulator/counter)-averageEnergySquared)/normE);
            }
            plotFrame.setVisible(true);
        }

        public void reset() {
            control.setValue("Maximum time interval, tau", 20);
            readXMLData();
        }

        public static void main(String[] args) {
            CalculationControl.createApp(new IsingAutoCorrelatorApp());
        }
    }

Problem 15.14. Correlation times

(a) As a check on IsingAutoCorrelatorApp, use the time series for E given in the text to do a hand calculation of C_E(t) in the way that it is computed in the computeCorrelation method.

(b) Use class IsingAutoCorrelatorApp to compute the equilibrium values of C_M(t) and C_E(t). Save the values of the magnetization and energy only after the system has reached equilibrium. Estimate the correlation times from the energy and the magnetization correlation functions for L = 8 and T = 3, T = 2.3, and T = 2. One way to determine τ is to fit C(t) to the exponential form C(t) ∼ e^{−t/τ}. Another way is to define the integrated correlation time as

$\tau = \sum_{t=1} C(t)$.   (15.25)

The sum is cut off at the first negative value of C(t). Are the negative values of C(t) physically meaningful? How does the behavior of C(t) change if you average your results over longer runs? How do your estimates for the correlation times compare with your estimates of the relaxation time found in Problem 15.12? Why would the term "decorrelation time" be more appropriate than "correlation time"? Are the correlation times τ_M and τ_E comparable?
(c) To simulate the relaxation to equilibrium as realistically as possible, we have randomly selected the spins to be flipped. However, if we are interested only in equilibrium properties, it might be possible to save computer time by selecting the spins sequentially. Determine whether the correlation time is greater, smaller, or approximately the same if the spins are chosen sequentially rather than randomly. If the correlation time is greater, does it still save CPU time to choose spins sequentially? Why is it not desirable to choose spins sequentially in the one-dimensional Ising model?

How can we quantify the accuracy of our measurements, for example, the accuracy of the estimated mean energy? As discussed in Chapter 11, the usual measure of the accuracy is the standard deviation of the mean. If we make n measurements of E, then the most probable error in ⟨E⟩ is given by

$\sigma_m = \frac{\sigma}{\sqrt{n}}$,   (15.26)

where the standard deviation σ is defined as

$\sigma^2 = \langle E^2\rangle - \langle E\rangle^2$.   (15.27)

The difficulty is that, in general, our measurements of the time series E_i are not independent but are correlated. Hence, σ_m as given by (15.26) is an underestimate of the actual error.

∗Problem 15.15. Estimate of errors

One way to determine whether the measurements are independent is to compute the correlation time. Another way is based on the idea that the magnitude of the error should not depend on how we group the data (see Section 11.4). For example, suppose that we group every two data points to form n/2 new data points $E_i^{(2)}$ given by $E_i^{(g=2)} = \frac{1}{2}[E_{2i-1} + E_{2i}]$. If we replace n by n/2 and E by E^{(2)} in (15.26) and (15.27), we would find the same value of σ_m as before, provided that the original E_i are independent. If the computed σ_m is not the same, we continue this averaging process until σ_m calculated from

$E_i^{(g)} = \frac{1}{2}\left[E_{2i-1}^{(g/2)} + E_{2i}^{(g/2)}\right] \qquad (g = 2, 4, 8, \ldots)$   (15.28)

is approximately the same as that calculated from E^{(g/2)}.

(a) Use this averaging method to estimate the errors in your measurements of ⟨E⟩ and ⟨M⟩. Choose L = 8, T = T_c = 2/ln(1 + √2) ≈ 2.269, and mcs ≥ 16384, and calculate averages after every Monte Carlo step per spin after the system has equilibrated. (The significance of T_c will be explored in Section 15.8.) A rough measure of the correlation time is the number of terms in the time series that need to be averaged for σ_m to be approximately unchanged. What is the qualitative dependence of the correlation time on T − T_c?

(b) Repeat for L = 16. Do you need more Monte Carlo steps than in part (a) to obtain statistically independent data? If so, why?

(c) The exact value of E/N for the Ising model on a square lattice with L = 16 and T = T_c = 2/ln(1 + √2) is E/N = −1.45306 (to five decimal places). This exact result allows us to determine the actual error in this case. Compute ⟨E⟩ by averaging E after each Monte Carlo step per spin for mcs ≥ 10⁶. Compare your actual error to the estimated error given by (15.26) and (15.27) and discuss their relative values.

15.8 The Ising Phase Transition

Now that we have tested our program for the two-dimensional Ising model, we explore some of its properties.

Problem 15.16. Qualitative behavior of the two-dimensional Ising model

(a) Use class Ising and your version of IsingApp to compute the mean magnetization, the mean energy, the heat capacity, and the susceptibility.
Because we will consider the Ising model for different values of L, it will be convenient to convert these quantities to intensive quantities such as the mean energy per spin, the specific heat (per spin), and the susceptibility per spin. For simplicity, we will use the same notation for both the extensive and the corresponding intensive quantities. Choose L = 4 and consider T in the range 1.5 ≤ T ≤ 3.5 in steps of ∆T = 0.2. Choose the initial condition at T = 3.5 such that the orientation of the spins is chosen at random. Because all the spins might overturn and the magnetization would change sign during the course of your observation, estimate the mean value of |M| in addition to that of M. The susceptibility should be calculated as χ = 1 kT [ M2 − |M| 2 ]. (15.29) CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 611 0.0 1.0 2.0 3.0 4.0 1.5 2.0 2.5 3.0 3.5 T CV Figure 15.3: The temperature dependence of the specific heat C (per spin) of the Ising model on a square lattice with periodic boundary conditions for L = 8 and L = 16. One thousand Monte Carlo steps per spin were used for each value of the temperature. The continuous line represents the temperature dependence of C in the limit of an infinite lattice. (Note that C is infinite at T = Tc for an infinite lattice.) Use at least 1000 Monte Carlo steps per spin and estimate the number of equilibrium configurations needed to obtain M and E to 5% accuracy. Plot E , m, |m|, C, and χ as a function of T and describe their qualitative behavior. Do you see any evidence of a phase transition? (b) Repeat the calculations of part (a) for L = 8 and L = 16. Plot E , m, |m|, C, and χ as a function of T and describe their qualitative behavior. Is the evidence of a phase transition more obvious? (c) The correlation length ξ can be obtained from the r-dependence of the spin correlation function c(r). The latter is defined as c(r) = sisj − m2 (15.30) where r is the distance between sites i and j. The system is translationally invariant so we write si = sj = m. The average is over all sites for a given configuration and over many configurations. Because the spins are not correlated for large r, c(r) → 0 in this limit. Assume that c(r) ∼ e−r/ξ for r sufficiently large and estimate ξ as a function of T . How does your estimate of ξ compare with the size of the domains of spins with the same orientation? Our studies of phase transitions are limited by the relatively small system sizes we can simulate. Nevertheless, we observed in Problem 15.16 that even systems as small as L = 4 exhibit behavior that is reminiscent of a phase transition. In Figure 15.3 we show our Monte Carlo data for the T -dependence of the specific heat of the two-dimensional Ising model for CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 612 M TTc 1 0 Figure 15.4: The temperature dependence of m(T ), the mean magnetization per spin, for the Ising model in two dimensions in the thermodynamic limit. L = 8 and L = 16. We see that C exhibits a broad maximum which becomes sharper for larger L. Does your data for C exhibit similar behavior? We next summarize some of the qualitative properties of ferromagnetic systems in zero magnetic field in the thermodynamic limit (N → ∞). At T = 0, the spins are perfectly aligned in either direction; that is, the mean magnetization per spin m(T ) = M(T ) /N is given by m(T = 0) = ±1. As T is increased, the magnitude of m(T ) decreases continuously until T = Tc at which m(T ) vanishes (see Figure 15.4). 
Because m(T) vanishes continuously rather than abruptly, the transition is termed continuous rather than discontinuous. (The term first order describes a discontinuous transition.) How can we characterize a continuous magnetic phase transition? Because a nonzero m implies that a net number of spins are spontaneously aligned, we designate m as the order parameter of the system. Near T_c, we can characterize the behavior of many physical quantities by power law behavior, just as we characterized the percolation threshold (see Table 12.1). For example, we can write m near T_c as

$m(T) \sim (T_c - T)^{\beta}$,   (15.31)

where β is a critical exponent (not to be confused with the inverse temperature). Various thermodynamic derivatives such as the susceptibility and specific heat diverge at T_c and are characterized by critical exponents. We write

$\chi \sim |T - T_c|^{-\gamma}$   (15.32)

and

$C \sim |T - T_c|^{-\alpha}$,   (15.33)

where we have introduced the critical exponents γ and α. We have assumed that χ and C are characterized by the same critical exponents above and below T_c.

Another measure of the magnetic fluctuations is the linear dimension ξ(T) of a typical magnetic domain. We expect the correlation length ξ(T) to be of the order of a lattice spacing for T ≫ T_c. Because the alignment of the spins becomes more correlated as T approaches T_c from above, ξ(T) increases as T approaches T_c. We can characterize the divergent behavior of ξ(T) near T_c by the critical exponent ν:

$\xi(T) \sim |T - T_c|^{-\nu}$.   (15.34)
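A minimal sketch of how the spin correlation function c(r) of (15.30), and hence the correlation length ξ of (15.34), might be measured along one lattice direction is given below. The method name and the use of a plain int array are our own choices, and the average over many equilibrium configurations is left to the caller.

    // Sketch of c(r) along the x direction for a single L x L configuration.
    // Returns c[r] for 1 <= r < L/2; periodic boundary conditions are assumed.
    public static double[] correlationFunction(int[][] spin) {
        int L = spin.length;
        double m = 0;
        for(int i = 0; i < L; i++) {
            for(int j = 0; j < L; j++) {
                m += spin[i][j];
            }
        }
        m /= (L*L); // magnetization per spin of this configuration
        double[] c = new double[L/2];
        for(int r = 1; r < L/2; r++) {
            double sum = 0;
            for(int i = 0; i < L; i++) {
                for(int j = 0; j < L; j++) {
                    sum += spin[i][j]*spin[(i+r)%L][j]; // pairs separated by r in x
                }
            }
            c[r] = sum/(L*L) - m*m; // relation (15.30)
        }
        return c;
    }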
Compute χ using (15.21) with |M| instead of M . Use as many Monte Carlo steps per spin as possible. Plot the logarithm of |m| and χ versus L and use the scaling relations (15.37)–(15.39) to determine the critical exponents β and γ. Use the exact result ν = 1. Do your log-log plots of |m| and χ yield reasonably straight lines? Compare your estimates for β and γ with the exact values given in Table 12.1. (c) Make a log-log plot of C versus L. If your data for C is sufficiently accurate, you will find that the log-log plot of C versus L is not a straight line but shows curvature. The reason is that the exponent α in (15.33) equals zero for the two-dimensional Ising model, and hence (15.38) needs to be interpreted as C ∼ C0 lnL. (15.40) Is your data for C consistent with (15.40)? The constant C0 in (15.40) is approximately 0.4995. CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 614 So far we have performed our Ising model simulations on a square lattice. How do the critical temperature and the critical exponents depend on the symmetry and the dimension of the lattice? Based on your experience with the percolation transition in Chapter 12, you probably know the answer. Problem 15.18. The effects of symmetry and dimension on the critical properties (a) The simulation of the Ising model on the triangular lattice is relevant to the understanding of the experimentally observed phases of materials that can be absorbed on the surface of graphite. The nature of the triangular lattice is discussed in Chapter 8 (see Figure 8.5). The main difference between the triangular lattice and the square lattice is the number of nearest neighbors. Make the necessary modifications in your program’ for example, determine the energy changes due to a flip of a single spin and the corresponding values of the transition probabilities. Compute C and χ for different values of T in the interval [2,5]. Assume that ν = 1 and use finite-size scaling to estimate Tc in the limit of an infinite triangular lattice. Compare your estimate of Tc to the known value kTc/J = 3.641 (to three decimal places). (b) No exact analytic results are available for the Ising model in three dimensions. (It has been shown by Istrail that this model cannot be solved analytically.) Write a Monte Carlo program to simulate the Ising model on the simple cubic lattice. Compute C and χ for T in the range 3.2 ≤ T ≤ 5 in steps of 0.2 for different values of L. Estimate Tc(L) from the maximum of C and χ. How do these estimates of Tc(L) compare? Use the values of Tc(L) that exhibit a stronger L-dependence and plot Tc(L) versus L−1/ν for different values of ν in the range 0.5 to 1 (see [15.35)]. Show that the extrapolated value of Tc(L = ∞) does not depend sensitively on the value of ν. Compare your estimate for Tc(L = ∞) to the known value kTc/J = 4.5108 (to four decimal places). (c) Compute |m|, C, and χ at T = Tc ≈ 4.5108 for different values of L on the simple cubic lattice. Do a finite-size scaling analysis to estimate β/ν, α/ν, and γ/ν. The best known values of the critical exponents for the three-dimensional Ising model are given in Table 12.1. For comparison, published Monte Carlo results in 1976 for the finite-size behavior of the Ising model on the simple cubic Ising lattice are for L = 6 to L = 20; 2000–5000 Monte Carlo steps per spin were used for calculating the averages after equilibrium had been reached. Can you obtain more accurate results? Problem 15.19. Critical slowing down (a) Consider the Ising model on a square lattice with L = 16. 
Compute the autocorrelation functions C_M(t) and C_E(t) and determine the correlation times τ_M and τ_E for T = 2.5, 2.4, and 2.3. Determine the correlation times as discussed in Problem 15.14(b). How do these correlation times compare with one another? Show that τ increases as the critical temperature is approached, an effect known as critical slowing down.

(b) We can characterize critical slowing down by the dynamical critical exponent z defined by

$\tau \sim \xi^z$.   (15.41)

On a finite lattice we have τ ∼ L^z at T = T_c. Compute τ for different values of L at T = T_c and make a very rough estimate of z. (The value of z for the two-dimensional Ising model with spin flip dynamics is ≈ 2.167.)

The values of τ and z found in Problem 15.19 depend on our choice of dynamics (algorithm). The reason for the large value of z is the existence of large domains of parallel spins near the critical point. It is difficult for the Metropolis algorithm to decorrelate a domain because it has to do so one spin at a time. What is the probability of flipping a single spin in the middle of a domain at T = T_c? Which spins in a domain are more likely to flip? What is the dominant mechanism for decorrelating a domain of spins? In one dimension, Cordery et al. showed how z can be calculated exactly by considering the motion of a domain wall as a random walk. Although we have generated a trial change by flipping a single spin, it is possible that other types of trial changes would be more efficient. A problem of much current interest is the development of more efficient algorithms near phase transitions (see Project 15.32).

15.9 Other Applications of the Ising Model

Because the applications of the Ising model range from flocking birds to beating hearts, we can mention only a few of the applications here. In the following, we briefly describe applications of the Ising model to first-order phase transitions, lattice gases, antiferromagnetism, and the order-disorder transition in binary alloys.

So far we have discussed the continuous phase transition in the Ising model and have found that the energy and magnetization vary continuously with the temperature, and thermodynamic derivatives such as the specific heat and the susceptibility diverge near T_c (in the limit of an infinite lattice). In Problem 15.20 we discuss a simple example of a first-order phase transition. Such transitions are accompanied by discontinuous (finite) changes in thermodynamic quantities such as the energy and the magnetization.

Problem 15.20. The Ising model in an external magnetic field

(a) Modify your two-dimensional Ising program so that the energy of interaction with an external magnetic field B is included. It is convenient to measure B in terms of the dimensionless ratio h = βB. (Remember that B has already absorbed a factor of µ.) Compute m, the mean magnetization per spin, as a function of h for T < T_c. Consider a square lattice with L = 32 and equilibrate the system at T = 1.8 and h = 0. Adopt the following procedure to obtain m(h):

(i) Use an equilibrium configuration at h = 0 as the initial configuration for h₁ = Δh = 0.2.
(ii) Run the system for 100 Monte Carlo steps per spin before computing averages.
(iii) Average m over 100 Monte Carlo steps per spin.
(iv) Use the last configuration for h_n as the initial configuration for h_{n+1} = h_n + Δh.
(v) Repeat steps (ii)–(iv) until m ≈ 0.95.

Plot m versus h. Do the measured values of m correspond to equilibrium averages?
(b) Start from the last configuration in part (a) and decrease h by Δh = −0.2 in the same way as in part (a) until h passes through zero and m ≈ −0.95. Extend your plot of m versus h to include negative h values. Does m remain positive for small negative h? Do the measured values of m for negative h correspond to equilibrium averages? Draw the spin configurations for several values of h. Do you see evidence of domains?

(c) Now increase h by Δh = 0.2 until the m versus h curve forms an approximately closed loop. What is the value of m at h = 0? This value of m is the spontaneous magnetization.

(d) A first-order phase transition is characterized by a discontinuity (for an infinite lattice) in the order parameter. In the present case the transition is characterized by the behavior of m as a function of h. What is your measured value of m for h = 0.2? If m(h) is double valued, which value of m corresponds to the equilibrium state, an absolute minimum of the free energy? Which value of m corresponds to a metastable state, a relative minimum of the free energy? What are the equilibrium and metastable values of m for h = −0.2? First-order transitions exhibit hysteresis, and the properties of the system depend on its history, for example, on whether h is increasing or decreasing. Because of the long lifetime of metastable states near a first-order phase transition, a system can mistakenly be interpreted as being in the state of minimum free energy. We also know that near a continuous phase transition the relaxation to equilibrium becomes very long (see Problem 15.19), and hence a system with a continuous phase transition can also behave as if it were in a metastable state. For these reasons it is difficult to distinguish the nature of a phase transition using computer simulations. This problem is discussed further in Section 15.11.

(e) Repeat the above simulations for T = 3, a temperature above T_c. Why do your results differ from the simulations in parts (a)–(c) done for T < T_c?

The Ising model also describes systems that might appear to have little in common with ferromagnetism. For example, we can interpret the Ising model as a lattice gas, where a down spin represents a lattice site occupied by a molecule and an up spin represents an empty site. Each lattice site can be occupied by at most one molecule, and the molecules interact with their nearest neighbors. The lattice gas is a crude model of the behavior of a real gas of molecules and is a simple model of the liquid-gas transition and the critical point. What properties does the lattice gas have in common with a real gas? What properties of real gases does the lattice gas neglect?

If we wish to simulate a lattice gas, we have to decide whether to do the simulation at fixed density or at fixed chemical potential µ and a variable number of particles. The implementation of the latter is straightforward because the grand canonical ensemble for a lattice gas is equivalent to the canonical ensemble for Ising spins in an external magnetic field; that is, the effect of the magnetic field is to fix the mean number of up spins. Hence, we can simulate a lattice gas in the grand canonical ensemble by doing spin flip dynamics. (The volume of the lattice is an irrelevant parameter.)

Another application of a lattice gas model is to phase separation in a binary or A-B alloy. In this case spin up and spin down correspond to a site occupied by an A atom and a B atom, respectively.
As an example, the alloy β-brass has a low temperature ordered phase in which the two components (copper and zinc) have equal concentrations and form a cesium chloride structure. As the temperature is increased, some zinc atoms exchange positions with copper atoms, but the system is still ordered. However, above the critical temperature T_c = 742 K, the zinc and copper atoms become mixed and the system is disordered. This transition is an example of an order-disorder transition.

If we wish to approximate the actual dynamics of an alloy, then the number of A atoms and the number of B atoms is fixed, and we cannot use spin flip dynamics to simulate a binary alloy. A dynamics that does conserve the number of down and up spins is known as spin exchange or Kawasaki dynamics. In this dynamics a trial interchange of two nearest neighbor spins is made, and the change in energy ΔE is calculated. The criterion for the acceptance or rejection of the trial change is the same as before.

Problem 15.21. Simulation of a lattice gas

(a) Modify your Ising program so that spin exchange dynamics rather than spin flip dynamics is implemented. Determine the possible values of ΔE on the square lattice and the possible values of the transition probability, and change the way a trial change is made. If we are interested only in the mean value of quantities such as the total energy, we can reduce the computation time by not considering the interchange of parallel spins (which has no effect). For example, we can keep a list of bonds between occupied and empty sites and make trial moves by choosing bonds at random from this list. For small lattices such a list is unnecessary, and a trial move can be generated by simply choosing a spin and one of its nearest neighbors at random.

(b) Consider a square lattice with L = 32 and 512 sites initially occupied. (The number of occupied sites is a conserved variable and must be specified initially.) Determine the mean energy for T in the range 1 ≤ T ≤ 4. Plot the mean energy as a function of T. Does the energy appear to vary continuously?

(c) Repeat the calculations of part (b) with 612 sites initially occupied and plot the mean energy as a function of T. Does the energy vary continuously? Do you see any evidence of a first-order phase transition?

(d) Because down spins correspond to particles, we can compute their single particle diffusion coefficient. Use an array to record the position of each particle as a function of time. After equilibrium has been reached, compute ⟨R(t)²⟩, the mean square displacement of a particle. Is it necessary to "interchange" two like spins? If the particles undergo a random walk, the self-diffusion constant D is defined as

$D = \lim_{t \to \infty} \frac{1}{2dt} \langle R(t)^2 \rangle$.   (15.42)

Estimate D for different temperatures and numbers of occupied sites. Note that if a particle starts at x₀ and returns to x₀ by moving in one direction on the average using periodic boundary conditions, the net displacement in the x direction is L, not 0 (see Section 8.10 for a discussion of how to compute the diffusion constant for systems with periodic boundary conditions).

Although you are probably familiar with ferromagnetism, for example, a magnet on a refrigerator door, nature provides more examples of antiferromagnetism. In the language of the Ising model, antiferromagnetism means that the exchange parameter J is negative and nearest neighbor spins prefer to be aligned in opposite directions. As we will see in Problem 15.22, the properties of the antiferromagnetic Ising model on a square lattice are similar to those of the ferromagnetic Ising model. For example, the energy and specific heat of the ferromagnetic and antiferromagnetic Ising models are identical at all temperatures in zero magnetic field, and the system exhibits a phase transition at the Néel temperature T_N. On the other hand, the total magnetization and susceptibility do not exhibit critical behavior near T_N. Instead, we need to define two sublattices for the square lattice, corresponding to the red and black squares of a checkerboard, and introduce the staggered magnetization M_s, which is equal to the difference of the magnetization of the two sublattices. We will find in Problem 15.22 that the temperature dependence of M_s and the staggered susceptibility χ_s is identical to that of the analogous quantities in the ferromagnetic Ising model.

Figure 15.5: An example of frustration on a triangular lattice. The interaction is antiferromagnetic.

Problem 15.22. Antiferromagnetic Ising model

(a) Modify the Ising class to simulate the antiferromagnetic Ising model on the square lattice in zero magnetic field. Because J does not appear explicitly in class Ising, change the sign of the energy calculations in the appropriate places in the program. To compute the staggered magnetization on a square lattice, define one sublattice to be the sites (x, y) for which the product mod(x,2) × mod(y,2) = 1; the other sublattice corresponds to the remaining sites.

(b) Choose L = 32 and all spins up initially. What configuration of spins corresponds to the state of lowest energy? Compute the temperature dependence of the mean energy, the magnetization, the specific heat, and the susceptibility. Does the temperature dependence of any of these quantities show evidence of a phase transition?

(c) In part (b) you might have noticed that χ shows a cusp. Compute χ for different values of L at T = T_N ≈ 2.269. Do a finite-size scaling analysis and verify that χ does not diverge at T = T_N.

(d) Compute the temperature dependence of M_s and the staggered susceptibility χ_s defined as [see (15.21)]

$\chi_s = \frac{1}{kT}\left[\langle M_s^2\rangle - \langle M_s\rangle^2\right]$.   (15.43)

(Below T_c it is better to compute ⟨|M_s|⟩ instead of ⟨M_s⟩ for small lattices.) Verify that the temperature dependence of M_s for the antiferromagnetic Ising model is the same as the temperature dependence of M for the Ising ferromagnet. Could you have predicted this similarity without doing the simulation? Does χ_s show evidence of a phase transition?

(e) Consider the behavior of the antiferromagnetic Ising model on a triangular lattice. Choose L ≥ 32 and compute the same quantities as before. Do you see any evidence of a phase transition? Draw several configurations of the system at different temperatures. Do you see evidence of many small domains at low temperatures? Is there a unique ground state? If you cannot find a unique ground state, you share the same frustration as do the individual spins in the antiferromagnetic Ising model on the triangular lattice. We say that this model exhibits frustration because there is no spin configuration on the triangular lattice such that all spins are able to minimize their energy (see Figure 15.5).

The Ising model is one of many models of magnetism. The Heisenberg, Potts, and x-y models are other examples of models of magnetic materials. Monte Carlo simulations of these
Although you are probably familiar with ferromagnetism, for example, a magnet on a refrigerator door, nature provides more examples of antiferromagnetism. In the language of the Ising model, antiferromagnetism means that the exchange parameter J is negative and nearest neighbor spins prefer to be aligned in opposite directions. As we will see in Problem 15.22, the properties of the antiferromagnetic Ising model on a square lattice are similar to those of the ferromagnetic Ising model. For example, the energy and specific heat of the ferromagnetic and antiferromagnetic Ising models are identical at all temperatures in zero magnetic field, and the system exhibits a phase transition at the Néel temperature TN. On the other hand, the total magnetization and susceptibility do not exhibit critical behavior near TN. Instead, we need to define two sublattices for the square lattice corresponding to the red and black squares of a checkerboard and introduce the staggered magnetization Ms, which is equal to the difference of the magnetization of the two sublattices. We will find in Problem 15.22 that the temperature dependence of Ms and the staggered susceptibility χs are identical to the analogous quantities in the ferromagnetic Ising model.

Figure 15.5: An example of frustration on a triangular lattice. The interaction is antiferromagnetic.

Problem 15.22. Antiferromagnetic Ising model

(a) Modify the Ising class to simulate the antiferromagnetic Ising model on the square lattice in zero magnetic field. Because J does not appear explicitly in class Ising, change the sign of the energy calculations in the appropriate places in the program. To compute the staggered magnetization on a square lattice, define one sublattice to be the sites (x,y) for which the product mod(x,2) × mod(y,2) = 1; the other sublattice corresponds to the remaining sites.

(b) Choose L = 32 and all spins up initially. What configuration of spins corresponds to the state of lowest energy? Compute the temperature dependence of the mean energy, the magnetization, the specific heat, and the susceptibility. Does the temperature dependence of any of these quantities show evidence of a phase transition?

(c) In part (b) you might have noticed that χ shows a cusp. Compute χ for different values of L at T = TN ≈ 2.269. Do a finite-size scaling analysis and verify that χ does not diverge at T = TN.

(d) Compute the temperature dependence of Ms and the staggered susceptibility χs defined as [see (15.21)]

χs = (1/kT) [⟨Ms²⟩ − ⟨Ms⟩²]. (15.43)

(Below Tc it is better to compute ⟨|Ms|⟩ instead of ⟨Ms⟩ for small lattices.) Verify that the temperature dependence of Ms for the antiferromagnetic Ising model is the same as the temperature dependence of M for the Ising ferromagnet. Could you have predicted this similarity without doing the simulation? Does χs show evidence of a phase transition?

(e) Consider the behavior of the antiferromagnetic Ising model on a triangular lattice. Choose L ≥ 32 and compute the same quantities as before. Do you see any evidence of a phase transition? Draw several configurations of the system at different temperatures. Do you see evidence of many small domains at low temperatures? Is there a unique ground state? If you cannot find a unique ground state, you share the same frustration as do the individual spins in the antiferromagnetic Ising model on the triangular lattice. We say that this model exhibits frustration because there is no spin configuration on the triangular lattice such that all spins are able to minimize their energy (see Figure 15.5).

The Ising model is one of many models of magnetism. The Heisenberg, Potts, and x-y models are other examples of models of magnetic materials. Monte Carlo simulations of these
models and others have been important in the development of our understanding of phase transitions in both magnetic and nonmagnetic materials. Some of these models are discussed in Section 15.14.

15.10 Simulation of Classical Fluids

The existence of matter as a solid, liquid, and gas is well known (see Figure 15.6). Our goal in this section is to use Monte Carlo methods to gain additional insight into the qualitative differences between these phases.

Figure 15.6: A sketch of the phase diagram for a simple material. (The sketch shows pressure versus temperature, with the fusion, sublimation, and vapor pressure curves separating the solid, liquid, and gas phases, together with the triple point and the critical point.)

The Monte Carlo simulation of classical systems is simplified considerably by the fact that the velocity (momentum) variables are decoupled from the position variables. The total energy of the system can be written as E = K({vi}) + U({ri}), where the kinetic energy K is a function of only the particle velocities {vi}, and the potential energy U is a function of only the particle positions {ri}. This separation implies we need to sample only the positions of the molecules, that is, the “configurational” degrees of freedom. Because the velocity appears quadratically in the kinetic energy, it can be shown using classical statistical mechanics that the contribution of the velocity coordinates to the mean energy is kT/2 per degree of freedom. Is this simplification possible for quantum systems?

The physically relevant quantities of a fluid include its mean energy, specific heat, and equation of state. Another interesting quantity is the radial distribution function g(r) which we introduced in Chapter 8. We will find in Problems 15.23–15.25 that g(r) is a probe of the density fluctuations and, hence, a probe of the local order in the system. If only two-body forces are present, the mean potential energy per particle can be expressed as [see (8.16)]

U/N = (ρ/2) ∫ g(r) u(r) dr, (15.44)

and the (virial) equation of state can be written as [see (8.17)]

βP/ρ = 1 − (βρ/2d) ∫ g(r) r [du(r)/dr] dr. (15.45)

Hard core interactions. To separate the effects of the short range repulsive interaction from the longer range attractive interaction, we first investigate a model of hard disks with the interparticle interaction

u(r) = +∞ (r < σ); u(r) = 0 (r ≥ σ). (15.46)

Such an interaction has been extensively studied in one dimension (hard rods), two dimensions (hard disks), and three dimensions (hard spheres). Hard sphere systems were the first systems studied by Metropolis and coworkers.

Because there is no attractive interaction present in (15.46), there is no transition from a gas to a liquid. Is there a phase transition between a fluid phase at low densities and a solid at high densities? Can a solid form in the absence of an attractive interaction?

What are the physically relevant quantities for a system with a hard core interaction? The mean potential energy is of no interest because the potential energy is always zero. The major quantity of interest is g(r), which yields information about the correlations of the particles and the equation of state. If the interaction is given by (15.46), it can be shown that (15.45) reduces to

βP/ρ = 1 + (2π/3) ρσ³ g(σ), (d = 3) (15.47a)
βP/ρ = 1 + (π/2) ρσ² g(σ), (d = 2) (15.47b)
βP/ρ = 1 + ρσ g(σ). (d = 1) (15.47c)

We will calculate g(r) for different values of r and then extrapolate our results to r = σ (see Problem 15.23b).
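One way to carry out the extrapolation to contact, suggested in Problem 15.23(b), is to fit a parabola through the three measured values of g closest to contact and evaluate it there. The following minimal sketch uses the Lagrange form of the interpolating polynomial; the array names are assumptions about how the data are stored, not code from the text.

double extrapolateToContact(double[] x, double[] g, double xc) {
  // parabola through (x[0],g[0]), (x[1],g[1]), (x[2],g[2]), evaluated at xc
  double l0 = (xc - x[1])*(xc - x[2])/((x[0] - x[1])*(x[0] - x[2]));
  double l1 = (xc - x[0])*(xc - x[2])/((x[1] - x[0])*(x[1] - x[2]));
  double l2 = (xc - x[0])*(xc - x[1])/((x[2] - x[0])*(x[2] - x[1]));
  return l0*g[0] + l1*g[1] + l2*g[2];
}

For hard disks the contact value g(σ) obtained in this way can be substituted directly into (15.47b) to estimate βP/ρ.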
Because the application of molecular dynamics and Monte Carlo methods to hard disks is similar, we discuss the latter method only briefly and do not include a program. The idea is to choose a disk at random and move it to a trial position as implemented in the following:

int i = (int) (N*Math.random());              // choose a particle at random
// xtrial and ytrial are assumed to have been set to the current position of particle i
xtrial += (2.0*Math.random() - 1.0)*delta;    // delta is the maximum displacement
ytrial += (2.0*Math.random() - 1.0)*delta;

If the new position overlaps another disk, the move is rejected and the old configuration is retained; otherwise, the move is accepted. A reasonable, although not necessarily optimum, choice is a maximum displacement δ such that approximately 20% of the trial moves are accepted.

The major difficulty in implementing this algorithm is determining the overlap of two particles. If the number of particles is not too large, it is sufficient to compute the distances between the trial particle and all the other particles, instead of just considering the smaller number of particles that are in the immediate vicinity of the trial particle. For larger systems this procedure is too time consuming, and it is better to divide the system into cells and to compute only the distances between the trial particle and particles in the same and neighboring cells.

The choice of initial positions for the disks is more complicated than it might first appear. One strategy is to place each successive disk at random in the box. If a disk overlaps one that is already present, generate another pair of random numbers and attempt to place the disk again. If the desired density is low, an acceptable initial configuration can be computed fairly quickly in this way, but if the desired density is high, the probability of adding a disk will be very small (see Problem 15.24a). To reach higher densities, we might imagine beginning with the desired number of particles in a low density configuration and moving the boundaries of the central cell inward until a boundary just touches one of the disks. Then the disks are moved a number of Monte Carlo steps and the boundaries are moved inward again. This procedure also becomes more difficult as the density increases. The most efficient procedure is to start the disks on a lattice at the highest density of interest such that no overlap of disks occurs.

We first consider a one-dimensional system of hard rods for which the equation of state and g(r) can be calculated exactly. The equation of state is given by

P/NkT = 1/(L − Nσ). (15.48)

Because hard rods cannot pass through one another, the excluded volume is Nσ and the available volume is L − Nσ. Note that the form of (15.48) is the same as the van der Waals equation of state (cf. Reif) with the contribution from the attractive part of the interaction equal to zero.

Problem 15.23. Monte Carlo simulation of hard rods

(a) Write a program to do a Monte Carlo simulation of a system of hard rods. Adopt periodic boundary conditions and refer to class HardDisks in Chapter 8 for the structure of the program. The major difference is the nature of the trial moves. Measure all lengths in terms of the hard rod diameter σ. Choose L = 36 and N = 30. How does the number density ρ = N/L compare to the maximum possible density? Choose the initial positions to be on a one-dimensional grid and let the maximum displacement be δ = 0.1.
Approximately how many Monte Carlo steps per particle are necessary to reach equilibrium? What is the equilibrium acceptance probability? Compute the pair correlation function g(x).

(b) Compute g(x) as a function of the distance x for x ≤ L/2. Why does g(x) = 0 for x < 1? What is the physical interpretation of the peaks in g(x)? Because the mean pressure can be determined from g(x) at x = 1+ [see (15.47)], determine g(x) at contact. An easy way to extrapolate your results for g(x) to x = 1 is to fit the three values of g(x) closest to x = 1 to a parabola. Use your result for g(x = 1+) to determine the mean pressure.

(c) Compute g(x) at several lower densities by using an equilibrium configuration from a previous run and increasing L. How do the size and the location of the peaks in g(x) change?

Problem 15.24. Monte Carlo simulation of hard disks

(a) The maximum packing density can be found by placing the disks on a triangular lattice with the nearest neighbor distance equal to the disk diameter σ. What is the maximum packing density of hard disks; that is, how many disks can be packed together in a cell of area A?

(b) Write a simple program that adds disks at random into a rectangular box of area A = Lx × Ly with the constraint that no two disks overlap. If a disk overlaps a disk already present, generate another pair of random numbers and try to place the disk again. If the density is low, the probability of adding a disk is high, but if the desired density is high, most of the disks will be rejected. For simplicity, do not worry about periodic boundary conditions and accept a disk if its center lies within the box. Choose Lx = 6 and Ly = √3 Lx/2 and determine the maximum density ρ = N/A that you can attain in a reasonable amount of CPU time. How does this density compare to the maximum packing density? What is the qualitative nature of the density dependence of the acceptance probability?

(c) Modify your Monte Carlo program for hard rods to simulate a system of hard disks. Begin at a density ρ slightly lower than the maximum packing density ρ0. Choose N = 64 with Lx = 8.81 and Ly = √3 Lx/2. Compare the density ρ = N/(LxLy) to the maximum packing density. Choose the initial positions of the particles to be on a triangular lattice. A reasonable first choice for the maximum displacement δ is δ = 0.1. Compute g(r) for ρ/ρ0 = 0.95, 0.92, 0.88, 0.85, 0.80, 0.70, 0.60, and 0.30. Keep the ratio Lx/Ly fixed and save a configuration from the previous run to be the initial configuration of the new run at lower ρ. (See page 266 for how to save and read configurations.) Allow at least 400 Monte Carlo steps per particle for the system to equilibrate and average g(r) for mcs ≥ 400.

(d) What is the qualitative behavior of g(r) at high and low densities? For example, describe the number and height of the peaks of g(r). If the system is crystalline, then g(r) is not spherically symmetric. What would you compute in this case?

(e) Use your results for g(r = 1+) to compute the mean pressure P as a function of ρ [see (15.47b)]. Plot the ratio PV/NkT as a function of ρ, where the volume V is the area of the system. How does the temperature T enter into the Monte Carlo simulation? Is the ratio PV/NkT an increasing or decreasing function of ρ? At low densities we might expect the system to act like an ideal gas with the volume replaced by (V − Nσ). Compare your low density results with this prediction.
(f) Take snapshots of the disks at intervals of ten to twenty Monte Carlo steps per particle. Do you see any evidence of the solid becoming a fluid at lower densities?

(g) Compute an effective diffusion coefficient D by determining the mean square displacement ⟨R²(t)⟩ of the particles after equilibrium is reached. Use the relation (15.42) and identify the time t with the number of Monte Carlo steps per particle. Estimate D for the densities considered in part (b) and plot the product ρD as a function of ρ. What is the dependence of D on ρ for a dilute gas? Can you identify a range of ρ where D drops abruptly? Do you observe any evidence of a phase transition?

(h) The magnitude of the maximum displacement parameter δ is arbitrary. If the density is high and δ is large, then a high proportion of the trial moves will be rejected. On the other hand, if δ is small, the acceptance probability will be close to unity, but the successive configurations will be strongly correlated. Hence, if δ is too large or too small, the simulation will be inefficient. One way to choose δ is to find the value of δ that maximizes the mean square displacement over a fixed time interval. The idea is that the mean square displacement is a measure of the exploration of phase space. Fix the density and determine the value of δ that maximizes ⟨R²(t)⟩. What is the corresponding acceptance probability?

Continuous potentials. Our simulations of hard disks suggest that there is a phase transition from a fluid at low densities to a solid at higher densities. This conclusion is consistent with molecular dynamics and Monte Carlo studies of larger systems. Although the existence of a fluid-solid transition for hard sphere and hard disk systems is now well accepted, the relatively small numbers of particles used in any simulation should remind us that results of this type cannot be taken as evidence independently of any theoretical justification.

The existence of a fluid-solid transition for hard spheres implies that the transition is determined by the repulsive part of the potential. We now consider a system with both a repulsive and an attractive contribution. Our primary goal will be to determine the influence of the attractive part of the potential on the structure of a liquid. We adopt as our model interaction the Lennard–Jones potential:

u(r) = 4ε[(σ/r)¹² − (σ/r)⁶]. (15.49)

The nature of the Lennard–Jones potential and the appropriate choice of units for simulations was discussed in Chapter 8 (see Table 8.1). We consider in Problem 15.25 the application of the Metropolis algorithm to a system of N particles in a cell of fixed volume V (area) interacting via the Lennard–Jones potential. Because the simulation is at fixed T, V, and N, the simulation samples configurations of the system according to the Boltzmann distribution (15.4).

Problem 15.25. Monte Carlo simulation of a Lennard–Jones system

(a) The properties of a two-dimensional Lennard–Jones system have been studied by many workers under a variety of conditions. Write a program to compute the total energy of a system of N particles on a triangular lattice of area Lx × Ly with periodic boundary conditions. Choose N = 64, Lx = 9.2, and Ly = √3 Lx/2. Why does this energy correspond to the energy at temperature T = 0? Does the energy per particle change if you consider bigger systems at the same density?
(b) Write a program to compute the mean energy, pressure, and the radial distribution function using the Metropolis algorithm. One way of computing the change in the potential energy of the system due to a trial move of one of the particles is to use an array pe for the potential energy of interaction of each particle. For simplicity, compute the potential energy of particle i by considering its interaction with the other N − 1 particles. The total potential energy of the system is the sum of the array elements pe(i) over all N particles divided by two to account for double counting. For simplicity, accumulate data after each Monte Carlo step per particle.

(c) Choose the same values of N, Lx, and Ly as in part (a), but give each particle an initial random displacement from its triangular lattice site of magnitude 0.2. Do the Monte Carlo simulation at a very low temperature such as T = 0.1. Choose the maximum trial displacement δ = 0.15 and consider mcs ≥ 400. Does the system retain its symmetry? Does the value of δ affect your results?

(d) Use the same initial conditions as in part (a), but take T = 0.5. Choose δ = 0.15 and run for a number of Monte Carlo steps per particle that is sufficient to yield a reasonable result for the mean energy. Do a similar simulation at T = 1 and T = 2. What is the best choice of the initial configuration in each case? The harmonic theory of solids predicts that the total energy of a system is due to a T = 0 contribution plus a term due to the harmonic oscillation of the atoms. The contribution of the latter part should be proportional to the temperature. Compare your results for E(T) − E(0) with this prediction. Use the values of σ and ε given in Table 8.1 to determine the temperature and energy in SI units for your simulations of solid argon.

(e) Decrease the density by multiplying Lx, Ly, and all the particle coordinates by 1.07. What is the new value of ρ? Estimate the number of Monte Carlo steps per particle needed to compute E and P at T = 0.5 to approximately 10% accuracy. Is the total energy positive or negative? How do E and P compare to their ideal gas values? Follow the method discussed in Problem 15.24 and compute an effective diffusion constant. Is the system a liquid or a solid? Plot g(r) versus r and compare g(r) to your results for hard disks at the same density. What is the qualitative behavior of g(r)? What is the interpretation of the peaks in g(r) in terms of the structure of the liquid? If time permits, consider a larger system at the same density and temperature and compute g(r) for larger r.

(f) Consider the same density as in part (e) at T = 0.6 and T = 1. Look at some typical configurations of the particles. Use your results for E(T), P(T), g(r), and the other data you have collected and discuss whether the system is a gas, liquid, or solid at these temperatures. What criteria can you use to distinguish a gas from a liquid? If time permits, repeat these calculations for ρ = 0.7.

(g) Compute E, P, and g(r) for N = 64, Lx = Ly = 20, and T = 3. These conditions correspond to a dilute gas. How do your results for P compare with the ideal gas equation of state? How does g(r) compare with the results you obtained for the liquid?

(h) The chemical potential can be measured using the Widom insertion method. From thermodynamics we know that

µ = (∂F/∂N)_{V,T} = −kT ln(Z_{N+1}/Z_N) (15.50)

in the limit N → ∞, where F is the Helmholtz free energy and ZN is the partition function for N particles.
The ratio Z_{N+1}/Z_N is the average of e^{−β∆E} over all possible states of the added particle with added energy ∆E. The idea is to compute the change in the energy ∆E that would occur if an imaginary particle were added to the N particle system at random. Average the value of e^{−β∆E} over many configurations generated by the Metropolis algorithm. The chemical potential is then given by

µ = −kT ln ⟨e^{−β∆E}⟩. (15.51)

Note that in the Widom insertion method, no particle is actually added to the system during the simulation. The chemical potential computed in (15.51) is the excess chemical potential and does not include the part of the chemical potential due to the momentum degrees of freedom, which is equal to the chemical potential of an ideal gas. Compute the chemical potential of a dense gas, liquid, and solid. In what sense is the chemical potential a measure of how easy it is to add a particle to the system?

15.11 Optimized Monte Carlo Data Analysis

As we have seen, the important physics near a phase transition occurs on long length scales. For this reason, we might expect that simulations, which for practical reasons are restricted to relatively small systems, might not be useful near a phase transition. Nevertheless, we have found that methods such as finite-size scaling can yield information about how systems behave in the thermodynamic limit. We now explore some additional Monte Carlo techniques that are useful near a phase transition.

The Metropolis algorithm yields mean values of various thermodynamic quantities, for example, the energy at particular values of the temperature T. Near a phase transition many thermodynamic quantities change rapidly, and we need to determine these quantities at many closely spaced values of T. If we were to use standard Monte Carlo methods, we would have to do many simulations to cover the desired range of values of T. To overcome this problem, we introduce the use of histograms which allow us to extract more information from a single Monte Carlo simulation. The idea is to use our knowledge of the equilibrium probability distribution at one value of T (and other external parameters) to estimate the desired thermodynamic averages at neighboring values of the external parameters.

The first step of the single histogram method is to simulate the system at an inverse temperature β0 which is near the values of β of interest and measure the energy of the system after every Monte Carlo step per spin (or other fixed interval). The measured probability that the system has energy E can be expressed as

P(E, β0) = H0(E) / Σ_E H0(E). (15.52)

The histogram H0(E) is the number of configurations with energy E, and the denominator is the total number of measurements of E. Because the probability of a given configuration is given by the Boltzmann distribution, we have

P(E, β) = g(E) e^{−βE} / Σ_E g(E) e^{−βE}, (15.53)

where g(E) is the number of microstates with energy E. (The density of states g(E) should not be confused with the radial distribution function g(r). If the energy is a continuous function, g(E) becomes the number of states per unit energy interval. However, g(E) is usually referred to as the density of states regardless of whether E is a continuous or discrete variable.) If we compare (15.52) and (15.53) and note that g(E) is independent of T, we can write

g(E) = a0 H0(E) e^{β0 E}, (15.54)

where a0 is a proportionality constant that depends on β0.
If we eliminate g(E) from (15.53) by using (15.54), we obtain the desired relation

P(E, β) = H0(E) e^{−(β−β0)E} / Σ_E H0(E) e^{−(β−β0)E}. (15.55)

Note that we have expressed the probability at the inverse temperature β in terms of H0(E), the histogram at the inverse temperature β0. Because β is a continuous variable, we can estimate the β dependence of the mean value of any function A that depends on E, for example, the mean energy and the specific heat. We write the mean of A(E) as

⟨A(β)⟩ = Σ_E A(E) P(E, β). (15.56)

If the quantity A depends on another quantity M, for example, the magnetization, then we can generalize (15.56) to

⟨A(β)⟩ = Σ_{E,M} A(E,M) P(E,M,β) (15.57a)
       = Σ_{E,M} A(E,M) H0(E,M) e^{−(β−β0)E} / Σ_{E,M} H0(E,M) e^{−(β−β0)E}. (15.57b)

The histogram method is useful only when the configurations relevant to the range of temperatures of interest occur with reasonable probability during the simulation at the temperature T0. For example, if we simulate an Ising model at low temperatures at which only ordered configurations occur (most spins aligned in the same direction), we cannot use the histogram method to obtain meaningful thermodynamic averages at high temperatures for which most configurations are disordered.

Problem 15.26. Application of the histogram method

(a) Consider a 4 × 4 Ising lattice in zero magnetic field and use the Metropolis algorithm to compute the mean energy per spin, the mean magnetization per spin, the specific heat, and the susceptibility per spin for T = 1 to T = 3 in steps of ∆T = 0.05. Average over at least 5000 Monte Carlo steps per spin for each value of T after equilibrium has been reached.

(b) What are the minimum and maximum values of the total energy E that might be observed in a simulation of an Ising model on a 4 × 4 lattice? Use these values to set the size of the array needed to accumulate data for the histogram H(E). Accumulate data for H(E) at T = 2.27, a value of T close to Tc, for at least 5000 Monte Carlo steps per spin after equilibration. Compute the energy and specific heat using (15.56). Compare your computed results with the data obtained by simulating the system directly, that is, without using the histogram method, at the same temperatures. At what temperatures does the histogram method break down?

(c) What are the minimum and maximum values of the magnetization M that might be observed in a simulation of an Ising model on a 4 × 4 lattice? Use these values to set the size of the two-dimensional array needed to accumulate data for the histogram H(E,M). Accumulate data for H(E,M) at T = 2.27, a value of T close to Tc, for at least 5000 Monte Carlo steps per spin after equilibration. Compute the same thermodynamic quantities as in part (a) using (15.57b). Compare your computed results with the data obtained by simulating the system directly, that is, without using the histogram method, at the same temperatures. At what temperatures does the histogram method break down?

(d) Repeat part (c) for a simulation centered about T = 1.5 and T = 2.5.

(e) Repeat part (c) for an 8 × 8 and a 16 × 16 lattice at T = 2.27.

The histogram method can be used to do a more sophisticated finite-size scaling analysis to determine the nature of a transition. Suppose that we perform a Monte Carlo simulation and observe a peak in the specific heat as a function of the temperature. What can this observation tell us about a possible phase transition?
In general, we can conclude very little without doing a careful analysis of the behavior of the system at different sizes. For example, a discontinuity in the energy in an infinite system might be manifested in small systems by a broad peak in the specific heat. However, we have seen that the specific heat of a system with a continuous phase transition in the thermodynamic limit may manifest itself in the same way in a small system. Another difficulty is that the peak in the specific heat of a small system occurs at a temperature that differs from the transition temperature in the infinite system (see Project 15.37). Finally, there might be no transition at all, and the peak might simply represent a broad crossover from high to low temperature behavior (see Project 15.38).

We now discuss a method due to Lee and Kosterlitz that uses the histogram data to determine the nature of a phase transition (if it exists). To understand this method, we use the Helmholtz free energy F of a system. At low T, the low energy configurations dominate the contributions to the partition function Z, even though there are relatively few such configurations. At high T, the number of disordered configurations with high E is large, and hence high energy configurations dominate the contribution to Z. These considerations suggest that it is useful to define a restricted free energy F(E) that includes only the configurations at a particular energy E. We define

F(E) = −kT ln [g(E) e^{−βE}]. (15.58)

For systems with a first-order phase transition, a plot of F(E) versus E will show two local minima corresponding to configurations that are characteristic of the high and low temperature phases. At low T the minimum at the lower energy will be the absolute minimum, and at high T the higher energy minimum will be the absolute minimum of F. At the transition, the two minima will have the same value of F(E). For systems with no transition in the thermodynamic limit, there will be only one minimum for all T.

How will F(E) behave for the relatively small lattices that we can simulate? In systems with first-order transitions, the distinction between low and high temperature phases will become more pronounced as the system size is increased. If the transition is continuous, there are domains at all sizes, and we expect that the behavior of F(E) will not change significantly as the system size increases. If there is no transition, there might be a spurious double minimum for small systems, but this spurious behavior should disappear for larger systems.

Lee and Kosterlitz proposed the following method for categorizing phase transitions:

1. Do a simulation at a temperature close to the suspected transition temperature and compute H(E). Usually the temperature at which the peak in the specific heat occurs is chosen as the simulation temperature.

2. Use the histogram method to compute F(E) ∝ −ln H0(E) + (β − β0)E at neighboring values of T (see the sketch following this list). If there are two minima in F(E), vary β until the values of F(E) at the two minima are equal. This temperature is an estimate of the possible transition temperature Tc.

3. Measure the difference ∆F at Tc between F(E) at the minima and F(E) at the maximum between the two minima.

4. Repeat steps (1)–(3) for larger systems. If ∆F increases with size, the transition is first order. If ∆F remains the same, the transition is continuous. If ∆F decreases with size, there is no thermodynamic transition.
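A minimal sketch of step 2, computing F(E) up to an additive constant from the measured histogram, might look like the following. The arrays energies and H0 are assumptions about how the histogram data are stored, not code from the text.

double[] freeEnergy(double[] energies, double[] H0, double beta, double beta0) {
  double[] F = new double[energies.length];
  for (int i = 0; i < energies.length; i++) {
    if (H0[i] > 0) {
      F[i] = -Math.log(H0[i]) + (beta - beta0)*energies[i];  // F(E) up to a constant
    } else {
      F[i] = Double.POSITIVE_INFINITY;                       // energy never visited
    }
  }
  return F;
}

The additive constant is irrelevant here because only differences of F(E), such as ∆F in step 3, enter the analysis.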
The above procedure is applicable when the phase transition occurs by varying the temperature. Transitions can also occur by varying the pressure or the magnetic field. These field-driven transitions can be tested by a similar method. For example, consider the Ising model in a magnetic field at low temperatures below Tc. As we vary the magnetic field from positive to negative, there is a transition from a phase with magnetization M > 0 to a phase with M < 0. Is this transition first order or continuous? To answer this question, we can use the Lee–Kosterlitz method with a histogram H(E,M) generated at zero magnetic field and calculate F(M) instead of F(E). The quantity F(M) is proportional to −ln Σ_E H(E,M) e^{−(β−β0)E}. Because the states with positive and negative magnetization are equally likely to occur for zero magnetic field, we should see a double minimum structure for F(M) with equal minima. As we increase the size of the system, ∆F should increase for a first-order transition and remain the same for a continuous transition.

Problem 15.27. Characterization of a phase transition

(a) Use your modified version of class Ising from Problem 15.26 to determine H(E,M). Read the H(E,M) data from a file and compute and plot F(E) for the range of temperatures of interest. First generate data at T = 2.27 and use the Lee–Kosterlitz method to verify that the Ising model in two dimensions has a continuous phase transition in zero magnetic field. Consider lattices of sizes L = 4, 8, and 16.

(b) Do a Lee–Kosterlitz analysis of the Ising model at T = 2 and zero magnetic field by plotting F(M). Determine if the transition from M > 0 to M < 0 is first order or continuous. This transition is called field driven because the transition occurs if we change the magnetic field. Make sure your simulations sample configurations with both positive and negative magnetization by using small values of L such as L = 4, 6, and 8.

(c) Repeat part (b) at T = 2.5 and determine if there is a field-driven transition at T = 2.5.

∗Problem 15.28. The Potts model

In the q-state Potts model, the total energy or Hamiltonian of the lattice is given by

E = −J Σ_{i, j=nn(i)} δ_{si,sj}, (15.59)

where si at site i can have the values 1, 2, ..., q; the Kronecker delta function δ_{a,b} equals unity if a = b and is zero otherwise. As before, we will measure the temperature in energy units. Convince yourself that the q = 2 Potts model is equivalent to the Ising model (except for a trivial difference in the energy minimum). One of the many applications of the Potts model is to helium adsorbed on the surface of graphite. The graphite-helium interaction gives rise to preferred adsorption sites directly above the centers of the hexagons of the honeycomb graphite surface. As discussed by Plischke and Bergersen, the helium atoms can be described by a three-state Potts model.

(a) The transition in the Potts model is continuous for small q and first order for larger q. Write a Monte Carlo program to simulate the Potts model for a given value of q and store the histogram H(E). (A sketch of a trial move is given after this problem.) Test your program by comparing the output for q = 2 with your Ising model program.

(b) Use the Lee–Kosterlitz method to analyze the nature of the phase transition in the Potts model for q = 3, 4, 5, 6, and 10. First find the location of the specific heat maximum, and then collect data for H(E) at the specific heat maximum. Lattice sizes of order L ≥ 50 are required to obtain convincing results for some values of q.
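To make the trial moves in Problem 15.28(a) concrete, here is a minimal sketch of one Metropolis trial flip for the q-state Potts model of (15.59) on an L × L square lattice with periodic boundary conditions. The names s, L, q, J, and T are assumptions about the surrounding class, with the temperature measured in energy units as in the text.

public void trialFlip() {
  int x = (int) (L*Math.random());
  int y = (int) (L*Math.random());
  int trial = 1 + (int) (q*Math.random());     // trial value in 1..q
  if (trial == s[x][y]) return;
  int[] nnx = {(x+1)%L, (x-1+L)%L, x, x};      // the four nearest neighbors
  int[] nny = {y, y, (y+1)%L, (y-1+L)%L};
  int same = 0, sameTrial = 0;
  for (int k = 0; k < 4; k++) {
    if (s[nnx[k]][nny[k]] == s[x][y]) same++;       // matching neighbors before the flip
    if (s[nnx[k]][nny[k]] == trial) sameTrial++;    // matching neighbors after the flip
  }
  double dE = J*(same - sameTrial);            // change in the energy (15.59)
  if (dE <= 0 || Math.random() < Math.exp(-dE/T)) s[x][y] = trial;
}

For q = 2 this move reduces to the usual single spin flip dynamics of the Ising model, which provides the check suggested in part (a).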
Another way to determine the nature of a phase transition is to use the Binder cumulant method. The cumulant is defined by

UL ≡ 1 − ⟨E⁴⟩ / (3⟨E²⟩²). (15.60)

It can be shown that the minimum value of UL is

UL,min = 2/3 − (1/3) [(E+² − E−²)/(2E+E−)]² + O(L^{−d}), (15.61)

where E+ and E− are the energies of the two phases in a first-order transition. These results are derived by considering the distribution of energy values to be a sum of Gaussians about each phase at the transition, which become sharper and sharper as L → ∞. If UL,min = 2/3 in the infinite size limit, then the transition is continuous.

Problem 15.29. The Binder cumulant and the nature of the transition

(a) Suppose that the energy in a system is given by a Gaussian distribution with a zero mean. What is the corresponding value of UL?

(b) Consider the two-dimensional Ising model in the absence of a magnetic field and consider the cumulant

VL ≡ 1 − ⟨M⁴⟩ / (3⟨M²⟩²). (15.62)

Compute VL for a temperature much higher than Tc. What is the value of VL? What is the value of VL at T = 0?

(c) Compute VL for values of T in the range 2.20 ≤ T ≤ 2.35 for L = 10, 20, and 40. Plot VL as a function of T for these three values of L. Note that the three curves for VL cross at a value of T that is approximately Tc. What is the approximate value of VL at this crossing? Can you conclude that the transition is continuous?

(d) Repeat Problem 15.28 using the Binder cumulant method and determine the nature of the transition.

15.12 ∗Other Ensembles

So far, we have considered the microcanonical ensemble (fixed N, V, and E) and the canonical ensemble (fixed N, V, and T). Monte Carlo methods are very flexible and can be adapted to the calculation of averages in any ensemble. Two other ensembles of particular importance are the constant pressure and the grand canonical ensembles. The main difference in the Monte Carlo method is that there are additional moves corresponding to changing the volume or changing the number of particles. The constant pressure ensemble is particularly important for studying first-order phase transitions because the phase transition occurs at a fixed pressure, unlike a constant volume simulation where the system passes through a two-phase coexistence region before changing phase completely as the volume is changed.

In the NPT ensemble, the probability of a microstate is proportional to e^{−β(E+PV)}. For a classical system, the mean value of a physical quantity A that depends on the positions of the particles can be expressed as

⟨A⟩_NPT = [∫₀^∞ dV e^{−βPV} ∫ dr1 dr2 ··· drN A({r}) e^{−βU({r})}] / [∫₀^∞ dV e^{−βPV} ∫ dr1 dr2 ··· drN e^{−βU({r})}]. (15.63)

The potential energy U({r}) depends on the set of particle coordinates {r}. To simulate the NPT ensemble, we need to sample the coordinates r1, r2, ..., rN of the particles and the volume V of the system. For simplicity, we assume that the central cell is a square or a cube so that V = L^d. It is convenient to use the set of scaled coordinates {s}, where si is defined as

si = ri/L. (15.64)

If we substitute (15.64) into (15.63), we can write ⟨A⟩_NPT as

⟨A⟩_NPT = [∫₀^∞ dV e^{−βPV} V^N ∫ ds1 ds2 ··· dsN A({s}) e^{−βU({s})}] / [∫₀^∞ dV e^{−βPV} V^N ∫ ds1 ds2 ··· dsN e^{−βU({s})}], (15.65)

where the integral over {s} is over the unit square (cube). The factor of V^N arises from the change of variables r → s.
If we let V^N = e^{ln V^N} = e^{N ln V}, we see that the quantity that is analogous to the Boltzmann factor can be written as

e^{−W} = e^{−βPV − βU({s}) + N ln V}. (15.66)

Because the pressure is fixed, a trial configuration is generated from the current configuration by either randomly displacing a particle or making a random change in the volume, for example, V → V + δ(2r − 1), where r is a uniform random number in the unit interval and δ is the maximum change in volume. The trial configuration is accepted if the change ∆W ≤ 0 and with probability e^{−∆W} if ∆W > 0. It is not necessary or efficient to change the volume after every Monte Carlo step per particle.

In the grand canonical or µVT ensemble, the chemical potential µ is fixed and the number of particles fluctuates. The average of any function of the particle positions can be written (in three dimensions) as

⟨A⟩_µVT = [Σ_{N=0}^∞ (1/N!) λ^{−3N} e^{βµN} ∫ dr1 dr2 ··· drN A({r}) e^{−βU_N({r})}] / [Σ_{N=0}^∞ (1/N!) λ^{−3N} e^{βµN} ∫ dr1 dr2 ··· drN e^{−βU_N({r})}], (15.67)

where λ = (h²/2πmkT)^{1/2}. We have made the N-dependence of the potential energy U explicit. If we write 1/N! = e^{−ln N!} and λ^{−3N} = e^{−N ln λ³}, we can write the quantity that is analogous to the Boltzmann factor as

e^{−W} = e^{βµN − N ln λ³ − ln N! + N ln V − βU_N}. (15.68)

If we write the chemical potential as

µ = µ* + kT ln(λ³/V), (15.69)

then W can be expressed as

e^{−W} = e^{βµ*N − ln N! − βU_N}. (15.70)

There are two possible ways of obtaining a trial configuration. The first involves the displacement of a selected particle; such a move is accepted or rejected according to the usual criteria, that is, by the change in the potential energy U_N. In the second possible way, we choose with equal probability whether to attempt to add a particle at a randomly chosen position in the central cell or to remove a particle that is already present. In either case, the trial configuration is accepted if e^{−W} in (15.70) is increased. If e^{−W} is decreased, the change is accepted with a probability equal to

[1/(N + 1)] e^{β[µ* − (U_{N+1} − U_N)]} (insertion), (15.71a)

or

N e^{−β[µ* + (U_{N−1} − U_N)]} (removal). (15.71b)

In this approach µ* is an input parameter, and µ is not determined until the end of the calculation when ⟨N⟩_µVT is obtained.

As we have discussed, the probability that a system at a temperature T has energy E is given by [see (15.53)]

P(E, β) = g(E) e^{−βE} / Z. (15.72)

If the density of states g(E) were known, we could calculate the mean energy (and other thermodynamic quantities) at any temperature from the relation

⟨E⟩ = (1/Z) Σ_E E g(E) e^{−βE}. (15.73)

Hence, the density of states is a quantity of much interest.

Suppose that we were to try to compute g(E) by doing a random walk in energy space by flipping the spins at random and accepting all configurations that are obtained. Then the histogram of the energy, H(E), the number of visits to each possible energy E of the system, would converge to g(E) if the walk visited all possible configurations. In practice, it would be impossible to realize such a long random walk given the extremely large number of configurations. For example, the Ising model on a 10 × 10 square lattice has 2^100 ≈ 1.3 × 10^30 spin configurations. The main difficulty of doing a simple random walk to determine g(E) is that the walk would spend most of its time visiting the same energy values over and over again and would not reach the values of E that are less probable.
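Whatever method is used to estimate the density of states, once ln g(E) is available, averages such as (15.73) follow by direct summation. The following minimal sketch computes the mean energy; the arrays energies and lnG are assumptions about how the data are stored, and the shift by the largest exponent is a standard device (not from the text) to avoid numerical overflow.

double meanEnergy(double[] energies, double[] lnG, double beta) {
  double max = Double.NEGATIVE_INFINITY;
  for (int i = 0; i < lnG.length; i++) {
    max = Math.max(max, lnG[i] - beta*energies[i]);        // largest exponent
  }
  double z = 0, esum = 0;
  for (int i = 0; i < lnG.length; i++) {
    double w = Math.exp(lnG[i] - beta*energies[i] - max);  // relative Boltzmann weight
    z += w;
    esum += energies[i]*w;
  }
  return esum/z;                                           // <E> at inverse temperature beta
}

The same loop with an extra factor of the energy gives ⟨E²⟩ and hence the specific heat, which is what Problem 15.30(c) below asks for.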
The idea of the Wang–Landau algorithm is to do a random walk in energy space by flipping single spins at random and to accept the changes with a probability that is proportional to the reciprocal of the density of states. That is, energy values that would be visited often using a simple random walk would be visited less often because they have a bigger density of states. There is only one problem—we don’t know the density of states. We will see that the Wang–Landau algorithm estimates the density of states at the same time that it does a random walk in phase space. For simplicity, we discuss the algorithm in the context of the Ising model for which E is a discrete variable.

1. Start with an initial arbitrary configuration of spins and a guess for the density of states. The simplest guess is to set g(E) = 1 for all possible energies E.

2. Choose a spin at random and make a trial flip. Compute the energy before the flip, E1, and after, E2, and accept the change with probability

p(E1 → E2) = min[g(E1)/g(E2), 1]. (15.74)

Equation (15.74) implies that if g(E2) ≤ g(E1), the state with energy E2 is always accepted; otherwise, it is accepted with probability g(E1)/g(E2). That is, the state with energy E2 is accepted if a random number r ≤ g(E1)/g(E2).

3. Suppose that after step (2) the energy of the system is E. (E is E2 if the change is accepted or remains at E1 if the change is not accepted.) Then

g(E) → f g(E) (15.75)
H(E) → H(E) + 1. (15.76)

That is, we multiply the current value of g(E) by the modification factor f > 1, and we update the existing entry for H(E) in the energy histogram. Because g(E) becomes very large, in practice we must work with the logarithm of the density of states so that ln(g(E)) will fit into double precision numbers. Therefore, each update of the density of states is implemented as ln(g(E)) → ln(g(E)) + ln(f), and the ratio of the density of states is computed as exp[ln(g(E1)) − ln(g(E2))].

4. A reasonable choice of the initial modification factor is f = f0 = e ≈ 2.71828. If f0 is too small, the random walk will need a very long time to reach all possible energies; however, too large a choice of f0 will lead to large statistical errors.

5. Proceed with the random walk in energy space until a “flat” histogram H(E) is obtained, that is, until all the possible energy values are visited an approximately equal number of times. If the histogram were truly flat, all the possible energies would have been visited an equal number of times. Of course, it is impossible to obtain a perfectly flat histogram, and we will say that H(E) is flat when H(E) for all possible E is not less than p times the average histogram ⟨H(E)⟩; p is chosen according to the size and the complexity of the system and the desired accuracy of the density of states. For the two-dimensional Ising model on small lattices, p can be chosen to be as high as 0.95, but for large systems the criterion for flatness may never be satisfied if p is too close to unity.

6. Once the flatness criterion has been satisfied, reduce the modification factor f using a function such as f1 = √f0, reset the histogram to H(E) = 0 for all values of E, and begin the next iteration of the random walk, during which the density of states is modified by f1 at each step. The density of states is not reset during the simulation. We continue performing the random walk until the histogram H(E) is again flat.
7. Reduce the modification factor f_{i+1} = √f_i, reset the histogram to H(E) = 0 for all values of E, and continue the random walk. Stop the simulation when f is smaller than a predefined value (such as f_final = exp(10^{−8}) ≈ 1.00000001). The modification factor acts as a control parameter for the accuracy of the density of states during the simulation and also determines how many Monte Carlo sweeps are necessary for the entire simulation.

At the end of the simulation, the algorithm provides only a relative density of states. To determine the normalized density of states gn(E), we can either use the fact that the total number of states for the Ising model is Σ_E g(E) = 2^N, or that the number of ground states (for which E = −2N) is 2. The latter normalization guarantees the accuracy of the density of states at low energies, which is important in the calculation of thermodynamic quantities at low temperature. If we apply the former condition, we cannot guarantee the accuracy of g(E) for energies at or near the ground state, because the rescaling factor is dominated by the maximum density of states. We can use one of these two normalization conditions to obtain the absolute density of states and use the other normalization condition to check the accuracy of our result.

Problem 15.30. Sampling the density of states

(a) Implement the Wang–Landau algorithm for the two-dimensional Ising model for L = 4, 8, and 16. For simplicity, choose p = 0.8 as your criterion for flatness. How many Monte Carlo steps per spin are needed for each iteration? Determine the density of states and describe its qualitative dependence on E.

(b) Compute P(E) = g(E) e^{−βE}/Z for different temperatures for the L = 16 system. If T = 0.1, what range of energies will contribute to the specific heat? What is the range of relevant energies for T = 1.0, T = Tc, and T = 4.0?

(c) Use the density of states that you computed in part (a) to compute the mean energy, the specific heat, the free energy, and the entropy as a function of temperature. Compare your results to your results for E and C that you found using the Metropolis algorithm in Problem 15.16.

(d) Use the Wang–Landau algorithm to determine the density of states for the one-dimensional Ising model. In this case you can compare your computed values of g(E) to the exact answer:

g(E) = 2 N! / [i! (N − i)!], (15.77)

where E = 2i − N, i = 0, 2, ..., N, and N is even. How does the accuracy of the computed values of g(E) depend on the choice of p for the flatness criterion? (Exact results are available for g(E) for the two-dimensional Ising model as well, but no explicit combinatorial formula exists. See the article by Beale.)

(e)∗ The results that you have obtained so far have probably convinced you that the Wang–Landau algorithm is ideal for simulating a variety of systems with many degrees of freedom. What about critical slowing down? Does the Wang–Landau algorithm overcome this limitation of other single spin flip algorithms? To gain some insight, we ask, given the exact g(E), how efficiently does the Wang–Landau algorithm sample the different values of E? Use either the exact density of states in two dimensions computed by Beale or the approximate one that you computed in part (a) and set f = 1.
Because the system is doing a random walk in energy space, it is reasonable to compute the diffusion constant of the random walker in energy space:

D_E(t) = ⟨[E(t) − E(0)]²⟩/t, (15.78)

where t is the time difference, and the choice of the time origin is arbitrary. The idea is to find the dependence of D on the energy E of the system at a particular time origin. How long does it take the system to return to this energy? Run for a sufficiently long time so that D_E is independent of t. Plot D_E as a function of E. Where is D a maximum? If time permits, determine D_E at the energy Ec corresponding to the critical temperature. How does D_{Ec} depend on L?

15.13 More Applications

You are probably convinced that Monte Carlo methods are powerful, flexible, and applicable to a wide variety of systems. Extensions to the Monte Carlo methods that we have not discussed include multiparticle moves, biased moves where particles tend to move in the direction of the force on them, bit manipulation for Ising-like models, and the use of multiple processors to update different parts of a large system simultaneously. We also have not described the simulation of systems with long-range potentials such as Coulombic systems and dipole-dipole interactions. For these potentials, it is necessary to include the interactions of the particles in the center cell with the infinite set of periodic images.

We conclude this chapter with a discussion of Monte Carlo methods in a context that might seem to have little in common with the types of problems we have discussed. This context is called multivariate or combinatorial optimization, a fancy way of saying, “How do you find the global minimum of a function that depends on many parameters?” Problems of this type arise in many areas of scheduling and design as well as in physics, biology, and chemistry.

We explain the nature of this type of problem for the traveling salesman problem, although we would prefer to call it the traveling peddler or traveling salesperson problem. Suppose that a salesman wishes to visit N cities and follow a route such that no city is visited more than once and the end of the trip coincides with the beginning. Given these constraints, the problem is to find the optimum route such that the total distance traveled is a minimum. An example of N = 8 cities and a possible route is shown in Figure 15.7.

Figure 15.7: What is the optimum route for this random arrangement of N = 8 cities? The route begins and ends at city W. A possible route is shown.

All known exact methods for determining the optimal route require a computing time that increases as e^N, and hence, in practice, an exact solution can be found only for a small number of cities. (The traveling salesman problem belongs to a large class of problems known as NP-complete, where NP refers to nondeterministic polynomial. No algorithm that runs in a time proportional to a finite polynomial in N is known for such problems on standard computers, although polynomial time algorithms exist for hypothetical nondeterministic machines.) What is a reasonable estimate for the maximum number of cities that you can consider without the use of a computer?

To understand the nature of the different approaches to the traveling salesman problem, consider the plot in Figure 15.8 of the “energy” or “cost” function E(a). We can associate E(a) with the length of the route and interpret a as a parameter that represents the order in which the cities are visited.
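For the traveling salesman problem, E(a) is simply the total length of the closed route. A minimal sketch of this cost function follows; the arrays x, y (city coordinates), and route (order of visits) are assumptions about how the state is stored.

double routeLength(double[] x, double[] y, int[] route) {
  double length = 0;
  int N = route.length;
  for (int i = 0; i < N; i++) {
    int a = route[i];
    int b = route[(i+1)%N];   // the trip ends where it began
    length += Math.sqrt((x[a]-x[b])*(x[a]-x[b]) + (y[a]-y[b])*(y[a]-y[b]));
  }
  return length;
}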
If E(a) has several local minima, what is a good strategy for finding the global (absolute) minimum of E(a)? One way is to vary a systematically and find the value of E everywhere. This way corresponds to an exact enumeration method and would mean knowing the length of each possible route, an impossible task if the number of cities is too large. Another way is to use a heuristic method, that is, an approximate method for finding a route that is close to the absolute minimum.

One strategy is to choose a value of a, generate a small random change δa, and accept this change if E(a + δa) is less than or equal to E(a). This iterative improvement strategy corresponds to a search for steps that lead downhill (see Figure 15.8). Because this strategy usually leads to a local and not a global minimum, it is useful to begin from several initial choices of a and to keep the best result. What would be the application of this type of strategy to the salesman problem?

Figure 15.8: Plot of the function E(a) as a function of the parameter a.

Because we cannot optimize the path exactly when N becomes large, we have to be satisfied with solving the optimization problem approximately and finding a relatively good local minimum. To understand the motivation for the simulated annealing algorithm, consider a seemingly unrelated problem. Suppose we wish to make a perfect single crystal. You might know that we should start with the material at a high temperature at which the material is a liquid melt and then gradually lower the temperature. If we lower the temperature too quickly (a rapid quench), the resulting crystal would have many defects or not become a crystal at all. The gradual lowering of the temperature is known as annealing.

The method of annealing can be used to estimate the minimum of E(a). We choose a value of a, generate a small random change δa, and calculate E(a + δa). If E(a + δa) is less than or equal to E(a), we accept the change. However, if ∆E = E(a + δa) − E(a) > 0, we accept the change with a probability p = e^{−∆E/T}, where T is an effective temperature. This procedure is the familiar Metropolis algorithm with the temperature playing the role of a control parameter. The simulated annealing process consists of first choosing a value of T for which most moves are accepted and then gradually lowering the temperature. At each temperature, the simulation should last long enough for the system to reach quasiequilibrium. The annealing schedule, that is, the rate of temperature decrease, determines the quality of the solution.

The idea is to allow moves that result in solutions of worse quality than the current solution (uphill moves) in order to escape from local minima. The probability of making such a move is decreased during the search. The slower the temperature is lowered, the higher the chance of finding the optimum solution, but the longer the run time. The effective use of simulated annealing depends on finding an annealing schedule that yields good solutions without taking too much time. It has been proven that if the cooling rate is sufficiently slow, the absolute (global) minimum will eventually be reached. The bounds for “sufficiently slow” depend on the properties of the search landscape (the nature of E(a)) and are exceeded for most problems of interest. However, simulated annealing is usually superior to conventional heuristic algorithms.
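A minimal sketch of the annealing loop described above, with a simple geometric cooling schedule, might look like the following. The methods proposeChange, acceptChange, and rejectChange and the constants T0, Tmin, and stepsPerT are assumptions, not a prescription from the text.

public void anneal() {
  for (double T = T0; T > Tmin; T *= 0.99) {    // geometric annealing schedule
    for (int step = 0; step < stepsPerT; step++) {
      double dE = proposeChange();              // trial change in E(a), for example,
                                                // interchange two cities in the route
      if (dE <= 0 || Math.random() < Math.exp(-dE/T)) {
        acceptChange();                         // downhill moves and some uphill moves
      } else {
        rejectChange();                         // otherwise restore the old route
      }
    }
  }
}

The cooling factor 0.99 and the number of steps per temperature control the tradeoff between the quality of the solution and the run time.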
The moral of the simulated annealing method is that sometimes it is necessary to climb a hill to reach a valley. The first application of the method of simulated annealing was to the optimal design of computers. In Problem 15.31 we apply this method to the traveling salesman problem.

Problem 15.31. Simulated annealing and the traveling salesman problem

(a) Generate a random arrangement of N = 8 cities in a square of linear dimension L = √N and calculate the optimum route by hand. Then write a Monte Carlo program and apply the method of simulated annealing to this problem. For example, use two arrays to store the x- and y-coordinates of each city and an array to store the distances between them. The state of the system, that is, the route represented by a sequence of cities, can be stored in another array. The length of this route is associated with the energy of an imaginary thermal system. A reasonable choice for the initial temperature is one that is of the same order as the initial energy. One way to generate a random rearrangement of the route is to choose two cities at random and to interchange the order of visit. Choose this method or one that you devise and find a reasonable annealing schedule. Compare your annealing results to exact results whenever possible. Extend your results to larger N, for example, N = 12, 24, and 48. For a given annealing schedule, determine the probability of finding a route of a given length. More suggestions can be found in the references.

(b) The microcanonical Monte Carlo algorithm (demon) discussed in Section 15.3 can also be used to do simulated annealing. The advantages of the demon algorithm are that it is deterministic and allows large temperature fluctuations. One way to implement the analog of simulated annealing is to impose a maximum value on the energy of the demon, Ed,max, which is gradually decreased. Guo et al. choose Ed,max to be initially equal to √N/4. Their results are comparable to the usual simulated annealing method but require approximately half the CPU time. Apply this method to the same city positions that you considered in part (a) and compare your results.

15.14 Projects

Many of the original applications of Monte Carlo methods were done for systems of approximately one hundred particles and lattices of order 32² spins. It would be instructive to redo many of these applications with much better statistics and with larger system sizes. In the following, we discuss some additional recent developments, but we have omitted other important topics such as Brownian dynamics and umbrella sampling. More ideas for projects can be found in the references.

Project 15.32. Overcoming critical slowing down

The usual limiting factor of most simulations is the speed of the computer. Of course, one way to overcome this problem is to use a faster computer. Near a continuous phase transition, the most important limiting factor on even the fastest available computers is the existence of critical slowing down (see Problem 15.19). In this project we discuss the nature of critical slowing down and ways of overcoming it in the context of the Ising model. As we have mentioned, the existence of critical slowing down is related to the fact that the size of the correlated regions of spins becomes very large near the critical point.
15.14 Projects

Many of the original applications of Monte Carlo methods were done for systems of approximately one hundred particles and lattices of order 32^2 spins. It would be instructive to redo many of these applications with much better statistics and with larger system sizes. In the following, we discuss some additional recent developments, but we have omitted other important topics such as Brownian dynamics and umbrella sampling. More ideas for projects can be found in the references.

Project 15.32. Overcoming critical slowing down

The usual limiting factor of most simulations is the speed of the computer. Of course, one way to overcome this problem is to use a faster computer. Near a continuous phase transition, the most important limiting factor on even the fastest available computers is the existence of critical slowing down (see Problem 15.19). In this project we discuss the nature of critical slowing down and ways of overcoming it in the context of the Ising model. As we have mentioned, the existence of critical slowing down is related to the fact that the size of the correlated regions of spins becomes very large near the critical point.

The large size of the correlated regions and the corresponding divergent behavior of the correlation length ξ near Tc imply that the time τ required for a region to lose its coherence becomes very long if a local dynamics is used. At T = Tc, τ ∼ L^z for L ≫ 1. For single spin flip algorithms, z ≈ 2, and τ becomes very large for L ≫ 1. On a serial computer, the CPU time needed to obtain n configurations increases as L^2, the time needed to visit L^2 spins. This factor of L^2 is expected and not a problem because a larger system contains proportionally more information. However, the time needed to obtain n approximately independent configurations is of order τL^2 ∼ L^{2+z} ≈ L^4 for the Metropolis algorithm. We conclude that an increase of L by a factor of 10 requires 10^4 times more computing time. Hence, the existence of critical slowing down limits the maximum value of L that can be considered.

If we are interested only in the static properties of the Ising model, the choice of dynamics is irrelevant as long as the transition probability satisfies the detailed balance condition (15.18). It is reasonable to look for a global algorithm for which groups or clusters of spins are flipped simultaneously. We are already familiar with cluster properties in the context of percolation (see Chapter 12). A naive definition of a cluster of spins might be a domain of parallel nearest neighbor spins. We can make this definition explicit by introducing a bond between any two nearest neighbor spins that are parallel. The introduction of a bond between parallel spins defines a site-bond percolation problem. More generally, we may assume that such a bond exists with probability p and that this bond probability depends on the temperature T. The dependence of p on T can be determined by requiring that the percolation transition of the clusters occurs at the Ising critical point and by requiring that the critical exponents associated with the clusters be identical to the analogous thermal exponents. For example, we can define a critical exponent νp to characterize the divergence of the connectedness length of the clusters near pc. The analogous thermal exponent ν quantifies the divergence of the thermal correlation length ξ near Tc. We will argue in the following that these (and other) critical exponents are identical if we define the bond probability as

p = 1 − e^{−2J/kT}   (bond probability). (15.79)

The relation (15.79) holds for any spatial dimension. What is the value of p at T = Tc for the two-dimensional Ising model on the square lattice?

Figure 15.9: (a) A cluster of two up spins. (b) A cluster of two down spins. The filled and open circles represent the up and down spins, respectively. Note the bond between the two spins in the cluster. Adapted from Newman and Barkema.

A simple argument for the temperature dependence of p in (15.79) is as follows. Consider the two configurations in Figure 15.9, which differ from one another by the flip of the cluster of two spins. In Figure 15.9(a) the six nearest neighbor spins of the cluster are in the opposite direction and, hence, are not part of the cluster. Thus, the probability of this configuration with a cluster of two spins is p e^{βJ} e^{−6βJ}, where p is the probability of a bond between the two up spins, e^{βJ} is proportional to the probability that these two spins are parallel, and e^{−6βJ} is proportional to the probability that the six nearest neighbors are antiparallel.
In Figure 15.9(b) the cluster spins have been flipped, and the possible bonds between the cluster spins and its nearest neighbors have to be "broken." The probability of this configuration with a cluster of two (down) spins is p(1 − p)^6 e^{βJ} e^{6βJ}, where the factor of (1 − p)^6 is the probability that the six nearest neighbor spins are not part of the cluster. Because we want the probability that a cluster is flipped to be unity, we need to have the probability of the two configurations and their corresponding clusters be the same. Hence, we must have

p e^{βJ} e^{−6βJ} = p (1 − p)^6 e^{βJ} e^{6βJ}, (15.80)

or (1 − p)^6 = e^{−12βJ}. It is straightforward to solve for p and obtain the relation (15.79).

Now that we know how to generate clusters of spins, we can use these clusters to construct a global dynamics instead of flipping only one spin at a time as in the Metropolis algorithm. The idea is to grow a single (site-bond) percolation cluster in a way that is analogous to the single (site) percolation cluster algorithm discussed in Section 13.1. The algorithm can be implemented by the following steps:

(i) Choose a seed spin at random. Its four nearest neighbor sites (on the square lattice) are the perimeter sites. Form an ordered array corresponding to the perimeter spins that are parallel to the seed spin and define a counter for the total number of perimeter spins.

(ii) Choose the first spin in the ordered perimeter array. Remove it from the array and replace it by the last spin in the array. Generate a random number r. If r ≤ p, the bond exists between the two spins, and the perimeter spin is added to the cluster.

(iii) If the spin is added to the cluster, inspect its parallel perimeter spins. If any of these spins are not already a part of the cluster, add them to the end of the array of perimeter spins.

(iv) Repeat steps (ii) and (iii) until no perimeter spins remain.

(v) Flip all the spins in the single cluster.

This algorithm is known as single cluster flip or Wolff dynamics. Note that bonds, rather than sites, are tested so that a spin might have more than one chance to join a cluster. In the following, we consider both the static and dynamical properties of the two-dimensional Ising model using the Wolff algorithm to generate the configurations.
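A sketch of steps (i)–(v) follows. We use a last-in, first-out collection instead of the swap-with-last array bookkeeping described in step (ii), which does not change the statistics; the class and method names are ours, and the spins are assumed to be stored as ±1 in a two-dimensional array with periodic boundary conditions.

import java.util.ArrayDeque;
import java.util.Random;

public class WolffStep {
  int L;
  int[][] spin;                               // spins stored as +1 or -1
  Random rng = new Random();

  void oneClusterFlip(double p) {             // p = 1 - exp(-2J/kT), relation (15.79)
    boolean[][] inCluster = new boolean[L][L];
    ArrayDeque<int[]> perimeter = new ArrayDeque<>();
    int i = rng.nextInt(L), j = rng.nextInt(L);     // step (i): choose a seed spin
    int seed = spin[i][j];
    inCluster[i][j] = true;
    addParallelNeighbors(i, j, seed, perimeter);
    while (!perimeter.isEmpty()) {            // steps (ii)-(iv)
      int[] site = perimeter.pop();
      int x = site[0], y = site[1];
      if (inCluster[x][y]) continue;          // a spin may appear more than once; each entry is one bond test
      if (rng.nextDouble() <= p) {            // the bond exists: add the spin to the cluster
        inCluster[x][y] = true;
        addParallelNeighbors(x, y, seed, perimeter);
      }
    }
    for (int x = 0; x < L; x++)               // step (v): flip every spin in the cluster
      for (int y = 0; y < L; y++)
        if (inCluster[x][y]) spin[x][y] = -spin[x][y];
  }

  void addParallelNeighbors(int x, int y, int seed, ArrayDeque<int[]> perimeter) {
    int[][] nn = {{1,0},{-1,0},{0,1},{0,-1}};
    for (int[] d : nn) {
      int nx = (x + d[0] + L) % L, ny = (y + d[1] + L) % L;   // periodic boundaries
      if (spin[nx][ny] == seed) perimeter.push(new int[]{nx, ny});
    }
  }
}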
(a) Modify your program for the Ising model on a square lattice so that single cluster flip dynamics (the Wolff algorithm) is used. Compute the mean energy and magnetization for L = 16 as a function of T for T = 2.0 to 2.7 in steps of 0.1. Compare your results to those obtained using the Metropolis algorithm. How many cluster flips do you need to obtain comparable accuracy at each temperature? Is the Wolff algorithm more efficient at every temperature near Tc?

(b) Fix T at the critical temperature of the infinite lattice (Tc = 2/ln(1 + √2)) and use finite size scaling to estimate the values of the various static critical exponents, for example, γ and α. Compare your results to those obtained using the Metropolis algorithm.

(c) Because we are generating site-bond percolation clusters, we can study their geometrical properties as we did for site percolation. For example, measure the distribution ns of cluster sizes at p = pc (see Problem 13.3). How does ns depend on s for large s (see Project 13.15)? What is the fractal dimension of the clusters in the Ising model at T = Tc?

(d) The natural unit of time for single cluster flip dynamics is the number of cluster flips tcf. Measure CM(tcf) and/or CE(tcf) and estimate the corresponding correlation time τcf for T = 2.5, 2.4, 2.3, and Tc for L = 16. As discussed in Problem 15.19, τcf can be found from the relation τcf = Σ_{tcf=1} C(tcf). The sum is cut off at the first negative value of C(tcf). Estimate the value of zcf from the relation τcf ∼ L^{zcf}.

(e) To compare our results for the Wolff algorithm to our results for the Metropolis algorithm, we should use the same unit of time. Because only a fraction of the spins are updated at each cluster flip, the time tcf is not equal to the usual unit of time, which corresponds to an update of the entire lattice or one Monte Carlo step per spin. We have that τ measured in Monte Carlo steps per spin is related to τcf by τ = τcf ⟨c⟩/L^2, where ⟨c⟩ is the mean number of spins in the single clusters, and L^2 is the number of spins in the entire lattice. Verify that the mean cluster size scales as ⟨c⟩ ∼ L^{γ/ν} with γ = 7/4 and ν = 1. (The quantity ⟨c⟩ is the same quantity as the mean cluster size S defined in Chapter 12. The exponents characterizing the divergence of the various properties of the clusters are identical to the analogous thermal exponents.)

(f) To obtain the value of z that is directly comparable to the value found for the Metropolis algorithm, we need to rescale the time as in part (e). We have that τ ∼ L^z ∝ L^{zcf} L^{γ/ν} L^{−d}. Hence, z is related to the measured value of zcf by z = zcf − (d − γ/ν). What is your estimated value of z? (It has been estimated that zcf ≈ 0.50 for the d = 2 Ising model, which would imply that z ≈ 0.25.)

(g) One of the limitations of the usual implementation of the Metropolis algorithm is that only one spin is flipped at a time. However, there is no reason why we could not choose f spins at random, compute the change in energy ∆E for flipping these f spins, and accept or reject the trial move in the usual way according to the Boltzmann probability. Explain why this generalization of the Metropolis algorithm would be very inefficient, especially if f ≫ 1. We conclude that the groups of spins to be flipped must be chosen with the physics of the system in mind and not simply at random.

Another cluster algorithm is to assign bonds between all pairs of parallel spins with probability p. As usual, no bonds are included between sites that have different spin orientations. From this configuration of bonds, we can form clusters of spins using one of the cluster identification algorithms we discussed in Chapter 12. The smallest cluster contains a single spin. After the clusters have been identified, all the spins in each cluster are flipped with probability 1/2. This algorithm is known as the Swendsen-Wang algorithm and preceded the Wolff algorithm. Because the Wolff algorithm is easier to program and gives a smaller value of z than the Swendsen-Wang algorithm for the d = 3 and d = 4 Ising models, the Wolff algorithm is more commonly used.

Project 15.33. Invaded cluster algorithm

In Problem 13.7 we found that invasion percolation is an example of a self-organized critical phenomenon. In this cluster growth algorithm, random numbers are independently assigned to the bonds of a lattice. The growth starts from the seed sites of the left-most column. At each step the cluster grows by the occupation of the perimeter bond with the smallest random number. The growth continues until the cluster satisfies a stopping condition.
We found that if we stop adding sites when the cluster is comparable in extent to the linear dimension L, then the fraction of bonds that are occupied approaches the percolation threshold pc as L → ∞. The invasion percolation algorithm automatically finds the percolation threshold!

Machta and co-workers have used this idea to find the critical temperature of a spin system without knowing its value in advance. For simplicity, we will discuss their algorithm in the context of the Ising model, although it can be easily generalized to the q-state Potts model (see the references). Consider a lattice on which there is a spin configuration {si}. The bonds of the lattice are assigned a random order. Bonds (i,j) are tested in this assigned order to see if si is parallel to sj. If so, the bond is occupied, and spins i and j are part of the same cluster. Otherwise, the bond is not occupied and is not considered for the remainder of the current Monte Carlo step. The set of occupied bonds partitions the lattice into clusters of connected sites. The clusters can be found using the Newman–Ziff algorithm (see Section 12.3). The cluster structure evolves until a stopping condition is satisfied. Then a new spin configuration is obtained by flipping each cluster with probability 1/2, thus completing one Monte Carlo step. The fraction f of bonds that were occupied during the growth process and the energy of the system are measured. The bonds are then randomly reordered, and the process begins again. Note that the temperature is not an input parameter. If open boundary conditions are used, the appropriate stopping rule is that a cluster spans the lattice (see Chapter 12, page 450). For periodic boundary conditions, the spanning rule discussed in Project 12.17 is appropriate.

Write a program to simulate the invaded cluster algorithm for the Ising model on the square lattice. Start with all spins up and determine how many Monte Carlo steps are needed for equilibration. How does this number compare to that required by the Metropolis algorithm at the critical temperature for the same value of L? An estimate for the critical temperature can be found from the relation (15.79) with f corresponding to p. After you are satisfied that your program is working properly, determine the dependence of the critical temperature on the concentration c of nonmagnetic impurities; that is, randomly place nonmagnetic impurities on a fraction c of the sites.

Project 15.34. Physical test of random number generators

In Section 7.9 we discussed various statistical tests for the quality of random number generators. In this project we will find that the usual statistical tests might not be sufficient for determining the quality of a random number generator for a particular application. The difficulty is that the quality of a random number generator for a specific application depends in part on how the subtle correlations that are intrinsic to all deterministic random number generators couple to the way that the random number sequences are used. In this project we explore the quality of two random number generators when they are used to implement single spin flip dynamics (the Metropolis algorithm) and single cluster flip dynamics (the Wolff algorithm) for the two-dimensional Ising model.
(a) Write methods to generate sequences of random numbers based on the linear congruential algorithm

xn = 16807 xn−1 mod (2^31 − 1), (15.81)

and the generalized feedback shift register (GFSR) algorithm

xn = xn−103 ⊕ xn−250. (15.82)

In both cases xn is the nth random number. Both algorithms require that xn be divided by the largest possible value of xn to obtain numbers in the range 0 ≤ xn < 1. The GFSR algorithm requires bit manipulation (a sketch of both generators follows this problem). Which random number generator does a better job of passing the various statistical tests discussed in Problem 7.35?

(b) Use the Metropolis algorithm and the linear congruential random number generator to determine the mean energy per spin E/N and the specific heat (per spin) C for the L = 16 Ising model at T = Tc = 2/ln(1 + √2). Make ten independent runs (that is, ten runs that use different random number seeds) and compute the standard deviation of the means σm from the ten values of E/N and C, respectively. Published results by Ferrenberg, Landau, and Wong are for 10^6 Monte Carlo steps per spin for each run. Calculate the differences δe and δc between the average of E/N and C over the ten runs and the exact values (to five decimal places), E/N = −1.45306 and C = 1.49871. If the ratio δ/σm for the two quantities is of order unity, then the random number generator does not appear to be biased. Repeat your runs using the GFSR algorithm to generate the random number sequences. Do you find any evidence of statistical bias?

(c) Repeat part (b) using Wolff dynamics. Do you find any evidence of statistical bias?

(d) Repeat the computations in parts (b) and (c) using the random number generator supplied with your programming language.
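A minimal sketch of the two generators of part (a) is given below; the class, method, and variable names are ours. Note that the GFSR lag table must first be filled with 250 numbers from another generator before (15.82) can be applied.

public class SimpleRNG {
  static long seed = 12345;                   // any value in 1 ... 2^31 - 2

  // Linear congruential generator x_n = 16807 x_{n-1} mod (2^31 - 1), using 64-bit arithmetic.
  static double lcg() {
    seed = (16807L*seed) % 2147483647L;
    return seed/2147483647.0;                 // uniform in [0, 1)
  }

  // Generalized feedback shift register x_n = x_{n-103} XOR x_{n-250}, kept in a circular
  // buffer; lag[idx] currently holds x_{n-250}.
  static int[] lag = new int[250];
  static int idx = 0;
  static { for (int i = 0; i < 250; i++) { lcg(); lag[i] = (int) seed; } }  // warm-up fill

  static double gfsr() {
    int x = lag[(idx + 147) % 250] ^ lag[idx];  // (idx + 250 - 103) mod 250 = idx + 147
    lag[idx] = x;                               // overwrite the oldest entry
    idx = (idx + 1) % 250;
    return (x & 0x7fffffff)/2147483648.0;       // mask the sign bit and scale to [0, 1)
  }
}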
Project 15.35. Nucleation and the Ising model

(a) Equilibrate the two-dimensional Ising model at T = 4Tc/9 and B = 0.3 for a system with L ≥ 50. What is the equilibrium value of m? Then flip the magnetic field so that it points down, that is, B = −0.3. Use the Metropolis algorithm and plot m as a function of the time t (the number of Monte Carlo steps per spin). What is the qualitative behavior of m(t)? Does it fluctuate about a positive value for a time long enough to determine various averages? If so, the system can be considered to have been in a metastable state. Watch the spins evolve for a time before m changes sign. Visually determine a place in the lattice where a "droplet" of the stable phase (down spins) first appears and then grows. Change the random number seed and rerun the simulation. Does the droplet appear in the same spot at the same time? Can the magnitude of the field be increased further, or is there an upper bound above which a metastable state is not well defined?

(b) As discussed in Project 15.32, we can define clusters of spins by placing a bond with probability p between parallel spins. In this case there is an external field, and the proper definition of the clusters is more difficult. For simplicity, assume that there is a bond between all nearest-neighbor down spins and find all the clusters of down spins. One way to identify the droplet that initiates the decay of the metastable state is to monitor the number of spins in the largest cluster as a function of time after the quench. At what time does the number of spins in the largest cluster begin to grow quickly? This time is an estimate of the nucleation time. Another way of estimating the nucleation time is to follow the evolution of the center of mass of the largest cluster. For early times after the quench, the center of mass position has large fluctuations. However, at a certain time these fluctuations decrease considerably, which is another criterion for the nucleation time. What is the order of magnitude of the nucleation time?

(c) While the system is in a metastable state, clusters of down spins grow and shrink randomly until eventually one of the clusters becomes large enough to grow, nucleation occurs, and the system decays to its stable macroscopic state. The cluster that initiates this decay is called the nucleating droplet. This type of nucleation is due to spontaneous thermal fluctuations and is called homogeneous nucleation. Although the criteria for the nucleation time that we used in part (b) are plausible, they are not based on fundamental considerations. From theoretical considerations the nucleating droplet can be thought of as a cluster that just makes it to the top of the saddle point of the free energy that separates the metastable and stable states. We can identify the nucleating droplet by using the fact that a saddle point structure should initiate the decay of the metastable state 50% of the time. The idea is to save the spin configurations at regular intervals at about the time that nucleation is thought to have occurred. We then restart the simulation using a saved configuration at a certain time and use a different random number sequence to flip the spins. If we have intervened at a time such that the largest cluster decays in more than 50% of the trials, then the intervention time (the time at which we changed the random number seed) is before nucleation. Similarly, if less than 50% of the clusters decay, the intervention is after the nucleation time. The nucleating droplet is the cluster that decays in approximately half of the trial interventions. Because we need to do a number of interventions (usually in the range 20–100) at different times, the intervention method is much more CPU intensive than the other criteria. However, it has the advantage that it has a sound theoretical basis. Redo some of the simulations that you did in part (b) and compare the different estimates of the nucleation time. What is the nature and size of the nucleating droplet? If time permits, determine the probability that the system nucleates at time t for a given quench depth. (Measure the time t after the flip of the field.)

(d) Heterogeneous nucleation occurs in nature because of the presence of impurities, defects, or walls. One way of simulating heterogeneous nucleation in the Ising model is to fix a certain number of spins in the direction of the stable phase (down). For simplicity, choose the impurity to be five spins in the shape of a + sign. What is the effect of the impurity on the lifetime of the metastable state? What is the probability of droplet growth on and off the impurity as a function of quench depth B?

(e) The questions raised in parts (b)–(d) become even more interesting when the interaction between the spins extends beyond nearest neighbors. Assume that a given spin interacts with all spins that are within a distance R with an interaction strength of 4J/q, where q is the number of spins within the interaction range R. (Note that q = 4 for nearest neighbor interactions on the square lattice.) A good choice is R = 10, although your preliminary simulations should be for smaller R. How does the value of Tc change as R is increased?
Project 15.36. The n-fold way: Simulations at low temperature

Monte Carlo simulations become very inefficient at low temperatures because almost all trial configurations will be rejected. For example, consider an Ising model for which all spins are up, but a small magnetic field is applied in the negative direction. The equilibrium state will have most spins pointing down. Nevertheless, if the magnetic field is small and the temperature is low enough, equilibration will take a very long time. What we need is a more efficient way of sampling configurations if the acceptance probability is low. The n-fold way algorithm is one such method. The idea is to accept more low probability configurations but to weight them appropriately.

If we use the usual Metropolis rule, then the probability of flipping the ith spin is

pi = min(1, e^{−∆E/kT}). (15.83)

One limitation of the Metropolis algorithm is that it becomes very inefficient if the probabilities pi are very small. If we sum over all the spins, then we can define the total weight

Q = Σ_i pi. (15.84)

The idea is to choose a spin to flip (with probability one) by computing a random number rQ between 0 and Q and finding the spin i that satisfies the condition

Σ_{k=0}^{i−1} pk ≤ rQ < Σ_{k=0}^{i} pk. (15.85)

There are two more ingredients we need to make this algorithm practical. We need to determine how long a configuration would remain unchanged if we had used the Metropolis algorithm. Also, the algorithm would be very inefficient because on average the computation of which spin to flip from (15.85) would take O(N) operations. This second problem can be easily overcome by realizing that there are only a few possible values of pi. For example, for the Ising model on a square lattice in a magnetic field, there are only n = 10 possible values of pi. Thus, instead of (15.85), we have

Σ_{α=0}^{i−1} nα pα ≤ rQ < Σ_{α=0}^{i} nα pα, (15.86)

where α labels one of the n possible values of pi, or classes, and nα is the number of spins in class α. Hence, instead of O(N) calculations, we need to perform only O(n) calculations. Once we know which class we have chosen, we can randomly flip one of the spins in that class.

Next we need to determine the time spent in a configuration. The probability in one Metropolis Monte Carlo step of choosing a given spin at random is 1/N, and the probability of actually flipping that spin is pi, which is given by (15.83). Thus, the probability of flipping any spin is

(1/N) Σ_{i=0}^{N−1} pi = (1/N) Σ_{α=0}^{n−1} nα pα = Q/N. (15.87)

The probability of not flipping any spin is q ≡ 1 − Q/N, and the probability of not flipping after s steps is q^s. Thus, if we generate a random number r between 0 and 1, the time s in Monte Carlo steps per spin to remain in the current configuration will be determined by solving

q^{s−1} ≤ r < q^s. (15.88)

If Q/N ≪ 1, then both sides of (15.88) are approximately equal, and we can approximate s by

s ≈ ln r/ln q = ln r/ln(1 − Q/N) ≈ −(N/Q) ln r. (15.89)

That is, we would have to wait s Monte Carlo steps per spin on the average before we would flip a spin using the Metropolis algorithm. Note that the random number r in (15.88) and (15.89) should not be confused with the random number rQ in (15.86).
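The two nonstandard ingredients, choosing a class according to (15.86) and computing the lifetime (15.89), are illustrated by the following sketch; the array and method names are ours.

import java.util.Random;

public class NFoldWay {
  // p[alpha] is the flip probability of class alpha; n[alpha] is the number of spins in it.
  static int chooseClass(double[] p, int[] n, Random rng) {
    double Q = 0;
    for (int alpha = 0; alpha < p.length; alpha++) Q += n[alpha]*p[alpha];
    double rQ = rng.nextDouble()*Q, sum = 0;
    for (int alpha = 0; alpha < p.length; alpha++) {
      sum += n[alpha]*p[alpha];
      if (rQ < sum) return alpha;             // condition (15.86)
    }
    return p.length - 1;                      // guard against roundoff
  }

  // Time (in Monte Carlo steps per spin) spent in the current configuration, from (15.89).
  static double waitingTime(double Q, int N, Random rng) {
    return -(N/Q)*Math.log(1.0 - rng.nextDouble());  // 1 - r avoids log(0); valid for Q/N << 1
  }
}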
The n-fold way algorithm can be summarized by the following steps:

(i) Start with an initial configuration and determine the class to which each spin belongs. Store all the possible values of pi in an array. Compute Q. Store in an array the number of spins in class α, nα.

(ii) Determine s from (15.89). Accumulate any averages, such as the energy and magnetization, weighted by s. Also accumulate the total time, tTotal += s.

(iii) Choose a class using (15.86) and randomly choose which spin in the chosen class to flip.

(iv) Update the classes of the chosen spin and its four neighbors.

(v) Repeat steps (ii)–(iv).

To conveniently carry out step (iv), set up the following arrays: spinClass[i] returns the class of the ith spin, spinInClass[k][alpha] returns the kth spin in class α, and spinIndex[i] returns the value of k for the ith spin to use in the array spinInClass[k][alpha]. If we define the local field of a spin as the sum of the spins of its four neighbors, then this local field can take on the values {−4, −2, 0, 2, 4}. The ten classes correspond to these five local field values with the center spin equal to −1, plus these five local field values with the center spin equal to +1. If we order these ten classes from 0 to 9, then the class of a spin that is flipped changes by +5 mod 10, and the class of a neighbor changes by ±1 according to the new value of the flipped spin.

(a) Write a program to implement the n-fold way algorithm for the Ising model on a square lattice with an applied magnetic field. Check your program by comparing various averages at a few temperatures with the results from your program using the Metropolis algorithm.

(b) Choose the magnetic field B = −0.5 at the temperature T = 1. Begin with an initial configuration of all spins up and use the n-fold way to estimate how long it takes before the majority of the spins flip. Do the same simulation using the Metropolis algorithm. Which algorithm is more efficient?

(c) Repeat part (b) for other temperature and field values. For what conditions is the n-fold way algorithm more efficient than the standard Metropolis algorithm?

(d) Repeat part (b) for different values of the magnetic field and plot the number of Monte Carlo steps needed to flip the spins as a function of 1/|B| for values of B from 0 to ≈ 3. Average over at least 10 starting configurations for each field value.

Figure 15.10: A typical configuration of the planar model on a 24 × 24 square lattice that has been quenched from T = ∞ to T = 0 and equilibrated for 200 Monte Carlo steps per spin after the quench. Note that there are six vortices. The circle around each vortex is a guide to the eye and is not meant to indicate the size of the vortex.

Project 15.37. The Kosterlitz–Thouless transition

The planar model (also called the x-y model) consists of spins of unit magnitude that can point in any direction in the x-y plane. The energy or Hamiltonian function of the planar model in zero magnetic field can be written as

E = −J Σ_{i,j=nn(i)} [si,x sj,x + si,y sj,y], (15.90)

where si,x represents the x-component of the spin at the ith site, J measures the strength of the interaction, and the sum is over all nearest neighbors. We can rewrite (15.90) in a simpler form by substituting si,x = cos θi and si,y = sin θi. The result is

E = −J Σ_{i,j=nn(i)} cos(θi − θj), (15.91)

where θi is the angle that the ith spin makes with the x-axis.

The most studied case is the two-dimensional model on a square lattice. In this case the mean magnetization ⟨M⟩ = 0 for all temperatures T > 0, but, nevertheless, there is a phase transition at a nonzero temperature TKT, which is known as the Kosterlitz–Thouless (KT) transition.
For T ≤ TKT, the spin-spin correlation function C(r) decreases as a power law; for T > TKT, C(r) decreases exponentially. The power law decay of C(r) for T ≤ TKT implies that every temperature below TKT acts as if it were a critical point. We say that the planar model has a line of critical points. In the following, we explore some of the properties of the planar model and the mechanism that causes the transition.

(a) Write a program that uses the Metropolis algorithm to simulate the planar model on a square lattice using periodic boundary conditions. Because θ, and hence the energy of the system, is a continuous variable, it is not possible to store the previously computed values of the Boltzmann factor for each possible value of ∆E. Instead of computing e^{−β∆E} for each trial change, it is faster to set up an array w such that the array element w(j) = e^{−β∆E}, where j is the integer part of 1000∆E. This procedure leads to an energy resolution of 0.001, which should be sufficient for most purposes.

(b) One way to show that the magnetization ⟨M⟩ vanishes for all T is to compute ⟨θ²⟩, where θ is the angle that a spin makes with the magnetization M for a given configuration. (Although the mean magnetization vanishes, M ≠ 0 at any given time.) Compute ⟨θ²⟩ as a function of the number of spins N at T = 0.1 and show that ⟨θ²⟩ diverges as ln N. Begin with a 4 × 4 lattice and choose the maximum change in θi to be ∆θmax = 1.0. If necessary, change ∆θmax so that the acceptance probability is about 40%. If ⟨θ²⟩ diverges, then the fluctuations in the direction of the spins diverge, which implies that there is no preferred direction for the spins, and hence the mean magnetization vanishes.

(c) Modify your program so that an arrow is drawn at each site to show the orientation of each spin. You can use the Vector2DFrame to draw a lattice of arrows. Look at a typical configuration and analyze it visually. Begin with a 32 × 32 lattice with spins pointing in random directions and do a temperature quench to T = 0.5. (Simply change the value of β in the Boltzmann probability.) Such a quench should lock in some long lived but metastable vortices. A vortex is a region of the lattice where the spins rotate by at least 2π as your eye moves around a closed path (see Figure 15.10). To determine the center of a vortex, choose a group of four spins that are at the corners of a unit square and determine whether the spins rotate by ±2π as your eye goes from one spin to the next in a counterclockwise direction around the square. Assume that the difference between the directions of two neighboring spins, δθ, is in the range −π < δθ < π. A total rotation of +2π indicates the existence of a positive vortex, and a change of −2π indicates a negative vortex. Count the number of positive and negative vortices. Repeat these observations for several configurations. What can you say about the number of vortices of each sign?
(d) Write a method to determine the existence of a vortex for each 1 × 1 square of the lattice (a sketch of such a winding test follows this project). Represent the centers of the vortices using a different symbol to distinguish between a positive and a negative vortex. Do a Monte Carlo simulation to compute the mean energy, the specific heat, and the number of vortices in the range from T = 0.5 to T = 1.5 in steps of 0.1. Use the last configuration at the previous temperature as the first configuration for the next temperature. Begin at T = 0.5 with all θi = 0. Draw the vortex locations for the last configuration at each temperature. Use at least 1000 Monte Carlo steps per spin at each temperature to equilibrate and at least 5000 Monte Carlo steps per spin for computing the averages. Use an 8 × 8 or 16 × 16 lattice if your computer resources are limited and larger lattices if you have sufficient resources. Describe the T-dependence of the energy, the specific heat, and the vorticity (equal to the number of vortices per unit area). Plot the logarithm of the vorticity versus T for T < 1.1. What can you conclude about the T-dependence of the vorticity? Explain why this form is reasonable. Describe the vortex configurations. At what temperature do you find a vortex that appears to be free, that is, a vortex that is not obviously paired with another vortex of opposite sign?

(e) The Kosterlitz–Thouless theory predicts that the susceptibility χ diverges above the transition as

χ ∼ A e^{b/ε^ν}, (15.92)

where ε is the reduced temperature ε = (T − TKT)/TKT, ν = 0.5, and A and b are nonuniversal constants. Compute χ from the relation (15.21) with ⟨M⟩ = 0. Assume the exponential form (15.92) for χ in the range T = 1 to T = 1.2 with ν = 0.7 and find the best values of TKT, A, and b. (Although theory predicts ν = 0.5, simulations for small systems indicate that ν = 0.7 gives a better fit.) One way to determine TKT, A, and b is to assume a value of TKT and then do a least squares fit of ln χ to determine A and b. Choose the set of parameters that minimizes the variance of ln χ. How does your estimated value of TKT compare with the temperature at which free vortices first appear? At what temperature does the specific heat have a peak? The Kosterlitz–Thouless theory predicts that the specific heat peak does not occur at TKT. This prediction has been confirmed by simulations (see Tobochnik and Chester). To obtain quantitative results, you will need lattices larger than 32 × 32.
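The winding test described in parts (c) and (d) can be written compactly. The following sketch, with our own names, computes the vorticity of the unit square whose lower-left corner is at site (i, j) by accumulating the wrapped angle differences counterclockwise around the square.

public class VortexCount {
  // Returns +1 for a positive vortex, -1 for a negative vortex, and 0 otherwise.
  static int vorticity(double[][] theta, int i, int j) {
    int L = theta.length;
    double[] corner = {theta[i][j], theta[(i+1)%L][j],
                       theta[(i+1)%L][(j+1)%L], theta[i][(j+1)%L]};  // counterclockwise loop
    double total = 0;
    for (int k = 0; k < 4; k++) {
      double dTheta = corner[(k+1)%4] - corner[k];
      while (dTheta > Math.PI)   dTheta -= 2*Math.PI;   // wrap the difference into (-pi, pi]
      while (dTheta <= -Math.PI) dTheta += 2*Math.PI;
      total += dTheta;
    }
    return (int) Math.round(total/(2*Math.PI));         // net rotation in units of 2*pi
  }
}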
Project 15.38. The classical Heisenberg model in two dimensions

The energy or Hamiltonian of the classical Heisenberg model is similar to those of the Ising model and the planar model, except that the spins can point in any direction in three dimensions. The energy in zero external magnetic field is

E = −J Σ_{i,j=nn(i)} si · sj = −J Σ_{i,j=nn(i)} [si,x sj,x + si,y sj,y + si,z sj,z], (15.93)

where s is a classical vector of unit length. The spins have three components, in contrast to the spins in the Ising model, which have only one component, and the spins in the planar model, which have two components. We will consider the two-dimensional Heisenberg model for which the spins are located on a two-dimensional lattice.

Early simulations and approximate theories led researchers to believe that there was a continuous phase transition, similar to that found in the Ising model. The Heisenberg model received more interest after it was related to quark confinement. Lattice models of the interaction between quarks, called lattice gauge theories, predict that the confinement of quarks could be explained if there are no phase transitions in these models. (The lack of a phase transition in these models implies that the attraction between quarks grows with distance.) The two-dimensional Heisenberg model is an analog of the four-dimensional models used to model quark-quark interactions. Shenker and Tobochnik used a combination of Monte Carlo and renormalization group methods to show that this model does not have a phase transition. Subsequent work on lattice gauge theories showed similar behavior.

(a) Modify your Ising model program to simulate the Heisenberg model in two dimensions. One way to do so is to define three arrays, one for each of the three components of the unit spin vectors. A trial Monte Carlo move consists of randomly changing the direction of a spin si. First compute a small vector ∆s = ∆smax(q1, q2, q3), where −1 ≤ qn ≤ 1 is a uniform random number, and ∆smax is the maximum change of any spin component. If |∆s| > ∆smax, compute another ∆s. This latter step is necessary to ensure that the change in a spin direction is symmetrically distributed around the current spin direction. Then let the trial spin equal si + ∆s normalized to a unit vector. The standard Metropolis algorithm can now be used to determine if the trial spin is accepted. Compute the mean energy, the specific heat, and the susceptibility as a function of T. Choose lattice sizes of L = 8, 16, 32, and larger, if possible, and average over at least 2000 Monte Carlo steps per spin at each temperature. Is there any evidence of a phase transition? Does the susceptibility appear to diverge at a nonzero temperature? Plot the logarithm of the susceptibility versus the inverse temperature and determine the temperature dependence of the susceptibility in the limit of low temperatures.

(b) Use the Lee–Kosterlitz analysis at the specific heat peak to determine if there is a phase transition.

Project 15.39. Domain growth kinetics

When the Ising model is quenched from a high temperature to a very low temperature, domains of the ordered low temperature phase typically grow with time as a power law, R ∼ t^α, where R is a measure of the average linear dimension of the domains. A simple measure of the domain size is the perimeter length of a domain, which can be computed from the energy per spin ε and is given by

R = 2/(2 + ε). (15.94)

Equation (15.94) can be motivated by the following argument. Imagine a region of N spins made up of a domain of up spins with a perimeter size R embedded in a sea of down spins. The total energy of this region is −2N + 2R (in units of J), where for each spin on the perimeter, the energy is increased by 2 because one of the neighbors of a perimeter spin will be of opposite sign. The energy per spin is ε = −2 + 2R/N. Because N is of order R², we arrive at the result given in (15.94), which is implemented in the sketch following this project.

(a) Modify your Ising model program so that the initial configuration is random, that is, a typical high temperature configuration. Write a target class to simulate a quench of the system. The input parameters should be the lattice size, the quench temperature (use 0.5 initially), the maximum time (measured in Monte Carlo steps per spin) for each quench, and the number of Monte Carlo steps between drawing the lattice. Plot ln R versus ln t after each quench is finished, where t is measured from the time of the quench.

(b) Choose L = 64 and a maximum time of 128 mcs. Averages over 10 quenches will give acceptable results. What value do you obtain for α? Repeat for other temperatures and system sizes. Does the exponent change? Run for a longer maximum time to check your results.

(c) Modify your program to simulate the q-state Potts model. Consider various values of q. Do your results change? Results for large q and large system sizes are given in Grest et al.

(d)∗ Modify your program to simulate a three-dimensional system. How should you modify (15.94)? Are your results similar?
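Relation (15.94) turns directly into a one-line helper for measuring the domain size from the mean energy per spin; the method name is ours.

public class DomainSize {
  // epsilon is the mean energy per spin in units of J (between -2 and 0 after a quench).
  static double size(double epsilon) {
    return 2.0/(2.0 + epsilon);               // R = 2/(2 + epsilon), relation (15.94)
  }
}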
Project 15.40. Ground state energy of the Ising spin glass

A spin glass is a magnetic system with frozen-in disorder. An example of such a system is the Ising model with the exchange constant Jij between nearest neighbor spins randomly chosen to be ±1. The disorder is said to be "frozen-in" because the set of interactions {Jij} does not change with time. Because the spins cannot arrange themselves so that every pair of spins is in its lowest energy state, the system exhibits frustration, similar to the antiferromagnetic Ising model on a triangular lattice (see Problem 15.22).

Is there a phase transition in the spin glass model, and if so, what is its nature? The answers to these questions are very difficult to obtain by doing simulations. One of the difficulties is that we need to average not only over the possible configurations of spins for a given set of {Jij}, but also over different realizations of the interactions. Another difficulty is that there are many local minima in the energy (free energy at finite temperature) as a function of the configurations of spins, and it is very difficult to find the global minimum. As a result, Monte Carlo simulations typically become stuck in these local minima or metastable states. Detailed finite-size scaling analyses of simulations indicate that there might be a transition in three dimensions. It is generally accepted that the transition in two dimensions is at zero temperature. In the following, we will look at some of the properties of an Ising spin glass on a square lattice at low temperatures.

(a) Write a program to apply simulated annealing to an Ising spin glass using the Metropolis algorithm with the temperature fixed at each stage of the annealing schedule (see Problem 15.31a). Search for the lowest energy configuration for a fixed set of {Jij}. Use at least one other annealing schedule for the same {Jij} and compare your results. Then find the ground state energy for at least ten other sets of {Jij}. Use lattice sizes of L = 5 and L = 10. Discuss the nature of the ground states you are able to find. Is there much variation in the ground state energy E0 from one set of {Jij} to another? Theoretical calculations give an average over realizations of E0/N ≈ −1.4. If you have sufficient computer resources, repeat your computations for the three-dimensional spin glass.

(b) Modify your program to do simulated annealing using the demon algorithm (see Problem 15.31b). How do your results compare to those that you found in part (a)?

Project 15.41. Zero temperature dynamics of the Ising model

We have seen that various kinetic growth models (Section 13.3) and reaction-diffusion models (Section 7.8) lead to interesting and nontrivial behavior. Similar behavior can be seen in the zero temperature dynamics of the Ising model. Consider the one-dimensional Ising model with J > 0 and periodic boundary conditions. The initial orientation of the spins is chosen at random. We update the configurations by choosing a spin at random and computing the change in energy ∆E. If ∆E < 0, then flip the spin; else if ∆E = 0, flip the spin with 50% probability. The spin is not flipped if ∆E > 0. This type of Monte Carlo update is known as Glauber dynamics. How does this algorithm differ from the Metropolis algorithm at T = 0?
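One zero-temperature Glauber update of the chain can be written as follows; the method and array names are ours, and we take J = 1.

import java.util.Random;

public class GlauberStep {
  static void update(int[] s, Random rng) {
    int N = s.length;
    int i = rng.nextInt(N);
    int sum = s[(i + 1) % N] + s[(i - 1 + N) % N];   // periodic boundary conditions
    int dE = 2*s[i]*sum;                             // energy change if s[i] is flipped (J = 1)
    if (dE < 0) s[i] = -s[i];                        // downhill moves are always accepted
    else if (dE == 0 && rng.nextDouble() < 0.5) s[i] = -s[i];  // flat moves half the time
    // dE > 0: the spin is never flipped at T = 0
  }
}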
(a) A quantity of interest is f(t), the fraction of spins that have not yet flipped at time t. As usual, the time is measured in terms of Monte Carlo steps per spin. Published results (Derrida et al.) for N = 10^5 indicate that f(t) behaves as

f(t) ∼ t^{−θ} (15.95)

for t ≈ 3 to t ≈ 10,000. The exact value of θ is 0.375. Verify this result and extend your results to the one-dimensional q-state Potts model. In the latter model each site is initially given a random integer between 1 and q. A site is chosen at random and set equal to either of its two neighbors with equal probability.

(b) Another interesting quantity is the probability distribution Pn(t) that n sites have not yet flipped as a function of the time t (see Das and Sen). Plot Pn versus n for two times on the same graph. Discuss the shape of the curves and their differences. Choose L ≥ 100 and t = 50 and 100. Try to fit the curves to a Gaussian distribution. Because the possible values of n are bounded, fit each side of the maximum of Pn to a Gaussian with a different width. There are a number of scaling properties that can be investigated. Show that Pn=0(t) scales approximately as t/L². Thus, if you compute Pn=0(t) for a number of different times and lengths such that t/L² has the same value, you should obtain the same value of Pn=0.

Project 15.42. The inverse power law potential

Consider the inverse power law potential

V(r) = V0 (σ/r)^n, (15.96)

with V0 > 0. One reason for the interest in potentials of this form is that thermodynamic quantities such as the mean energy E do not depend on V0 and σ separately but depend on a single dimensionless parameter, which is defined as (see Project 8.25)

Γ = (V0/kT)(σ/a)^n, (15.97)

where a is defined in three and two dimensions by 4πa³ρ/3 = 1 and πa²ρ = 1, respectively. The length a is proportional to the mean distance between particles. A Coulomb interaction corresponds to n = 1, and a hard sphere system corresponds to n → ∞. What phases do you expect to occur for arbitrary n?

(a) Compare the qualitative features of g(r) for a "soft" potential with n = 4 to those of a system of hard disks at the same density.

(b) Let n = 12 and compute the mean energy E as a function of Γ for a three-dimensional system with N = 16, 32, 64, and 128. Does E depend on N? Can you extrapolate your results for the N-dependence of E to N → ∞? Do you see any evidence of a fluid-solid phase transition? If so, estimate the value of Γ at which it occurs. What is the nature of the transition if it exists? What is the symmetry of the ground state?

(c) Let n = 4 and determine the symmetry of the ground state. For this value of n, there is a solid-to-solid phase transition at which the solid changes symmetry. To determine the value of Γ at which this phase transition exists and the symmetry of the smaller-Γ solid phase (see Dubin and Dewitt), it is necessary to use a Monte Carlo method in which the shape of the simulation cell changes to accommodate the different symmetry (the Rahman–Parrinello method), an interesting project. An alternative is to prepare a bcc lattice at Γ ≈ 105 (for example, T = 0.06 and ρ = 0.95). Then instantaneously change the potential from n = 4 to n = 12; the new value of Γ is ≈ 4180, and the new stable phase is fcc. The transition can be observed by watching the evolution of g(r).

Project 15.43. Rare gas clusters

There has been much recent interest in structures that contain many particles but that are not macroscopic. An example is the unusual structure of sixty carbon atoms known as a "buckyball." A less unusual structure is a cluster of argon atoms.
Questions of interest include the structure of the clusters, the existence of "magic" numbers of particles for which the cluster is particularly stable, the temperature dependence of various quantities, and the possibility of different phases. This latter question has been subject to some controversy because transitions between different kinds of behavior in finite systems are not well defined, as they are for infinite systems.

(a) Write a Monte Carlo program to simulate a three-dimensional system of particles interacting via the Lennard–Jones potential. Use open boundary conditions; that is, do not enclose the system in a box. The number of particles N and the temperature T should be input parameters.

(b) Find the ground state energy E0 as a function of N. For each value of N begin with a random initial configuration and accept any trial displacement that lowers the energy (a sketch of such a zero-temperature move follows this project). Repeat for at least ten different initial configurations. Plot E0/N versus N for N = 2 to 20 and describe the qualitative dependence of E0/N on N. Is there any evidence of magic numbers, that is, value(s) of N for which E0/N is a minimum? For each value of N save the final configuration. Plot the positions of the atoms. Does the cluster look like a part of a crystalline solid?

(c) Repeat part (b) using simulated annealing. The initial temperature should be sufficiently low so that the particles do not move far away from each other. Slowly lower the temperature according to some annealing schedule. Are your results for E0/N lower than those you obtained in part (b)?

(d) To gain more insight into the structure of the clusters, compute the mean number of neighbors per particle for each value of N. What is a reasonable criterion for two particles to be neighbors? Also compute the mean distance between each pair of particles. Plot both quantities as a function of N and compare their dependence on N with your plot of E0/N.

(e) Do you find any evidence for a "melting" transition? Begin with the configuration that has the minimum value of E0/N and slowly increase the temperature T. Compute the energy per particle and the mean square displacement of the particles from their initial positions. Plot your results for these quantities versus T.
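For parts (a) and (b), the following sketch shows the total energy of an open (unboxed) cluster and a zero-temperature trial move that accepts only downhill displacements; the names are ours, and the energy is recomputed in full for clarity rather than efficiency.

import java.util.Random;

public class LJCluster {
  // Total Lennard-Jones energy with sigma = epsilon = 1 and no periodic boundaries.
  static double energy(double[] x, double[] y, double[] z) {
    double e = 0;
    for (int i = 0; i < x.length; i++)
      for (int j = i + 1; j < x.length; j++) {
        double r2 = sq(x[i]-x[j]) + sq(y[i]-y[j]) + sq(z[i]-z[j]);
        double r6 = 1.0/(r2*r2*r2);           // (sigma/r)^6
        e += 4*(r6*r6 - r6);                  // Lennard-Jones pair energy
      }
    return e;
  }
  static double sq(double u) { return u*u; }

  // Accept a single-particle displacement only if it lowers the energy (zero-temperature quench).
  static void trialMove(double[] x, double[] y, double[] z, double step, Random rng) {
    int i = rng.nextInt(x.length);
    double oldE = energy(x, y, z);
    double ox = x[i], oy = y[i], oz = z[i];
    x[i] += step*(2*rng.nextDouble() - 1);
    y[i] += step*(2*rng.nextDouble() - 1);
    z[i] += step*(2*rng.nextDouble() - 1);
    if (energy(x, y, z) > oldE) { x[i] = ox; y[i] = oy; z[i] = oz; }  // reject uphill moves
  }
}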
Project 15.44. The hard disks fluid-solid transition

Although we have mentioned (see Section 15.10) that there is much evidence for a fluid-solid transition in a hard disk system, the nature of the transition is still a problem of current research. In this project we follow the work of Lee and Strandburg and apply the constant pressure Monte Carlo method (see Section 15.12) and the Lee–Kosterlitz method (see Section 15.11) to investigate the nature of the transition. Consider N = L² hard disks of diameter σ = 1 in a two-dimensional box of volume V = √3 L²v/2 with periodic boundary conditions. The quantity v ≥ 1 is the reduced volume and is related to the density ρ by ρ = N/V = 2/(√3 v); v = 1 corresponds to maximum packing. The aspect ratio of 2/√3 is used to match the perfect triangular lattice. Do a constant pressure (actually constant p* = P/kT) Monte Carlo simulation. The trial displacement of each disk is implemented as discussed in Section 15.10. Lee and Strandburg find that a maximum displacement of 0.09 gives a 45% acceptance probability. The other type of move is a random isotropic change of the volume of the system. If the change of the volume leads to an overlap of the disks, the change is rejected. Otherwise, if the trial volume Ṽ is less than the current volume V, the change is accepted. A larger trial volume is accepted with probability

e^{−p*(Ṽ − V) + N ln(Ṽ/V)}. (15.98)

(A sketch of this acceptance test follows this project.) Volume changes are attempted 40–200 times for each set of individual disk moves. The quantity of interest is N(v), the distribution of the reduced volume v. Because we need to store information about N(v) in an array, it is convenient to discretize the volume in advance and choose the mesh size so that the acceptance probability for changing the volume by one unit is 40–50%. Do a Monte Carlo simulation of the hard disk system for L = 10 (N = 100) and p* = 7.30. Published results are for 10^7 Monte Carlo steps. To apply the Lee–Kosterlitz method, smooth ln N(v) by fitting it to an eighth-order polynomial. Then extrapolate ln N(v) using the histogram method to determine p*_c(L = 10), the pressure at which the two peaks of N(v) are of equal height. What is the value of the free energy barrier ∆F? If sufficient computer resources are available, compute ∆F for larger L (published results are for L = 10, 12, 14, 16, and 20) and determine if ∆F depends on L. Can you reach any conclusions about the nature of the transition?
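The volume-change acceptance test (15.98) might be implemented as in the following sketch; the method and variable names are ours, and the overlap check is assumed to have been done separately.

import java.util.Random;

public class VolumeMove {
  // Returns true if a trial volume vTrial should replace the current volume v
  // at reduced pressure pStar = P/kT for N disks, assuming no overlaps were found.
  static boolean acceptVolume(double v, double vTrial, double pStar, int N, Random rng) {
    if (vTrial <= v) return true;             // a smaller trial volume is always accepted
    double arg = -pStar*(vTrial - v) + N*Math.log(vTrial/v);
    return rng.nextDouble() < Math.exp(arg);  // expansion accepted with probability (15.98)
  }
}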
Project 15.45. Vacancy mediated dynamics in binary alloys

When a binary alloy is rapidly quenched from a high temperature to a low temperature unstable state, a pattern of domain formation called spinodal decomposition takes place as the two metals in the alloy separate. This process is of much interest experimentally. Lifshitz and Slyozov have predicted that at long times, the linear domain size increases with time as R ∼ t^{1/3}. This result is independent of the dimension for d ≥ 2 and has been verified experimentally and in computer simulations. The behavior is modified for binary fluids due to hydrodynamic effects.

Most of the computer simulations of this growth process have been based on the Ising model with spin exchange dynamics. In this model there is an A or B atom (spin up or spin down) at each site, where A and B represent different metals. The energy of interaction between atoms on two neighboring sites is −J if the two atoms are the same type and +J if they are different. Monte Carlo moves are made by exchanging unlike atoms. (The numbers of A and B atoms must be conserved.) A typical simulation begins with an equilibrated system at high temperature. Then the temperature is changed instantaneously to a low temperature below the critical temperature Tc. If there are equal numbers of A and B atoms on the lattice, then spinodal decomposition occurs. If you watch a visualization of the evolution of the system, you will see wavy-like domains of each type of atom thickening with time.

The growth of the domains is very slow if we use spin exchange dynamics. We will see that if simulations are performed with vacancy mediated dynamics, the scaling behavior begins at much earlier times. Because of the large energy barriers that prevent real metallic atoms from exchanging position, it is likely that spinodal decomposition in real alloys also occurs with vacancy mediated dynamics. We can do a realistic simulation by including just one vacancy because the number of vacancies in a real alloy is also very small. In this case the only possible Monte Carlo move on a square lattice is to exchange the vacancy with one of its four neighboring atoms. To implement this algorithm, you will need an array to keep track of which type of atom is at each lattice site and variables to keep track of the location of the single vacancy. The simulation will run very fast because there is little bookkeeping and all the possible trial moves are potentially good ones. In contrast, in standard spin exchange dynamics, it is necessary either to waste computer time checking for unlike nearest neighbor atoms or to keep track of where they are.

The major quantity of interest is the growth of the domain size R. One way to determine R is to measure the pair correlation function C(r) = ⟨si sj⟩, where r = |ri − rj|, and si = 1 for an A atom and si = −1 for a B atom. The first zero in C(r) is a measure of the domain size. An alternative measure of the domain size is the quantity R = 2/(⟨E⟩/N + 2), where ⟨E⟩/N is the average energy per site and N is the number of sites (see Project 15.39). The quantity R is a rough measure of the length of the perimeter of a domain and is proportional to the domain size.

(a) Write a program to simulate vacancy mediated dynamics. The initial state consists of the random placement of A and B atoms (half of the sites have A atoms and half B atoms); one vacancy replaces one of the atoms. Explain why this configuration corresponds to infinite temperature. Choose a square lattice with L ≥ 50.

(b) Instantaneously quench the system by running the Metropolis algorithm at a temperature of T = Tc/2 ≈ 1.13. You should first look at the lattice after every attempted move of the vacancy to see the effect of vacancy dynamics. After you are satisfied that your program is working correctly and that you understand the algorithm, speed up the simulation by collecting data and showing the lattice only at times equal to t = 2^n, where n = 1, 2, 3 .... Measure the domain size, using either the energy or C(r), as a function of time averaged over many different initial configurations.

(c) At what time does the log R versus log t plot become linear? Do both measures of the domain size give the same results? Does the behavior change for different quench temperatures? Try 0.2Tc and 0.7Tc. A log-log plot of the domain size versus time should give the exponent 1/3.

(d) Repeat the measurements in three dimensions. Do you obtain the same exponent?

Project 15.46. Heat flow using the demon algorithm

In our applications of the demon algorithm one demon shared its energy equally with all the spins. As a result the spins all attained the same mean energy of interaction. Many interesting questions arise when the system is not spatially uniform and is in a nonequilibrium but time-independent (steady) state. Let us consider heat flow in a one-dimensional Ising model. Suppose that instead of all the sites sharing energy with one demon, each site has its own demon. We can study the flow of heat by requiring the demons at the boundary spins to satisfy different conditions than the demons at the other spins. The demon at spin 0 adds energy to the system by flipping this spin so that it is in its highest energy state, that is, in the opposite direction of spin 1. The demon at spin N − 1 removes energy from the system by flipping spin N − 1 so that it is in its lowest energy state, that is, in the same direction as spin N − 2. As a result, energy flows from site 0 to site N − 1 via the demons associated with the intermediate sites. In order that energy not build up at the "hot" end of the Ising chain, we require that spin 0 can add energy to the system only if spin N − 1 simultaneously removes energy from the system.
Because the demons at the two ends of the lattice satisfy different conditions than the other demons, we do not use periodic boundary conditions. The temperature is determined by the generalization of the relation (15.10); that is, the temperature at site i is related to the mean energy of the demon at site i. To control the temperature gradient, we can update the end spins at a different rate than the other spins. The maximum temperature gradient occurs if we update the end spins after every update of an internal spin. A smaller temperature gradient occurs if we update the end spins less frequently. The temperature gradient between any two spins can be determined from the temperature profile, the spatial dependence of the temperature. The energy flow can be determined by computing the magnitude of the energy per unit time that enters the lattice at site 0.

To implement this procedure we modify IsingDemon by converting the variables demonEnergy and demonEnergyAccumulator to arrays. We do the usual updating procedure for spins 1 through N − 2 and visit spins 0 and N − 1 at regular intervals denoted by timeToAddEnergy. The class ManyDemons can be downloaded from the ch15 directory.

(a) Write a target class that inputs the number of spins N and the initial energy of the system, outputs the number of Monte Carlo steps per spin and the energy added to the system at the high temperature boundary, and plots the temperature as a function of position.

(b) As a check on ManyDemons, modify the class so that all the demons are equivalent; that is, impose periodic boundary conditions and do not use method boundarySpins. Compute the mean energy of the demon at each site and use (15.10) to define a local site temperature. Use N ≥ 52 and run for about 10,000 mcs. Is the local temperature approximately uniform? How do your results compare with the single demon case?

(c) In ManyDemons the energy is added to the system at site 0 and is removed at site N − 1. Determine the mean demon energy for each site and obtain the corresponding local temperature and the mean energy of the system. Draw the temperature profile by plotting the temperature as a function of site number. The temperature gradient is the difference in temperature from site N − 2 to site 1 divided by the distance between them. (The distance between neighboring sites is unity.) Because of local temperature fluctuations and edge effects, the temperature gradient should be estimated by fitting the temperature profile in the middle of the lattice to a straight line. Reasonable choices for the parameters are N = 52 and timeToAddEnergy = 1. Run for at least 10,000 mcs.

(d) The heat flux Q is the energy flow per unit length per unit time. The energy flow is the amount of energy that demon 0 adds to the system at site 0. The time is conveniently measured in terms of Monte Carlo steps per spin. Determine Q for the parameters used in part (c).

(e) If the temperature gradient ∂T/∂x is not too large, the heat flux Q is proportional to ∂T/∂x. We can determine the thermal conductivity κ by the relation

Q = −κ ∂T/∂x. (15.99)

Use your results for ∂T/∂x and Q to estimate κ.

(f) Determine Q, the temperature profile, and the mean temperature for different values of timeToAddEnergy. Is the temperature profile linear for all values of timeToAddEnergy? If the temperature profile is linear, estimate ∂T/∂x and determine κ. Does κ depend on the mean temperature?
Note that by using many demons we were able to compute a temperature profile by using an algorithm that manipulates only integer numbers. The conventional approach is to solve a heat equation similar in form to the diffusion equation. Now we use the same idea to compute the magnetization profile when the end spins of the lattice are fixed.

(g) Modify ManyDemons so that it does not call the method boundarySpins. Also, constrain spins 0 and N − 1 to be +1 and −1, respectively. Estimate the magnetization profile by plotting the mean value of the spin at each site versus the site number. Choose N = 22 and mcs ≥ 1000. How do your results vary as you increase N?

(h) Compute the mean demon energy and, hence, the local temperature at each site. Does the system have a uniform temperature even though the magnetization is not uniform? Is the system in thermal equilibrium?

(i) The effect of the constraint on the end spins is easier to observe in two and three dimensions than in one dimension. Write a program for a two-dimensional Ising model on an L × L square lattice. Constrain the spins at site (i, j) to be +1 and −1 for i = 0 and i = L − 1, respectively. Use periodic boundary conditions in the y direction. How do your results compare with the one-dimensional case?

(j) Remove the periodic boundary condition in the y direction and constrain all the boundary spins from i = 0 to (L/2) − 1 to be +1 and the other boundary spins to be −1. Choose an initial configuration where all the spins on the left half of the system are +1 and the others are −1. Do the simulation and draw a configuration of the spins once the system has reached equilibrium. Draw a line between each pair of spins of opposite sign. Describe the curve separating the +1 spins from the −1 spins. Begin with L = 20 and determine what happens as L is increased.

Appendix 15A: Relation of the Mean Demon Energy to the Temperature

We know that the energy of the demon Ed is constrained to be nonnegative and that the probability for the demon to have energy Ed is proportional to $e^{-E_d/kT}$. Hence, in general, $\langle E_d \rangle$ is given by

\[ \langle E_d \rangle = \frac{\sum_{E_d} E_d\,e^{-E_d/kT}}{\sum_{E_d} e^{-E_d/kT}}, \tag{15.100} \]

where the summations in (15.100) are over the possible values of Ed. If an Ising spin is flipped in zero magnetic field, the minimum nonzero decrease in energy of the system is 4J (see Figure 15.11). Hence, the possible energies of the demon are 0, 4J, 8J, 12J, .... We write x = 4J/kT and perform the summations in (15.100). The result is

\[ \langle E_d \rangle/kT = \frac{0 + x e^{-x} + 2x e^{-2x} + \cdots}{1 + e^{-x} + e^{-2x} + \cdots} = \frac{x}{e^x - 1}. \tag{15.101} \]

The form (15.10) can be obtained by solving (15.101) for T in terms of $\langle E_d \rangle$. Convince yourself that the relation (15.101) is independent of dimension for lattices with an even number of nearest neighbors.

Figure 15.11: The five possible transitions of the Ising model on the square lattice with spin flip dynamics, with ∆E = −8J, −4J, 0, 4J, and 8J.

If the magnetic field is nonzero, the possible values of the demon energy are 0, 2H, 4J − 2H, 4J + 2H, .... If J is a multiple of H, then the result is the same as before with 4J replaced by 2H, because the possible energy values for the demon are multiples of 2H. If the ratio 4J/2H is irrational, then the demon can take on a continuum of values, and thus $\langle E_d \rangle = kT$. The other possibility is that 4J/2H = m/n, where m and n are positive integers with no common factors (other than 1). In this case it can be shown that (see Mak)

\[ kT/J = \frac{4/m}{\ln\!\left(1 + 4J/(m\langle E_d \rangle)\right)}. \tag{15.102} \]

Surprisingly, (15.102) does not depend on n. Test these relations for H ≠ 0 by choosing values of J and H and computing the sums in (15.100) directly.
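The suggested test is easy to carry out. The following minimal sketch evaluates the sums in (15.100) directly for H = 0 and compares the result with the closed form (15.101); extending the loop over the energies 0, 2H, 4J − 2H, 4J + 2H, ... tests the H ≠ 0 relations in the same way (k = 1 here):

public class DemonEnergyCheck {
  public static void main(String[] args) {
    double J = 1.0, T = 2.0;
    double num = 0, den = 0;
    for(int n = 0; n<1000; n++) {      // demon energies 0, 4J, 8J, ...
      double E = 4.0*J*n;
      double w = Math.exp(-E/T);
      num += E*w;
      den += w;
    }
    double direct = num/den;           // the sums in (15.100)
    double x = 4.0*J/T;
    double closedForm = T*x/(Math.exp(x)-1.0); // (15.101)
    System.out.println("direct = "+direct+", closed form = "+closedForm);
  }
}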
Appendix 15B: Fluctuations in the Canonical Ensemble

We first obtain the relation of the constant volume heat capacity CV to the energy fluctuations in the canonical ensemble. We write CV as

\[ C_V = \frac{\partial \langle E \rangle}{\partial T} = -\frac{1}{kT^2}\frac{\partial \langle E \rangle}{\partial \beta}. \tag{15.103} \]

From (15.11) we have

\[ \langle E \rangle = -\frac{\partial}{\partial \beta}\ln Z, \tag{15.104} \]

and

\[ \frac{\partial \langle E \rangle}{\partial \beta} = -\frac{1}{Z^2}\frac{\partial Z}{\partial \beta}\sum_s E_s e^{-\beta E_s} - \frac{1}{Z}\sum_s E_s^2 e^{-\beta E_s} \tag{15.105} \]
\[ = \langle E \rangle^2 - \langle E^2 \rangle. \tag{15.106} \]

The relation (15.19) follows from (15.103) and (15.106). Note that the heat capacity is at constant volume because the partial derivatives were performed with the energy levels Es kept constant. The corresponding quantity for a magnetic system is the heat capacity at constant external magnetic field.

The relation of the magnetic susceptibility χ to the fluctuations of the magnetization M can be obtained in a similar way. We assume that the energy can be written as

\[ E_s = E_{0,s} - H M_s, \tag{15.107} \]

where E0,s is the energy of interaction of the spins in the absence of a magnetic field, H is the external applied field, and Ms is the magnetization in the s state. The mean magnetization is given by

\[ \langle M \rangle = \frac{1}{Z}\sum_s M_s e^{-\beta E_s}. \tag{15.108} \]

Because ∂Es/∂H = −Ms, we have

\[ \frac{\partial Z}{\partial H} = \sum_s \beta M_s e^{-\beta E_s}. \tag{15.109} \]

Hence, we obtain

\[ \langle M \rangle = \frac{1}{\beta}\frac{\partial}{\partial H}\ln Z. \tag{15.110} \]

If we use (15.108) and (15.110), we find

\[ \frac{\partial \langle M \rangle}{\partial H} = -\frac{1}{Z^2}\frac{\partial Z}{\partial H}\sum_s M_s e^{-\beta E_s} + \frac{1}{Z}\sum_s \beta M_s^2 e^{-\beta E_s} \tag{15.111} \]
\[ = -\beta\langle M \rangle^2 + \beta\langle M^2 \rangle. \tag{15.112} \]

The relation (15.21) for the zero-field susceptibility follows from (15.112) and the definition (15.20).

Appendix 15C: Exact Enumeration of the 2 × 2 Ising Model

Because the number of possible states or configurations of the Ising model increases as $2^N$, we can enumerate the possible configurations only for small N. As an example, we calculate the various quantities of interest for a 2 × 2 Ising model on the square lattice with periodic boundary conditions. In Table 15.2 we group the sixteen states according to their total energy and magnetization.

  # Spins Up   g(E,M)   Energy   Magnetization
      4           1       -8           4
      3           4        0           2
      2           4        0           0
      2           2        8           0
      1           4        0          -2
      0           1       -8          -4

Table 15.2: The energy and magnetization of the $2^4$ states of the zero-field Ising model on the 2 × 2 square lattice. The energy is in units of J, and g(E,M) is the number of microstates with the same energy and magnetization.

We can compute all the quantities of interest using Table 15.2. The partition function is given by

\[ Z = 2e^{8\beta J} + 12 + 2e^{-8\beta J}. \tag{15.113} \]

If we use (15.104) and (15.113), we find

\[ \langle E \rangle = -\frac{\partial}{\partial \beta}\ln Z = -\frac{1}{Z}\left[2(8)e^{8\beta J} + 2(-8)e^{-8\beta J}\right]. \tag{15.114} \]

Because the other quantities of interest can be found in a similar manner, we only give the results:

\[ \langle E^2 \rangle = \frac{1}{Z}\left[(2\times 64)e^{8\beta J} + (2\times 64)e^{-8\beta J}\right] \tag{15.115} \]
\[ \langle M \rangle = \frac{1}{Z}(0) = 0 \tag{15.116} \]
\[ \langle |M| \rangle = \frac{1}{Z}\left[(2\times 4)e^{8\beta J} + 8\times 2\right] \tag{15.117} \]
\[ \langle M^2 \rangle = \frac{1}{Z}\left[(2\times 16)e^{8\beta J} + 8\times 4\right]. \tag{15.118} \]

The dependence of C and χ on βJ can be found by using (15.114) and (15.115), and (15.116) and (15.118), respectively.
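The entries in Table 15.2 and the thermodynamic averages can be checked by direct enumeration. The following sketch (with J = k = 1) sums over the $2^4$ configurations, computes Z and ⟨E⟩, and obtains the heat capacity from the fluctuation relation (15.19):

public class TwoByTwoIsing {
  public static void main(String[] args) {
    double beta = 0.4;
    double Z = 0, Esum = 0, E2sum = 0;
    for(int config = 0; config<16; config++) {      // the 2^4 states
      int[] s = new int[4];                          // sites 0,1 in row 0 and 2,3 in row 1
      for(int i = 0; i<4; i++) {
        s[i] = (((config>>i)&1)==1) ? 1 : -1;
      }
      // with periodic boundary conditions every bond on the 2x2 lattice is doubled
      int E = -2*(s[0]*s[1]+s[2]*s[3]+s[0]*s[2]+s[1]*s[3]);
      double w = Math.exp(-beta*E);
      Z += w;
      Esum += E*w;
      E2sum += (double) E*E*w;
    }
    double meanE = Esum/Z, meanE2 = E2sum/Z;
    double C = beta*beta*(meanE2-meanE*meanE);      // heat capacity from (15.19)
    System.out.println("Z = "+Z+"  (compare with (15.113): "
      +(2*Math.exp(8*beta)+12+2*Math.exp(-8*beta))+")");
    System.out.println("<E> = "+meanE+", C = "+C);
  }
}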
References and Suggestions for Further Reading

M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon Press, 1987). See Chapter 4 for a discussion of Monte Carlo methods.

Paul D. Beale, “Exact distribution of energies in the two-dimensional Ising model,” Phys. Rev. Lett. 76, 78 (1996). The author discusses a Mathematica program that can compute the exact density of states for the two-dimensional Ising model.

K. Binder, editor, Monte Carlo Methods in Statistical Physics, 2nd ed. (Springer–Verlag, 1986). Also see K. Binder, editor, Applications of the Monte Carlo Method in Statistical Physics (Springer–Verlag, 1984) and K. Binder, editor, The Monte Carlo Method in Condensed Matter Physics (Springer–Verlag, 1992). The latter book discusses the Binder cumulant method in the introductory chapter.

Marvin Bishop and C. Bruin, “The pair correlation function: A probe of molecular order,” Am. J. Phys. 52, 1106–1108 (1984). The authors compute the pair correlation function for a two-dimensional Lennard–Jones model.

A. B. Bortz, M. H. Kalos, and J. L. Lebowitz, “A new algorithm for Monte Carlo simulation of Ising spin systems,” J. Comput. Phys. 17, 10–18 (1975). This paper first introduced the n-fold way algorithm, which was rediscovered independently by many workers in the 1970s and 80s.

S. G. Brush, “History of the Lenz–Ising model,” Rev. Mod. Phys. 39, 883–893 (1967).

James B. Cole, “The statistical mechanics of image recovery and pattern recognition,” Am. J. Phys. 59, 839–842 (1991). A discussion of the application of simulated annealing to the recovery of images from noisy data.

R. Cordery, S. Sarker, and J. Tobochnik, “Physics of the dynamical critical exponent in one dimension,” Phys. Rev. B 24, 5402–5403 (1981).

Michael Creutz, “Microcanonical Monte Carlo simulation,” Phys. Rev. Lett. 50, 1411 (1983). See also Gyan Bhanot, Michael Creutz, and Herbert Neuberger, “Microcanonical simulation of Ising systems,” Nuc. Phys. B 235, 417–434 (1984).

Pratap Kumar Das and Parongama Sen, “Probability distributions of persistent spins in an Ising chain,” J. Phys. A 37, 7179–7184 (2004).

B. Derrida, A. J. Bray, and C. Godrèche, “Non-trivial exponents in the zero temperature dynamics of the 1D Ising and Potts models,” J. Phys. A 27, L357–L361 (1994); B. Derrida, V. Hakim, and V. Pasquier, “Exact first passage exponents in 1d domain growth: Relation to a reaction-diffusion model,” Phys. Rev. Lett. 75, 751 (1995).

Daniel H. E. Dubin and Hugh Dewitt, “Polymorphic phase transition for inverse-power-potential crystals keeping the first-order anharmonic correction to the free energy,” Phys. Rev. B 49, 3043–3048 (1994).

Jerome J. Erpenbeck and Marshall Luban, “Equation of state for the classical hard-disk fluid,” Phys. Rev. A 32, 2920–2922 (1985). These workers use a combined molecular dynamics/Monte Carlo method and consider 1512 and 5822 disks.

Alan M. Ferrenberg, D. P. Landau, and Y. Joanna Wong, “Monte Carlo simulations: Hidden errors from ‘good’ random number generators,” Phys. Rev. Lett. 69, 3382 (1992).

Alan M. Ferrenberg and Robert H. Swendsen, “New Monte Carlo technique for studying phase transitions,” Phys. Rev. Lett. 61, 2635 (1988); “Optimized Monte Carlo data analysis,” Phys. Rev. Lett. 63, 1195 (1989); “Optimized Monte Carlo data analysis,” Computers in Physics 3 (5), 101 (1989). The second and third papers discuss using the multiple histogram method with data from simulations at more than one temperature.

P. Fratzl and O. Penrose, “Kinetics of spinodal decomposition in the Ising model with vacancy diffusion,” Phys. Rev. B 50, 3477–3480 (1994).

Daan Frenkel and Berend Smit, Understanding Molecular Simulation, 2nd ed. (Academic Press, 2002).

Harvey Gould and W. Klein, “Spinodal effects in systems with long-range interactions,” Physica D 66, 61–70 (1993). This paper discusses nucleation in the Ising model and Lennard–Jones systems.
Harvey Gould and Jan Tobochnik, “Overcoming critical slowing down,” Computers in Physics 3 (4), 82 (1989).

James E. Gubernatis, The Monte Carlo Method in the Physical Sciences (AIP Press, 2004). June 2003 was the 50th anniversary of the Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller publication of what is now called the Metropolis algorithm. This algorithm established the Monte Carlo method in physics and other fields and led to the development of other Monte Carlo algorithms. Six of the papers in the proceedings of the conference give historical perspectives.

Hong Guo, Martin Zuckermann, R. Harris, and Martin Grant, “A fast algorithm for simulated annealing,” Physica Scripta T38, 40–44 (1991).

Gary S. Grest, Michael P. Anderson, and David J. Srolovitz, “Domain-growth kinetics for the Q-state Potts model in two and three dimensions,” Phys. Rev. B 38, 4752–4760 (1988).

R. Harris, “Demons at work,” Computers in Physics 4 (3), 314 (1990).

S. Istrail, “Statistical mechanics, three-dimensionality and NP-completeness: I. Universality of intractability of the partition functions of the Ising model across non-planar lattices,” Proceedings of the 32nd ACM Symposium on the Theory of Computing, ACM Press, pp. 87–96, Portland, Oregon, May 21–23, 2000. This paper shows that it is impossible to obtain an analytic solution for the three-dimensional Ising model.

J. Kertész, J. Cserti, and J. Szép, “Monte Carlo simulation programs for microcomputer,” Eur. J. Phys. 6, 232–237 (1985).

S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science 220, 671–680 (1983). See also S. Kirkpatrick and G. Toulouse, “Configuration space analysis of traveling salesman problems,” J. Physique 46, 1277–1292 (1985).

J. M. Kosterlitz and D. J. Thouless, “Ordering, metastability and phase transitions in two-dimensional systems,” J. Phys. C 6, 1181–1203 (1973); J. M. Kosterlitz, “The critical properties of the two-dimensional xy model,” J. Phys. C 7, 1046–1060 (1974).

D. P. Landau, Shan-Ho Tsai, and M. Exler, “A new approach to Monte Carlo simulations in statistical physics: Wang–Landau sampling,” Am. J. Phys. 72, 1294–1302 (2004).

D. P. Landau, “Finite-size behavior of the Ising square lattice,” Phys. Rev. B 13, 2997–3011 (1976). A clearly written paper on a finite-size scaling analysis of Monte Carlo data. See also D. P. Landau, “Finite-size behavior of the simple-cubic Ising lattice,” Phys. Rev. B 14, 255–262 (1976).

D. P. Landau and R. Alben, “Monte Carlo calculations as an aid in teaching statistical mechanics,” Am. J. Phys. 41, 394–400 (1973).

David Landau and Kurt Binder, A Guide to Monte Carlo Simulations in Statistical Physics, 2nd ed. (Cambridge University Press, 2005).

Jooyoung Lee and J. M. Kosterlitz, “New numerical method to study phase transitions,” Phys. Rev. Lett. 65, 137 (1990); ibid., “Finite-size scaling and Monte Carlo simulations of first-order phase transitions,” Phys. Rev. B 43, 3265–3277 (1991).

Jooyoung Lee and Katherine J. Strandburg, “First-order melting transition of the hard-disk system,” Phys. Rev. B 46, 11190–11193 (1992).

Jiwen Liu and Erik Luijten, “Rejection-free geometric cluster algorithm for complex fluids,” Phys. Rev. Lett. 92, 035504 (2004) and ibid., Phys. Rev. E 71, 066701-1–12 (2005).

J. Machta, Y. S. Choi, A. Lucke, T. Schweizer, and L. Chayes, “Invaded cluster algorithm for Potts models,” Phys. Rev. E 54, 1332–1345 (1996).
S. S. Mak, “The analytical demon of the Ising model,” Phys. Lett. A 196, 318 (1995).

J. Marro and R. Toral, “Microscopic observations on a kinetic Ising model,” Am. J. Phys. 54, 1114–1121 (1986).

N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, “Equation of state calculations by fast computing machines,” J. Chem. Phys. 21, 1087–1092 (1953).

A. Alan Middleton, “Improved extremal optimization for the Ising spin glass,” Phys. Rev. E 69, 055701-1–4 (2004). The extremal optimization algorithm, which was inspired by the Bak–Sneppen algorithm for evolution (see Problem 14.12), preferentially flips spins that are “unfit.” The adaptive algorithm proposed in this paper is an example of a heuristic that finds exact ground states efficiently for systems with frozen-in disorder.

M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999).

M. A. Novotny, “A new approach to an old algorithm for the simulation of Ising-like systems,” Computers in Physics 9 (1), 46 (1995). The n-fold way algorithm is discussed. Also, see M. A. Novotny, “A tutorial on advanced dynamic Monte Carlo methods for systems with discrete state spaces,” in Annual Reviews of Computational Physics IX, edited by Dietrich Stauffer (World Scientific, 2001), pp. 153–210.

Ole G. Mouritsen, Computer Studies of Phase Transitions and Critical Phenomena (Springer–Verlag, 1984).

E. P. Münger and M. A. Novotny, “Reweighting in Monte Carlo and Monte Carlo renormalization-group studies,” Phys. Rev. B 43, 5773–5783 (1991). The authors discuss the histogram method and combine it with renormalization group calculations.

Michael Plischke and Birger Bergersen, Equilibrium Statistical Physics, 3rd ed. (Prentice Hall, 2005). A graduate level text that discusses some contemporary topics in statistical physics, many of which have been influenced by computer simulations.

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). A Fortran program for the traveling salesman problem is given in Section 10.9.

Stephen H. Shenker and Jan Tobochnik, “Monte Carlo renormalization-group analysis of the classical Heisenberg model in two dimensions,” Phys. Rev. B 22, 4462–4472 (1980).

Amihai Silverman and Joan Adler, “Animated simulated annealing,” Computers in Physics 6, 277 (1992). The authors describe a simulation of the annealing process to obtain a defect-free single crystal of a model material.

H. Eugene Stanley, Introduction to Phase Transitions and Critical Phenomena (Oxford University Press, 1971). See Appendix B for the exact solution of the zero-field Ising model for a two-dimensional lattice.

Jan Tobochnik and G. V. Chester, “Monte Carlo study of the planar model,” Phys. Rev. B 20, 3761–3769 (1979).

Jan Tobochnik, Harvey Gould, and Jon Machta, “Understanding the temperature and the chemical potential through computer simulations,” Am. J. Phys. 73 (8), 708–716 (2005). This paper extends the demon algorithm to compute the chemical potential.

Simon Trebst, David A. Huse, and Matthias Troyer, “Optimizing the ensemble for equilibration in broad-histogram Monte Carlo simulations,” Phys. Rev. E 70, 046701-1–5 (2004). The adaptive algorithm presented in this paper overcomes critical slowing down, improves on the Wang–Landau algorithm, and is another example of the flexibility of Monte Carlo algorithms.

I. Vattulainen, T. Ala–Nissila, and K. Kankaala, “Physical tests for random numbers in simulations,” Phys. Rev. Lett. 73, 2513 (1994).
B. Widom, “Some topics in the theory of fluids,” J. Chem. Phys. 39, 2808–2812 (1963). This paper discusses the insertion method for calculating the chemical potential.

Chapter 16
Quantum Systems

We discuss numerical solutions of the time-independent and time-dependent Schrödinger equation and describe several Monte Carlo methods for estimating the ground state of quantum systems.

16.1 Introduction

So far we have simulated the microscopic behavior of physical systems using Monte Carlo methods and molecular dynamics. In the latter method, the classical trajectory (the position and momentum) of each particle is calculated as a function of time. However, in quantum systems the position and momentum of a particle cannot be specified simultaneously. Because the description of microscopic particles is intrinsically quantum mechanical, we cannot directly simulate their trajectories on a computer (see Feynman).

Quantum mechanics does allow us to analyze probabilities, although there are difficulties associated with such an analysis. Consider a simple probabilistic system described by the one-dimensional diffusion equation (see Section 7.2)

\[ \frac{\partial P(x,t)}{\partial t} = D\frac{\partial^2 P(x,t)}{\partial x^2}, \tag{16.1} \]

where P(x,t) is the probability density of a particle being at position x at time t. One way to convert (16.1) to a difference equation and obtain a numerical solution for P(x,t) is to make x and t discrete variables. Suppose we choose a mesh size for x such that the probability is given at p values of x. If we choose p to be of order $10^3$, a straightforward calculation of P(x,t) would require approximately $10^3$ data points for each value of t. In contrast, the corresponding calculation of the dynamics of a single particle based on Newton's second law would require one data point.

The limitations of the direct computational approach become even more apparent if there are many degrees of freedom. For example, for N particles in one dimension, we would have to calculate the probability P(x1, x2, ..., xN, t), where xi is the position of particle i. Because we need to choose a mesh of p points for each xi, we need to specify $p^N$ values at each time t. For the same level of precision, p will be proportional to the length of the system (for particles confined to one dimension). Consequently, the calculation time and memory requirements grow exponentially with the number of particles. For example, for 10 particles on a mesh of 100 points, we would need to store $100^{10} = 10^{20}$ numbers to represent P, which is already much more than any computer today can store. In two and three dimensions the growth is even faster.

Although the direct computational approach is limited to systems with only a few degrees of freedom, the simplicity of this approach will aid our understanding of the behavior of quantum systems. After a summary of the general features of quantum mechanical systems in Section 16.2, we consider this approach to solving the time-independent Schrödinger equation in Sections 16.3 and 16.4. In Section 16.5, we use a half-step algorithm to generate wave packet solutions to the time-dependent Schrödinger equation. Because we have already learned that the diffusion equation (16.1) can be formulated as a random walk problem, it might not surprise you that Schrödinger's equation can be analyzed in a similar way. Monte Carlo methods are introduced in Section 16.7 to obtain variational solutions of the ground state.
We introduce quantum Monte Carlo methods in Section 16.8 and discuss more sophisticated quantum Monte Carlo methods in Sections 16.9 and 16.10.

16.2 Review of Quantum Theory

For simplicity, we consider a one-dimensional, nonrelativistic quantum system consisting of one particle. The state of the system is completely characterized by the position space wave function Ψ(x,t), which is interpreted as a probability amplitude. The probability P(x,t)∆x of the particle being in a “volume” element ∆x centered about the position x at time t is equal to

\[ P(x,t)\,\Delta x = |\Psi(x,t)|^2\,\Delta x, \tag{16.2} \]

where $|\Psi(x,t)|^2 = \Psi(x,t)\Psi^*(x,t)$, and $\Psi^*(x,t)$ is the complex conjugate of Ψ(x,t). This interpretation of Ψ(x,t) requires the use of normalized wave functions such that

\[ \int_{-\infty}^{\infty}\Psi^*(x,t)\Psi(x,t)\,dx = 1. \tag{16.3} \]

If the particle is subjected to the influence of a potential energy function V(x,t), the evolution of Ψ(x,t) is given by the time-dependent Schrödinger equation

\[ i\hbar\frac{\partial\Psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t), \tag{16.4} \]

where m is the mass of the particle, and $\hbar$ is Planck's constant divided by 2π.

Physically measurable quantities, such as the momentum, have corresponding operators. The expectation or average value of an observable A is given by

\[ \langle A \rangle = \int\Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx, \tag{16.5} \]

where $\hat{A}$ is the operator corresponding to the measurable quantity A. For example, the momentum operator corresponding to the linear momentum p is $\hat{p} = -i\hbar\,\partial/\partial x$ in position space.

If the potential energy function is independent of time, we can obtain solutions of (16.4) of the form

\[ \Psi(x,t) = \phi(x)e^{-iEt/\hbar}. \tag{16.6} \]

A particle in the state (16.6) has a well-defined energy E. If we substitute (16.6) into (16.4), we obtain the time-independent Schrödinger equation

\[ -\frac{\hbar^2}{2m}\frac{d^2\phi(x)}{dx^2} + V(x)\phi(x) = E\,\phi(x). \tag{16.7} \]

Note that φ(x) is an eigenstate of the Hamiltonian operator

\[ \hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x) \tag{16.8} \]

with the eigenvalue E. That is,

\[ \hat{H}\phi(x) = E\,\phi(x). \tag{16.9} \]

In general, there are many eigenstates φn, each with eigenvalue En, that satisfy (16.9) and the boundary conditions imposed on the eigenstates by physical considerations.

The general form of Ψ(x,t) can be expressed as a superposition of the eigenstates of the operator corresponding to any physical observable. For example, if $\hat{H}$ is independent of time, we can write

\[ \Psi(x,t) = \sum_n c_n\,\phi_n(x)e^{-iE_n t/\hbar}, \tag{16.10} \]

where Σ represents a sum over the discrete states and an integral over the continuum states. The coefficients cn in (16.10) can be determined from the value of Ψ(x,t) at any time t. For example, if we know Ψ(x, t = 0), we can use the orthonormality property of the eigenstates of any physical operator to obtain

\[ c_n = \int\phi_n^*(x)\Psi(x,0)\,dx. \tag{16.11} \]

The coefficient cn can be interpreted as the probability amplitude of a measurement of the total energy yielding a particular value En.

There are three steps needed to solve (16.7) numerically. The first is to integrate (16.7) for any given value of the energy E in a way similar to the approach we have used for numerically solving other ordinary differential equations. This approach will usually not satisfy the boundary conditions. The second step is to find the particular values of E that lead to solutions that satisfy the boundary conditions. Finally, we need to normalize the eigenstate wave function using (16.3) so that we can interpret the eigenstate as a probability amplitude.
We first discuss the solution of (16.7) without imposing any boundary conditions by treating the solution to (16.7) as an initial value problem for the wave function and its derivative at some value of x for a given value of E. We will use these solutions to develop our intuition about the behavior of one-dimensional solutions to the Schrödinger equation. To use an ODE solver, we express the rate of change of the wave function in terms of the independent variable x:

\[ \frac{d\phi}{dx} = \phi' \tag{16.12a} \]
\[ \frac{d\phi'}{dx} = -\frac{2m}{\hbar^2}[E - V(x)]\phi \tag{16.12b} \]
\[ \frac{dx}{dx} = 1. \tag{16.12c} \]

Because the time-independent Schrödinger equation is a second-order differential equation, two initial conditions must be specified to obtain a solution. For simplicity, we first assume that the wave function is zero at the starting point, xmin, and the derivative is nonzero. We also assume that the range of values of x is finite and divide this range into intervals of width ∆x. We initially consider potential energy functions V(x) such that V(x) = 0 for x < 0; V(x) changes abruptly at x = 0 to V0, the value of the stepHeight parameter. An implementation of the numerical solution of (16.12) is shown in Listing 16.1.

Listing 16.1: The Schroedinger class models the one-dimensional time-independent Schrödinger equation.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.numerics.*;

public class Schroedinger implements ODE {
  double energy = 0;
  double[] phi;
  double[] x;
  double xmin, xmax;                   // range of values of x
  double[] state = new double[3];      // state = phi, dphi/dx, x
  ODESolver solver = new RK45MultiStep(this);
  double stepHeight = 0;
  int numberOfPoints;

  public void initialize() {
    phi = new double[numberOfPoints];
    x = new double[numberOfPoints];
    double dx = (xmax-xmin)/(numberOfPoints-1);
    solver.setStepSize(dx);
  }

  void solve() {
    state[0] = 0;                      // initial phi
    state[1] = 1.0;                    // arbitrary nonzero initial dphi/dx
    state[2] = xmin;                   // starting value of x
    for(int i = 0; i<numberOfPoints; i++) {
      phi[i] = state[0];
      x[i] = state[2];
      solver.step();
      if(Math.abs(state[0])>1.0e9) {   // checks for diverging solution
        break;                         // leave the loop
      }
    }
  }

  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1];
    rate[1] = 2.0*(-energy+evaluatePotential(state[2]))*state[0];
    rate[2] = 1.0;
  }

  public double evaluatePotential(double x) { // potential is nonzero for x > 0
    if(x<0) {
      return 0;
    } else {
      return stepHeight;
    }
  }
}

The solve method initializes the wave function and position arrays and sets the initial value of dφ/dx to an arbitrary nonzero value of unity. A loop is then used to compute values of φ until the solution diverges or until x ≥ xmax. SchroedingerApp in Listing 16.2 produces a graphical view of φ(x). We will use this program in Problem 16.1 to study the behavior of the solution as we vary the height of the potential step.

Listing 16.2: SchroedingerApp solves the one-dimensional time-independent Schrödinger equation for a given energy.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class SchroedingerApp extends AbstractCalculation {
  PlotFrame frame = new PlotFrame("x", "phi", "Wave function");
  Schroedinger schroedinger = new Schroedinger();

  public SchroedingerApp() {
    frame.setConnected(0, true);
    frame.setMarkerShape(0, Dataset.NO_MARKER);
  }

  public void calculate() {
    schroedinger.xmin = control.getDouble("xmin");
    schroedinger.xmax = control.getDouble("xmax");
    schroedinger.stepHeight = control.getDouble("step height at x = 0");
    schroedinger.numberOfPoints = control.getInt("number of points");
    schroedinger.energy = control.getDouble("energy");
    schroedinger.initialize();
    schroedinger.solve();
    frame.append(0, schroedinger.x, schroedinger.phi);
  }

  public void reset() {
    control.setValue("xmin", -5);
    control.setValue("xmax", 5);
    control.setValue("step height at x = 0", 1);
    control.setValue("number of points", 500);
    control.setValue("energy", 1);
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new SchroedingerApp(), args);
  }
}
Problem 16.1. Numerical solution of the time-independent Schrödinger equation

(a) Sketch your guess for φ(x) for a potential step height of V0 = 3 and energies E = 1, 2, 3, 4, and 5.

(b) Choose xmin = −10 and xmax = 10, and run SchroedingerApp with the parameters given in part (a). How well do your predictions match the numerical solution? Is there any discontinuity in φ or in the derivative dφ/dx at x = 0? Describe the wave function for both x < 0 and x > 0. Why does the wave function have a larger oscillatory amplitude when x > 0 than when x < 0 if the energy is greater than the potential step height?

(c) Describe the behavior of the wave function as the energy approaches the potential step height. Consider E in the range 2.5 to 3.5 in steps of 0.1.

(d) Repeat part (b) with the initial condition φ = 1 and dφ/dx = 0. Describe the differences, if any, in φ(x).

Problem 16.1 demonstrates that the nature of the solution of (16.7) changes dramatically depending on the relative values of the energy E and the potential energy. If E is greater than V0, the wave function is oscillatory whereas, if E is less than or equal to V0, the wave function grows exponentially. The differential equation solver may fail if the difference between the potential energy and E is too large. There is also an exponentially decaying solution in the region where E < V0, but this solution is difficult to detect.

Problem 16.2. Analytic solutions of the time-independent Schrödinger equation

(a) Find the analytic solution to (16.7) for the step potential for the cases E > V0, E < V0, and E = V0. We will use units such that $m = \hbar = 1$ in all the problems in this chapter.

(b) Run SchroedingerApp for the three cases to obtain the numerical solution of (16.7). When the numerical solution shows spatial oscillations in a region of space, estimate the wavelength of the oscillations and compare your numerical solution to the analytic results. When the numerical solution shows exponential decay as a function of position, estimate the decay rate and compare your numerical solution with the analytic solution.

The solutions that we have obtained so far do not satisfy any condition other than that they solve (16.12). We have plotted only a portion of the wave function, and the solutions can be extended by increasing the number of points and the range of x over which the computation is performed. Physically, these solutions are unrealistic because they cannot be normalized over all of space. The normalization problem can be solved by using a linear combination of energy eigenstates (16.10) with different values of E. This combination is called a wavepacket.
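The qualitative behavior described after Problem 16.1 can be summarized analytically. For the step potential in the region x > 0, with $m = \hbar = 1$ (a sketch; A, B, C, and δ are constants fixed by the initial conditions):

\[
\phi(x) =
\begin{cases}
A\sin(k'x + \delta), & k' = \sqrt{2(E - V_0)}, & E > V_0,\\[4pt]
B\,e^{-\kappa x} + C\,e^{+\kappa x}, & \kappa = \sqrt{2(V_0 - E)}, & E < V_0.
\end{cases}
\]

Because round-off error seeds a small nonzero C, the growing exponential eventually dominates any numerical solution in the classically forbidden region, which is why the purely decaying solution is difficult to detect.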
Although we used a fourth-order algorithm in Listing 16.1, simpler algorithms can be used. Recall that the solution of (16.7) with V(x) = 0 can be expressed as a linear combination of sine and cosine functions. The oscillatory nature of this solution leads us to expect that the Euler–Cromer algorithm introduced in Chapter 3 will yield satisfactory results.

16.3 Bound State Solutions

We first consider potentials for which a particle is confined to a specific region of space. Such a potential is known as the infinite square well and is described by

\[ V(x) = \begin{cases} 0 & \text{for } |x| \le a\\ \infty & \text{for } |x| > a. \end{cases} \tag{16.13} \]

For this potential, an acceptable solution of (16.7) must vanish at the boundaries of the well. We will find that the eigenstates φn(x) can satisfy these boundary conditions only for specific values of the energy En.

Problem 16.3. The infinite square well

(a) Show analytically that the energy eigenvalues of the infinite square well are given by $E_n = n^2\pi^2\hbar^2/8ma^2$, where n is a positive integer. Also show that the normalized eigenstates have the form

\[ \phi_n(x) = \frac{1}{\sqrt{a}}\cos\frac{n\pi x}{2a} \qquad n = 1, 3, \ldots \ \text{(even parity)} \tag{16.14a} \]
\[ \phi_n(x) = \frac{1}{\sqrt{a}}\sin\frac{n\pi x}{2a} \qquad n = 2, 4, \ldots \ \text{(odd parity)}. \tag{16.14b} \]

What is the parity of the ground state solution?

(b) We can solve (16.7) numerically for the infinite square well by setting stepHeight = 0, xmin = −a, and xmax = +a in SchroedingerApp and requiring that φ(x = +a) = 0. What is the condition for φ(x = −a) in the program? Choose a = 1 and calculate the first four energy eigenvalues using SchroedingerApp. Do the numerical and analytic solutions match? Do the solutions satisfy the boundary conditions exactly? Are your numerical solutions normalized?

Problem 16.4. Bound state solutions of the time-independent Schrödinger equation

(a) Consider the potential energy function defined by

\[ V(x) = \begin{cases} 0 & \text{for } -a \le x \le 0\\ V_0 & \text{for } 0 < x \le a\\ \infty & \text{for } |x| > a. \end{cases} \tag{16.15} \]

As for the infinite square well, the eigenfunction is confined between infinite potential barriers at x = ±a. In addition, there is a step potential at x = 0. Choose a = 5 and V0 = 1 and run SchroedingerApp with an energy of E = 0.15. Repeat with an energy of E = 0.16. Why can you conclude that an energy eigenvalue is bracketed by these two values?

(b) Choose a strategy for determining the value of E such that the boundary conditions at x = +a are satisfied. Determine the energy eigenvalue to four decimal places. Does your answer depend on the number of points at which the wave function is computed?

(c) Repeat the above procedure starting with energy values of 0.58 and 0.59 and find the energy eigenvalue of the second bound state.

If you were persistent in doing all of Problem 16.4, you would have discovered two energy eigenvalues, 0.1505 and 0.5857. The procedure we used is known as the shooting algorithm. The allowed eigenvalues are imposed by the requirement that φn(x) → 0 at the boundaries. Although the shooting algorithm usually yields an eigenvalue solution, we often wish to find specific eigenvalues, such as the eigenvalue E = 1.1195 corresponding to the third excited state for the potential in (16.15). Because the energy of a wave function increases as the wavelength decreases, we can order the energy eigenvalues by counting the number of times the corresponding eigenstate crosses the x-axis, that is, by the number of nodes. The ground state eigenstate has no nodes. Why? Why can we order the eigenvalues by the number of nodes?
The number of nodes can be used to narrow the energy bracket in the shooting algorithm. For example, if we are searching for the third energy eigenvalue and we observe 5 nodes, then the energy is too large. To find a specific quantum state, we automate the shooting method as follows:

1. Choose a value of the energy E and count the number of nodes.

2. Increase E and repeat step 1 until the number of nodes is equal to the desired number.

3. Decrease E and repeat step 1 until the number of nodes is one less than the desired number.

The desired value of the energy eigenvalue is now bracketed. We can further narrow the energy by doing the following:

4. Set the energy to the bracket midpoint.

5. Initialize φ(x) at the left boundary and iterate φ(x) toward increasing x until φ diverges or until the right boundary is reached.

6. If the quantum number is even (odd) and the last value of φ(x) in step 5 is negative (positive), then the trial value of E is too large.

7. If the quantum number is even (odd) and the last value of φ(x) in step 5 is positive (negative), then the trial value of E is too small.

8. Repeat steps 4–7 until the wave function satisfies the right-hand boundary condition to an acceptable tolerance.

This procedure is known as a binary search because every repetition decreases the energy bracket by a factor of two. Problem 16.5 asks you to write a program that finds specific eigenvalues using this procedure; a sketch of the bisection stage follows.
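The sketch below implements steps 4–8, assuming the Schroedinger class of Listing 16.1 (with xmin, xmax, numberOfPoints, and the potential already set) and an energy bracket [eLow, eHigh] obtained from the node count; it is illustrative rather than the book's code:

package org.opensourcephysics.sip.ch16;

public class ShootingSketch {
  public static double findEigenvalue(Schroedinger s, int quantumNumber,
      double eLow, double eHigh, double tolerance) {
    while(eHigh-eLow>tolerance) {
      s.energy = 0.5*(eLow+eHigh);     // step 4: midpoint of the bracket
      s.initialize();
      s.solve();                       // step 5: integrate from the left boundary
      double phiEnd = 0;               // last computed value of phi (solve may stop early)
      for(int i = s.phi.length-1; i>=0; i--) {
        if(s.phi[i]!=0) {
          phiEnd = s.phi[i];
          break;
        }
      }
      boolean even = quantumNumber%2==0;
      if((even&&phiEnd<0)||(!even&&phiEnd>0)) {
        eHigh = s.energy;              // step 6: trial energy too large
      } else {
        eLow = s.energy;               // step 7: trial energy too small
      }
    }
    return 0.5*(eLow+eHigh);
  }
}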
Problem 16.5. Shooting algorithm

(a) Modify SchroedingerApp to find the eigenvalue associated with a given number of nodes. How is the number of nodes related to the quantum number? Test your program for the infinite square well. What is the value of ∆x needed to determine E1 to two decimal places? three decimal places?

(b) Add a method to normalize φ. Normalize and display the first five eigenstates.

(c) Find the first five eigenstates and eigenvalues for the potential in (16.15) with a = 1 and V0 = 1.

(d) Does your result for E1 depend on the starting value of dφ/dx?

Figure 16.1: An infinite square well with a potential bump of height Vb and half-width b in the middle.

Problem 16.6. Perturbation of the infinite square well

(a) Determine the effect of a small perturbation on the eigenstates and eigenvalues of the infinite square well. Place a small rectangular bump of half-width b and height Vb symmetrically about x = 0 (see Fig. 16.1). Choose b ≪ a and determine how the ground state energy and eigenstate change with Vb and b. What is the relative change in the ground state energy for Vb = 10, b = 0.1 and Vb = 20, b = 0.1 with a = 1? Let φ0 denote the ground state eigenstate for b = 0 and let φb denote the ground state eigenstate for b ≠ 0. Compute the value of the overlap integral

\[ \int_0^a\phi_b(x)\phi_0(x)\,dx. \tag{16.16} \]

This integral would be unity if the perturbation were not present (and the eigenstate were properly normalized). How is the change in the overlap integral related to the relative change in the energy eigenvalue?

(b) Compute the ground state energy for Vb = 20 and b = 0.05. How does the value of E1 compare to that found in part (a) for Vb = 10 and b = 0.1?

Because numerical solutions to the Schrödinger equation grow exponentially if V(x) − E > 0, it may not be possible to obtain a numerical solution for φ(x) that satisfies the boundary conditions if V(x) − E is large over an extended region of space. The reason is that the energy can be specified and φ can be computed only to finite accuracy. Problem 16.7 shows that we can sometimes avoid this difficulty by using simpler boundary conditions if the potential is symmetric. In this case,

\[ V(x) = V(-x), \tag{16.17} \]

and φ(x) can be chosen to have definite parity. Even parity solutions satisfy φ(−x) = φ(x); odd parity solutions satisfy φ(−x) = −φ(x). The definite parity of φ(x) allows us to specify either φ or φ′ at x = 0. Hence, the parity of φ determines one of the boundary conditions. For simplicity, choose φ(0) = 1 and φ′(0) = 0 for even parity solutions, and φ(0) = 0 and φ′(0) = 1 for odd parity solutions.

Problem 16.7. Symmetric potentials

(a) Modify Schroedinger to make use of symmetric potential boundary conditions for the harmonic oscillator:

\[ V(x) = \frac{1}{2}x^2. \tag{16.18} \]

Start the solution at x = 0 using the appropriate conditions for even and odd quantum numbers, and find the first four energy eigenvalues such that the wave function approaches zero for large values of x. Because the computed φ(x) will diverge for sufficiently large x, we seek values of the energy such that a small decrease in E causes the wave function to diverge in one direction, and a small increase causes the wave function to diverge in the opposite direction. Initially choose xmax = 5 so that the classically forbidden region is large enough that φ(x) can decay to zero for the first few eigenstates. Increase xmax if necessary for the higher energy eigenvalues. Is there any pattern in the values of the energy eigenvalues you found?

(b) Repeat part (a) for the linear potential V(x) = |x|. Describe the differences between your results for this potential and for the harmonic oscillator potential. The quantum mechanical treatment of the linear potential can be used to model the energy spectrum of a bound quark-antiquark system known as quarkonium.

(c) Obtain a numerical solution of the anharmonic oscillator $V(x) = \frac{1}{2}x^2 + bx^4$. In this case there are no analytic solutions, and numerical solutions are necessary for large values of b. How do the ground state energy and eigenstate depend on b for small b?

Problem 16.8. Finite square well

The finite square well potential is given by

\[ V(x) = \begin{cases} 0 & \text{for } |x| \le a\\ V_0 & \text{for } |x| > a. \end{cases} \tag{16.19} \]

The input parameters are the well depth, V0, and the half-width of the well, a.

(a) Choose V0 = 10 and a = 1. How do you expect the value of the ground state energy to compare to its corresponding value for the infinite square well? Compute the ground state eigenvalue and eigenstate by determining a value of E such that φ(x) has no nodes and is approximately zero for large x. (See Problem 16.7(a) for the procedure for finding the eigenvalues.)

(b) Because the well depth is finite, φ(x) is nonzero in the classically forbidden region for which E < V0 and |x| > a. Define the penetration distance as the distance from x = a to a point where φ is ∼ 1/e ≈ 0.37 of its value at x = a. Determine the qualitative dependence of the penetration distance on the magnitude of V0.

(c) What is the total number of bound excited states? Why is the total number of bound states finite?

As we have found, it is difficult to find bound state solutions of the time-independent Schrödinger equation because the exponential growth of the solutions allows numerical errors to dominate when V(x) − E > 0 is large. Because we want to easily generate eigenstates in subsequent sections, we have written a general-purpose eigenstate solver that examines the maxima and minima of the solution as well as the nodes to determine the eigenstate's quantum number.
The code for the Eigenstate class is in the ch16 package. The EigenstateApp target class shows how the Eigenstate class is used.

Listing 16.3: The EigenstateApp program tests the Eigenstate class.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.frames.PlotFrame;
import org.opensourcephysics.numerics.Function;

public class EigenstateApp {
  public static void main(String[] args) {
    PlotFrame drawingFrame = new PlotFrame("x", "|phi|", "eigenstate");
    int numberOfPoints = 300;
    double xmin = -5, xmax = +5;
    Eigenstate eigenstate = new Eigenstate(new Potential(), numberOfPoints, xmin, xmax);
    int n = 3; // quantum number
    double[] phi = eigenstate.getEigenstate(n);
    double[] x = eigenstate.getXCoordinates();
    if(eigenstate.getErrorCode()==Eigenstate.NO_ERROR) {
      drawingFrame.setMessage("energy = "+eigenstate.energy);
    } else {
      drawingFrame.setMessage("eigenvalue did not converge");
    }
    drawingFrame.append(0, x, phi);
    drawingFrame.setVisible(true);
    drawingFrame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

class Potential implements Function {
  public double evaluate(double x) {
    return (x*x)/2;
  }
}

The getEigenstate method in the Eigenstate class computes the eigenstate for the specified quantum number and returns a zeroed wave function if the algorithm does not converge. We test the validity of the Eigenstate class in Problem 16.9.

Problem 16.9. The Eigenstate class

(a) Examine the code of the Eigenstate class. What “trick” is used to handle the divergence in the forbidden region of deep wells?

(b) Write a class that displays the eigenstates of the simple harmonic oscillator using the Calculation interface. Include input parameters that allow the user to vary the principal quantum number and the number of points.

(c) Use a spatial grid of 300 points with −5 < x < 5 and compare the known analytic solution for the simple harmonic oscillator eigenstates to the numerical solution for the lowest three energy eigenstates. What is the largest energy eigenvalue that can be computed to an accuracy of 1%? What causes the decreasing accuracy for larger quantum numbers? What if the domain is increased to −50 < x < 50?

(d) Describe the conditions under which the Eigenstate class fails and demonstrate this failure. Improve the Eigenstate class to handle at least one failure mode.

16.4 Time Development of Eigenstate Superpositions

If the Hamiltonian is independent of time, the time development of the wave function Ψ(x,t) can be expressed as a linear superposition of energy eigenstates φn(x) with eigenvalues En:

\[ \Psi(x,t) = \sum_n c_n\,\phi_n(x)e^{-iE_n t/\hbar}. \tag{16.20} \]

To understand the time dependence of Ψ(x,t), we begin by studying superpositions of analytic solutions. The static getEigenstate method in the BoxEigenstate class generates these solutions for the infinite square well.

Listing 16.4: The BoxEigenstate class generates analytic stationary state solutions for the infinite square well.

package org.opensourcephysics.sip.ch16;

public class BoxEigenstate {
  static double a = 1; // length of box

  private BoxEigenstate() {
    // prohibit instantiation because all methods are static
  }

  static double[] getEigenstate(int n, int numberOfPoints) {
    double[] phi = new double[numberOfPoints];
    n++; // quantum number
    double norm = Math.sqrt(2/a);
    for(int i = 0; i<numberOfPoints; i++) {
      phi[i] = norm*Math.sin(n*Math.PI*i/(double) (numberOfPoints-1)); // sin(n pi x/a)
    }
    return phi;
  }
}
Problem 16.14. Coherent states

Because the energy eigenvalues of the simple harmonic oscillator are equally spaced, there exist wave functions known as coherent states whose probability density propagates quasi-classically.

(a) Include a sufficient number of expansion coefficients for V(x) = 10x² to model an initial Gaussian wave function centered at the origin:

\[ \Psi(x,0) = e^{-16x^2}. \tag{16.25} \]

Describe the evolution.

(b) Repeat part (a) with

\[ \Psi(x,0) = e^{-16(x-2)^2}. \tag{16.26} \]

(c) Show that the wave functions in parts (a) and (b) change their width but not their Gaussian envelope. Construct a wave function with the following expansion coefficients and observe its behavior:

\[ c_n^2 = \frac{\langle n \rangle^n}{n!}e^{-\langle n \rangle}. \tag{16.27} \]

The expectation of the number of quanta $\langle n \rangle$ is given by

\[ \langle n \rangle = \frac{\langle E \rangle - \frac{1}{2}\hbar\omega}{\hbar\omega}, \tag{16.28} \]

where $\langle E \rangle$ is the energy expectation value of the coherent state.

The expansion of an arbitrary wave function in terms of a set of eigenstates is closely related to Fourier analysis. Because the eigenstates of a particle in a box are sinusoidal functions, we could have used the fast Fourier transform algorithm (FFT) to compute the projection coefficients. Because these coefficients are calculated only once in Problem 16.14, evaluating (16.21) directly is reasonable. We will use the FFT to study wave functions in momentum space and to implement the operator splitting method for time evolution in Section 16.6.
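The direct evaluation of (16.21) amounts to discretizing the projection integral (16.11). The following sketch uses the BoxEigenstate class of Listing 16.4 (the driver and the initial Gaussian are illustrative, not the book's code; the class is assumed to sit in the same package):

package org.opensourcephysics.sip.ch16;

public class ProjectionSketch {
  public static double[] computeCoefficients(double[] psi0, int nmax) {
    int numberOfPoints = psi0.length;
    double dx = BoxEigenstate.a/(numberOfPoints-1);
    double[] c = new double[nmax];
    for(int n = 0; n<nmax; n++) {
      double[] phi = BoxEigenstate.getEigenstate(n, numberOfPoints);
      double sum = 0;
      for(int i = 0; i<numberOfPoints; i++) {
        sum += phi[i]*psi0[i];           // phi is real for the box eigenstates
      }
      c[n] = sum*dx;                     // approximates the integral of phi_n psi dx
    }
    return c;
  }

  public static void main(String[] args) {
    int N = 300;
    double[] psi0 = new double[N];
    for(int i = 0; i<N; i++) {           // a Gaussian centered in the box
      double x = BoxEigenstate.a*i/(N-1.0);
      double d = x-0.5*BoxEigenstate.a;
      psi0[i] = Math.exp(-100*d*d);
    }
    double[] c = computeCoefficients(psi0, 10);
    for(int n = 0; n<10; n++) {
      System.out.println("c["+n+"] = "+c[n]);
    }
  }
}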
16.5 The Time-Dependent Schrödinger Equation

Although the numerical solution of the time-independent Schrödinger equation (16.7) is straightforward for one particle, the numerical solution of the time-dependent Schrödinger equation (16.4) is not as simple. A naive approach to its numerical solution can be formulated by introducing a grid for the time coordinate and a grid for the spatial coordinate. We use the notation $t_n = t_0 + n\Delta t$, $x_s = x_0 + s\Delta x$, and $\Psi(x_s, t_n)$. The idea is to relate $\Psi(x_s, t_{n+1})$ to the value of $\Psi(x_s, t_n)$ for each value of xs. An example of an algorithm that solves the Schrödinger-like equation $\partial\Psi/\partial t = \partial^2\Psi/\partial x^2$ to first order in ∆t is given by

\[ \frac{1}{\Delta t}\left[\Psi(x_s,t_{n+1}) - \Psi(x_s,t_n)\right] = \frac{1}{(\Delta x)^2}\left[\Psi(x_{s+1},t_n) - 2\Psi(x_s,t_n) + \Psi(x_{s-1},t_n)\right]. \tag{16.29} \]

The right-hand side of (16.29) represents a finite difference approximation to the second derivative of Ψ with respect to x. Equation (16.29) is an example of an explicit scheme because, given Ψ at time tn, we can compute Ψ at time tn+1. Unfortunately, this explicit approach leads to unstable solutions; that is, the numerical value of Ψ diverges from the exact solution as Ψ evolves in time.

One way to avoid the instability is to retain the same form as (16.29) but to evaluate the spatial derivative on the right side of (16.29) at time tn+1 rather than time tn:

\[ \frac{1}{\Delta t}\left[\Psi(x_s,t_{n+1}) - \Psi(x_s,t_n)\right] = \frac{1}{(\Delta x)^2}\left[\Psi(x_{s+1},t_{n+1}) - 2\Psi(x_s,t_{n+1}) + \Psi(x_{s-1},t_{n+1})\right]. \tag{16.30} \]

Equation (16.30) is an implicit method because the unknown function $\Psi(x_s,t_{n+1})$ appears on both sides. To obtain $\Psi(x_s,t_{n+1})$, it is necessary to solve a set of linear equations at each time step. More details of this approach and the demonstration that (16.30) leads to stable solutions can be found in the references.

Visscher and others have suggested an alternative approach in which the real and imaginary parts of Ψ are treated separately and defined at different times. The algorithm ensures that the total probability remains constant. If we let

\[ \Psi(x,t) = R(x,t) + i\,I(x,t), \tag{16.31} \]

then Schrödinger's equation $i\,\partial\Psi(x,t)/\partial t = \hat{H}\Psi(x,t)$ becomes ($\hbar = 1$ as usual)

\[ \frac{\partial R(x,t)}{\partial t} = \hat{H}I(x,t) \tag{16.32a} \]
\[ \frac{\partial I(x,t)}{\partial t} = -\hat{H}R(x,t). \tag{16.32b} \]

A stable method of numerically solving (16.32) is to use a form of the half-step method (see Appendix 3A). The resulting difference equations are

\[ R(x,t+\Delta t) = R(x,t) + \hat{H}I\!\left(x,t+\tfrac{1}{2}\Delta t\right)\Delta t \tag{16.33a} \]
\[ I\!\left(x,t+\tfrac{3}{2}\Delta t\right) = I\!\left(x,t+\tfrac{1}{2}\Delta t\right) - \hat{H}R(x,t+\Delta t)\,\Delta t, \tag{16.33b} \]

where the initial values are given by R(x,0) and I(x, ½∆t). Visscher has shown that this algorithm is stable if

\[ -\frac{2\hbar}{\Delta t} \le V \le \frac{2\hbar}{\Delta t} - \frac{2\hbar^2}{m(\Delta x)^2}, \tag{16.34} \]

where the inequality (16.34) holds for all values of the potential V.

The appropriate definition of the probability density $P(x,t) = R(x,t)^2 + I(x,t)^2$ is not obvious because R and I are not defined at the same time. The following choice conserves the total probability:

\[ P(x,t) = R(x,t)^2 + I\!\left(x,t+\tfrac{1}{2}\Delta t\right)I\!\left(x,t-\tfrac{1}{2}\Delta t\right) \tag{16.35a} \]
\[ P\!\left(x,t+\tfrac{1}{2}\Delta t\right) = R(x,t+\Delta t)R(x,t) + I\!\left(x,t+\tfrac{1}{2}\Delta t\right)^2. \tag{16.35b} \]

An implementation of (16.33) is given in the TDHalfStep class in Listing 16.7. The real part of the wave function is first updated for all positions, and then the imaginary part is updated using the new values of the real part.

Listing 16.7: The TDHalfStep class solves the one-dimensional time-dependent Schrödinger equation.

package org.opensourcephysics.sip.ch16;

public class TDHalfStep {
  double[] x, realPsi, imagPsi, potential;
  double dx, dx2;
  double dt = 0.001;

  public TDHalfStep(GaussianPacket packet, int numberOfPoints, double xmin, double xmax) {
    realPsi = new double[numberOfPoints];
    imagPsi = new double[numberOfPoints];
    potential = new double[numberOfPoints];
    x = new double[numberOfPoints];
    dx = (xmax-xmin)/(numberOfPoints-1);
    dx2 = dx*dx;
    double x0 = xmin;
    for(int i = 0, n = realPsi.length; i<n; i++) {
      x[i] = x0;
      realPsi[i] = packet.getReal(x0);      // R at t = 0 from the initial packet
      imagPsi[i] = packet.getImaginary(x0); // I at t = dt/2
      potential[i] = getPotential(x0);
      x0 += dx;
    }
    dt = getMaxDt();
  }

  double getPotential(double x) {
    return 0; // free particle; replace with the potential of interest
  }

  double getMaxDt() { // largest dt allowed by the stability condition (16.34)
    double dt = this.dt;
    for(int i = 0, n = potential.length; i<n; i++) {
      double a = 2/dx2+potential[i];
      if(a>0) {
        dt = Math.min(dt, 2/a);
      }
    }
    return dt;
  }

  double step() {
    for(int i = 1, n = realPsi.length-1; i<n; i++) { // update R using I at the half step
      realPsi[i] += dt*(potential[i]*imagPsi[i]
          -0.5*(imagPsi[i+1]-2*imagPsi[i]+imagPsi[i-1])/dx2);
    }
    for(int i = 1, n = imagPsi.length-1; i<n; i++) { // update I using the new R
      imagPsi[i] -= dt*(potential[i]*realPsi[i]
          -0.5*(realPsi[i+1]-2*realPsi[i]+realPsi[i-1])/dx2);
    }
    return dt;
  }
}

(a) Consider a Gaussian wave packet incident on the potential barrier

\[ V(x) = \begin{cases} V_0 & \text{for } |x| \le a\\ 0 & \text{for } |x| > a. \end{cases} \tag{16.38} \]

Generate a series of snapshots that show the wave packet approaching the barrier and then interacting with it to generate reflected and transmitted packets. Choose V0 = 2 and a = 1 and consider the behavior of the wave packet for k0 = 1, 1.5, 2, and 3. Does the width of the packet increase with time? How does the width depend on k0? For what values of k0 is the motion of the packet in qualitative agreement with the motion of a corresponding classical particle?

(b) Consider a square well with V0 = −2 and consider the same questions as in part (a).

Problem 16.18. Evolution of two wave packets

Modify GaussianPacket in Listing 16.8 to include two wave packets with identical widths and speeds, with the sign of k0 chosen so that the two wave packets approach each other. Choose their respective values of x0 so that the two packets are initially well separated. Let V = 0 and describe what happens when you determine their time dependence. Do the packets influence each other? What do your results imply about the existence of a superposition principle?
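The essentials of the half-step algorithm can also be isolated in a few lines. The following self-contained sketch (V = 0 and ℏ = m = 1; the imaginary part is initialized at t = 0 rather than t = ∆t/2, an adequate approximation for this check) evolves a free Gaussian packet with (16.33) and prints the total probability, which should remain essentially constant:

public class HalfStepSketch {
  public static void main(String[] args) {
    int N = 500;
    double xmin = -20, xmax = 20;
    double dx = (xmax-xmin)/(N-1), dx2 = dx*dx;
    double dt = 0.5*dx2;               // comfortably inside the bound (16.34) for V = 0
    double[] re = new double[N], im = new double[N];
    double x0 = -5, k0 = 2, width = 1;
    for(int i = 0; i<N; i++) {         // Gaussian packet moving to the right
      double x = xmin+i*dx;
      double g = Math.exp(-(x-x0)*(x-x0)/(4*width*width));
      re[i] = g*Math.cos(k0*x);
      im[i] = g*Math.sin(k0*x);
    }
    for(int step = 0; step<2000; step++) {
      for(int i = 1; i<N-1; i++) {     // (16.33a): update R for all positions
        re[i] += -0.5*dt*(im[i+1]-2*im[i]+im[i-1])/dx2;
      }
      for(int i = 1; i<N-1; i++) {     // (16.33b): update I using the new R
        im[i] += 0.5*dt*(re[i+1]-2*re[i]+re[i-1])/dx2;
      }
    }
    double sum = 0;
    for(int i = 0; i<N; i++) {
      sum += re[i]*re[i]+im[i]*im[i];
    }
    System.out.println("total probability ~ "+sum*dx); // conserved by the algorithm
  }
}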
16.6 Fourier Transformations and Momentum Space

The position space wave function Ψ(x,t) is only one of many possible representations of a quantum mechanical state. A quantum system also is completely characterized by the momentum space wave function Φ(p,t). The probability P(p,t)∆p of the particle being in a “volume” element ∆p centered about the momentum p at time t is equal to

\[ P(p,t)\,\Delta p = |\Phi(p,t)|^2\,\Delta p. \tag{16.39} \]

Because either a position space or a momentum space representation provides a complete description of the system, it is possible to transform the wave function from one space to another as:

\[ \Phi(p,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\Psi(x,t)e^{-ipx/\hbar}\,dx \tag{16.40} \]
\[ \Psi(x,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\Phi(p,t)e^{ipx/\hbar}\,dp. \tag{16.41} \]

The momentum and position space transformations, (16.40) and (16.41), are Fourier integrals. Because a computer stores a wave function on a finite grid, these transformations simplify to the familiar Fourier series (see Section 9.3):

\[ \Phi_m = \sum_{n=-N/2}^{N/2}\Psi_n e^{-ip_m x_n/\hbar} \tag{16.42} \]
\[ \Psi_n = \frac{1}{N}\sum_{m=-N/2}^{N/2}\Phi_m e^{ip_m x_n/\hbar}, \tag{16.43} \]

where $\Phi_m = \Phi(p_m)$ and $\Psi_n = \Psi(x_n)$. We have not explicitly shown the time dependence in (16.42) and (16.43).

We now use the FFTApp program introduced in Section 9.3 to transform a wave function between position and momentum space. Note that the wavenumber 2π/λ (or 2π/T in the time domain) in classical physics has the same numerical value as the momentum in quantum mechanics, $p = h/\lambda = 2\pi\hbar/\lambda$, in units such that $\hbar = 1$. Consequently, we can use the getWrappedOmega and getNaturalOmega methods in the FFT class to generate arrays containing momentum values for a transformed position space wave function.

The FFTApp program in Listing 9.7 transforms N complex data points using an input array that has length 2N. The real part of the jth data point is stored in array element 2j and the imaginary part is stored in element 2j + 1. The FFT class transforms this array and maintains the same ordering of real and imaginary parts. However, the momenta (wavenumbers) are in wrap-around order, starting with the zero momentum coefficients in the first two elements and switching to negative momenta halfway through the array. The toNaturalOrder method sorts the array in order of increasing momentum. We use the FFTApp class in Problem 16.19.

Problem 16.19. Transforming to momentum space

(a) The FFTApp class initializes the wave function grid using the following complex exponential:

\[ \Psi_n = \Psi(n\Delta x) = e^{in\Delta x} = \cos n\Delta x + i\sin n\Delta x. \tag{16.44} \]

Use FFTApp to show that a complex exponential has a definite momentum if the grid contains an integer number of wavelengths. In other words, show that there is only one nonzero Fourier component.

(b) How small a wavelength (or how large a momentum) can be modeled if the spatial grid has N points and extends over a distance L?

(c) Where do the maximum, zero, and minimum values of the momentum occur in wrap-around order?

After the transformation, the momentum space wave function is stored in an array. The array elements can be assigned a momentum value using the de Broglie relation p = h/λ. The longest wavelength that can exist on the grid is equal to the grid dimension L = (N − 1)∆x, and this wave has a momentum of

\[ p_0 = \frac{h}{L}. \tag{16.45} \]

Points on the momentum grid have momentum values that are integer multiples of p0.

Problem 16.20. Momentum visualization

Add a ComplexPlotFrame to the FFTApp program to show the momentum space wave function of a position space Gaussian wave packet. Add a user interface to control the width of the Gaussian wave packet and verify the Heisenberg uncertainty relation $\Delta x\,\Delta p \ge \hbar/2$. Shift the center of the position space wave packet and explain the change in the resulting momentum space wave function.

Problem 16.21. Momentum time evolution

Modify TDHalfStepApp so that it displays the momentum space wave function in addition to the position space wave function.
Describe the momentum space evolution of a Gaussian packet for the infinite square well and a simple harmonic oscillator potential. What evidence of classical-like behavior do you observe?

The FFT can be used to implement a fast and accurate method for solving Schrödinger's equation. We start by writing (16.4) in operator notation as

\[ i\hbar\frac{\partial\Psi(x,t)}{\partial t} = \hat{H}\Psi(x,t) = (\hat{T} + \hat{V})\Psi(x,t), \tag{16.46} \]

where $\hat{H}$, $\hat{T}$, and $\hat{V}$ are the Hamiltonian, kinetic energy, and potential energy operators, respectively. The formal solution to (16.46) is

\[ \Psi(x,t) = e^{-i\hat{H}(t-t_0)/\hbar}\Psi(x,t_0) = e^{-i(\hat{T}+\hat{V})(t-t_0)/\hbar}\Psi(x,t_0). \tag{16.47} \]

The time evolution operator $\hat{U}$ is defined as

\[ \hat{U} = e^{-i\hat{H}(t-t_0)/\hbar} = e^{-i(\hat{T}+\hat{V})(t-t_0)/\hbar}. \tag{16.48} \]

It might be tempting to express the time evolution operator as

\[ \hat{U} = e^{-i\hat{T}\Delta t/\hbar}e^{-i\hat{V}\Delta t/\hbar}, \tag{16.49} \]

but (16.49) is valid only for $\Delta t \equiv t - t_0 \ll 1$ because $\hat{T}$ and $\hat{V}$ do not commute. A more accurate approximation (accurate to second order in ∆t) is obtained by using the following symmetric decomposition:

\[ \hat{U} = e^{-i\hat{V}\Delta t/2\hbar}\,e^{-i\hat{T}\Delta t/\hbar}\,e^{-i\hat{V}\Delta t/2\hbar}. \tag{16.50} \]

The key to using (16.50) to solve (16.46) is to use the position space wave function when applying $e^{-i\hat{V}\Delta t/2\hbar}$ and to use the momentum space wave function when applying $e^{-i\hat{T}\Delta t/\hbar}$. In position space, the potential energy operator is equivalent to simply multiplying by the potential energy function. That is, the effect of the first and last terms in (16.50) is to multiply points on the position grid by a phase factor that is proportional to the potential energy:

\[ \tilde{\Psi}_j = e^{-iV(x_j)\Delta t/2\hbar}\,\Psi_j. \tag{16.51} \]

Because the kinetic energy operator in position space involves partial derivatives, it is convenient to transform both the operator and the wave function to momentum space. In momentum space the kinetic energy operator is equivalent to multiplying by the kinetic energy $p^2/2m$. The middle term in (16.50) operates by multiplying points on the momentum grid by a phase factor that is proportional to the kinetic energy:

\[ \tilde{\Phi}_j = e^{-ip_j^2\Delta t/2m\hbar}\,\Phi_j. \tag{16.52} \]

The split-operator algorithm jumps back and forth between position and momentum space to propagate the wave function. The algorithm starts in position space, where each grid value $\Psi_j = \Psi(x_j,t)$ is multiplied by (16.51). The wave function is then transformed to momentum space, where every momentum value $\Phi_j$ is multiplied by (16.52). It is then transformed back to position space, where (16.51) is applied a second time. A single time step can therefore be written as

\[ \Psi(x,t+\Delta t) = e^{-iV(x)\Delta t/2\hbar}\,F^{-1}\!\left[e^{-ip^2\Delta t/2m\hbar}\,F\!\left[e^{-iV(x)\Delta t/2\hbar}\,\Psi(x,t)\right]\right], \tag{16.53} \]

where F is the Fourier transform to momentum space and $F^{-1}$ is its inverse.
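A sketch of a single split-operator step (16.53) is shown below. It uses the OSP FFT class; the transform and backtransform method names and the 1/N normalization of the inverse transform are assumptions about that class, and the momenta are built directly in wrap-around order (ℏ = m = 1):

import org.opensourcephysics.numerics.FFT;

public class SplitOperatorSketch {
  static final int N = 256;
  static final double XMIN = -10, XMAX = 10;
  static final double DX = (XMAX-XMIN)/N;
  static double dt = 0.005;
  static FFT fft = new FFT(N);
  static double[] psi = new double[2*N];   // Re in even, Im in odd elements
  static double[] p = new double[N];       // momenta in wrap-around order

  static double potential(double x) {
    return 0.5*x*x;                        // harmonic oscillator as an example
  }

  static void phase(int j, double theta) { // multiply grid point j by e^{-i theta}
    double c = Math.cos(theta), s = Math.sin(theta);
    double re = psi[2*j], im = psi[2*j+1];
    psi[2*j] = re*c+im*s;
    psi[2*j+1] = im*c-re*s;
  }

  static void step() {                     // one application of (16.53)
    for(int j = 0; j<N; j++) {
      phase(j, 0.5*dt*potential(XMIN+j*DX));   // half kick, (16.51)
    }
    fft.transform(psi);                        // to momentum space
    for(int j = 0; j<N; j++) {
      phase(j, 0.5*dt*p[j]*p[j]);              // kinetic phase, (16.52)
    }
    fft.backtransform(psi);                    // back to position space
    for(int j = 0; j<N; j++) {
      psi[2*j] /= N;                           // assumed unnormalized inverse transform
      psi[2*j+1] /= N;
      phase(j, 0.5*dt*potential(XMIN+j*DX));   // second half kick
    }
  }

  public static void main(String[] args) {
    double dp = 2*Math.PI/(N*DX);
    for(int j = 0; j<N; j++) {
      p[j] = (j<N/2) ? j*dp : (j-N)*dp;        // wrap-around ordering
    }
    for(int j = 0; j<N; j++) {                 // Gaussian packet at rest at x = 1
      double x = XMIN+j*DX;
      psi[2*j] = Math.exp(-(x-1)*(x-1));
    }
    for(int n = 0; n<1000; n++) {
      step();
    }
    double norm = 0;
    for(int j = 0; j<N; j++) {
      norm += psi[2*j]*psi[2*j]+psi[2*j+1]*psi[2*j+1];
    }
    System.out.println("norm ~ "+norm*DX);     // unitary evolution preserves the norm
  }
}

Because each factor in (16.50) is unitary, the norm of the wave function is preserved exactly (up to round-off), which makes the printed norm a convenient correctness check.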
Problem 16.22. Split-operator algorithm

(a) Write a program to implement the split-operator algorithm. It is necessary to evaluate the exponential phase factors only once when implementing the split-operator algorithm. Store the complex exponentials in arrays that match the x-values on the spatial grid and the p-values on the momentum grid. Use wrap-around order when storing the momentum phase factors because the FFT class inverse transformation assumes that the data are in wrap-around order. You can use the getWrappedOmega method in the FFT class to obtain the momenta in this ordering.

(b) Compare the evolution of a Gaussian wave packet using the split-operator and half-step algorithms on identical grids. How does the finite grid size affect each algorithm?

(c) Compare the computation speed of the split-operator and half-step algorithms using a Gaussian wave packet in a square well. Disable plotting and other nonessential computation when comparing the speeds.

Problem 16.23. Split-operator accuracy

The split-operator and half-step algorithms fail if the time step is too large. Use both algorithms to evolve a simple harmonic oscillator coherent state (see Problem 16.14). Describe the error that occurs if the time step becomes too large.

16.7 Variational Methods

One way of obtaining a good approximation of the ground state energy is to use a variational method. This approach has numerous applications in chemistry, atomic and molecular physics, nuclear physics, and condensed matter physics. Consider a system whose Hamiltonian operator \(\hat H\) is given by (16.8). According to the variational principle, the expectation value of the Hamiltonian for an arbitrary trial wave function Ψ is greater than or equal to the ground state energy E0. That is,
\[
\langle H\rangle = E[\Psi] = \frac{\int\Psi^*(x)\,\hat H\,\Psi(x)\,dx}{\int\Psi^*(x)\,\Psi(x)\,dx} \ge E_0, \tag{16.54}
\]
where E0 is the exact ground state energy of the system. We assume that the wave function is continuous and bounded. The inequality (16.54) reduces to an equality only if Ψ is an eigenstate of \(\hat H\) with eigenvalue E0. For bound states, Ψ may be assumed to be real without loss of generality, so that Ψ* = Ψ and thus |Ψ(x)|² = Ψ(x)². This assumption implies that we do not need to store two values representing the real and imaginary parts of Ψ.

The inequality (16.54) is the basis of the variational method. The procedure is to choose a physically reasonable form for the trial wave function Ψ(x) that depends on one or more parameters. The expectation value E[Ψ] is computed, and the parameters are varied until a minimum of E[Ψ] is obtained. This value of E[Ψ] is an upper bound to the true ground state energy. Often the form of Ψ is chosen so that the integrals in (16.54) can be done analytically. To avoid this restriction, we can use numerical integration methods. In most applications of the variational method the integrals in (16.54) are multidimensional, and Monte Carlo integration methods are essential. For this reason we will use Monte Carlo integration in the following, even though we will consider only one- and two-body problems.

Because it is inefficient to simply choose points at random to compute E[Ψ], we rewrite (16.54) in a form that allows us to use importance sampling. We write
\[
E[\Psi] = \frac{\int\Psi(x)^2\,E_L(x)\,dx}{\int\Psi(x)^2\,dx}, \tag{16.55}
\]
where E_L is the local energy
\[
E_L(x) = \frac{\hat H\Psi(x)}{\Psi(x)}, \tag{16.56}
\]
which can be calculated analytically using the trial wave function. The form of (16.55) is that of a weighted average with weight equal to the normalized probability density \(\Psi(x)^2/\!\int\Psi(x)^2\,dx\). As discussed in Section 11.6, we can sample values of x using the distribution Ψ(x)², so that the Monte Carlo estimate of E[Ψ] is given by the sum
\[
E[\Psi] = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_L(x_i), \tag{16.57}
\]
where n is the number of times that x is sampled from Ψ². How can we sample from Ψ²? In general, it is not possible to use the inverse transform method (see Section 11.5) to generate an arbitrary nonuniform distribution. A convenient alternative is the Metropolis method, which has the advantage that only an unnormalized Ψ² is needed for the proposed move.
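To make the procedure concrete, the following minimal sketch (ours, not the text's program) estimates (16.57) for the harmonic oscillator V(x) = x²/2 with the trial function Ψ ∝ e^(−λx²), for which the local energy (16.56) works out analytically to E_L(x) = λ + (1/2 − 2λ²)x², in units hbar = m = 1.

public class VMCSketch {
  public static void main(String[] args) {
    double lambda = 0.4, delta = 1.0, x = 0;
    int nEquil = 10000, nSample = 200000;
    double sum = 0;
    for(int i = 0; i<nEquil+nSample; i++) {
      double xTrial = x+(2*Math.random()-1)*delta;        // trial move
      double w = Math.exp(-2*lambda*(xTrial*xTrial-x*x)); // ratio of Psi^2 values
      if(w>=1 || Math.random()<=w) x = xTrial;            // Metropolis acceptance
      if(i>=nEquil) {                                     // sample after equilibration
        sum += lambda+(0.5-2*lambda*lambda)*x*x;          // local energy E_L(x)
      }
    }
    System.out.println("lambda = "+lambda+"  <E_L> = "+sum/nSample);
  }
}

Running the sketch for several values of λ near 1/2 shows that ⟨E_L⟩ is minimized (with value 0.5 and zero variance) at λ = 1/2, the behavior examined in Problem 16.24.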
Problem 16.24. Ground state energy of several one-dimensional systems

(a) It is useful to test the variational method on an exactly solvable problem. Consider the one-dimensional harmonic oscillator with V(x) = x²/2. Choose the trial wave function to be Ψ(x) ∝ e^(−λx²), with λ the variational parameter. Generate values of x chosen from the normalized distribution Ψ²(x) using the inverse transform method and verify that λ = 1/2 yields the smallest upper bound by considering λ = 1/2 and four other values of λ near 1/2. Another way to generate a Gaussian distribution is to use the Box–Muller method discussed in Section 11.5.

(b) Repeat part (a) using the Metropolis method to generate x distributed according to Ψ(x)² ∝ e^(−2λx²) and evaluate (16.57). As discussed in Section 11.7, the Metropolis method can be summarized by the following steps:

(i) Choose a trial position x_trial = x_n + δ_n, where δ_n is a uniform random number in the interval [−δ, δ].

(ii) Compute w = p(x_trial)/p(x_n), where in this case p(x) = e^(−2λx²).

(iii) If w ≥ 1, accept the change and let x_(n+1) = x_trial.

(iv) If w < 1, generate a random number r and let x_(n+1) = x_trial if r ≤ w.

(v) If the trial change is not accepted, let x_(n+1) = x_n.

Remember that it is necessary to wait for equilibrium (convergence to the distribution Ψ²) before computing the average value of E_L. Look for a systematic trend in ⟨E_L⟩ over the course of the random walk. Choose a step size δ that gives a reasonable value for the acceptance ratio. How many trials are necessary to obtain ⟨E_L⟩ to within 1% of the exact analytic result?

(c) Instead of finding the minimum of ⟨E_L⟩ as a function of the various variational parameters, minimize the quantity
\[
\sigma_L^2 = \langle E_L^2\rangle - \langle E_L\rangle^2. \tag{16.58}
\]
Verify that the exact minimum value of \(\sigma_L^2[\Psi]\) is zero, whereas the exact minimum value of \(E_L[\Psi]\) is unknown in general.

(d) Consider the anharmonic potential V(x) = x²/2 + bx⁴. Plot V(x) as a function of x for b = 1/8. Use first-order perturbation theory to calculate the lowest order change in the ground state energy due to the x⁴ term. Then choose a reasonable form for your trial wave function and use your Monte Carlo program to estimate the ground state energy. How does your result compare with first-order perturbation theory?

(e) Consider the anharmonic potential of part (d) with b = −1/8. Plot V(x). Use first-order perturbation theory to calculate the lowest order change in the ground state energy due to the x⁴ term, and then use your program to estimate E0. Do your Monte Carlo estimates for the ground state energy have a lower bound? Why or why not?

(f) Modify your program so that it can be applied to the ground state of the hydrogen atom. In this case we have V(r) = −e²/r, where e is the magnitude of the charge on the electron. The element of integration dx in (16.55) is replaced by 4πr² dr. Choose Ψ ∝ e^(−r/a), where a is the variational parameter. Measure lengths in terms of the Bohr radius \(\hbar^2/me^2\) and energies in terms of the Rydberg \(me^4/2\hbar^2\). In these units \(\mu = e^2 = \hbar = 1\). Find the optimal value of a. What is the corresponding energy?

(g) Consider the Yukawa or screened Coulomb potential, for which V(r) = −e²e^(−αr)/r, with α > 0. In this case the ground state energy and wave function can be obtained only numerically. For α = 0.5 and α = 1.0 the most accurate numerical estimates of E0 are −0.14808 and −0.01016, respectively. What is a good choice for the form of the trial wave function? How close can you come to these estimates?

Problem 16.25. Variational estimate of the ground state of helium

Helium has long served as a testing ground for atomic trial wave functions. Consider the ground state of the helium atom with the interaction
\[
V(r_1,r_2) = -2e^2\Big(\frac{1}{r_1}+\frac{1}{r_2}\Big) + \frac{e^2}{r_{12}}, \tag{16.59}
\]
where r12 is the separation between the two electrons.
Assume that the nucleus is fixed and ignore relativistic effects. Choose \(\Psi(r_1,r_2) = A\,e^{-Z_{\rm eff}(r_1+r_2)/a_0}\), where Z_eff is a variational parameter. Estimate the upper bound to the ground state energy based on this functional form of Ψ.

Our discussion of variational Monte Carlo methods has been only introductory in nature. One important application of variational Monte Carlo methods is to optimize a given trial wave function, which is then used to "guide" the Monte Carlo methods discussed in Sections 16.8 and 16.9.

16.8 Random Walk Solutions of the Schrödinger Equation

We now introduce a Monte Carlo approach based on expressing the Schrödinger equation in imaginary time. This approach follows that of Anderson (see references). We will then discuss several other quantum Monte Carlo methods. We will see that although the systems of interest are quantum mechanical, we can convert them to systems for which we can use classical Monte Carlo methods.

To understand how we can interpret the Schrödinger equation in terms of a random walk in imaginary time, we substitute \(\tau = it/\hbar\) into the time-dependent Schrödinger equation for a free particle and write (in one dimension)
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2}. \tag{16.60}
\]
Note that (16.60) is identical in form to the diffusion equation (16.1). Hence, we can interpret the wave function Ψ as a probability density with a diffusion constant \(D = \hbar^2/2m\). From our discussion in Chapter 7, we know that we can use the formal similarity between the diffusion equation and the imaginary-time free particle Schrödinger equation to solve the latter by replacing it by an equivalent random walk problem.

To understand the role of the potential energy term in the context of random walks, we write Schrödinger's equation in imaginary time as
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2} - V(x)\,\Psi(x,\tau). \tag{16.61}
\]
If we were to ignore the first term (the diffusion term) on the right side of (16.61), the result would be a first-order differential equation corresponding to a decay or growth process, depending on the sign of V. We can obtain the solution to this first-order equation by replacing it by a random decay or growth process, for example, radioactive decay. These considerations suggest that we can interpret (16.61) as a combination of diffusion and branching processes. In the latter, the number of walkers increases or decreases at a point x depending on the sign of V(x). The walkers do not interact with each other because the Schrödinger equation (16.61) is linear in Ψ. Note that it is Ψ∆x, and not Ψ²∆x, that corresponds to the probability distribution of the random walkers. This probabilistic interpretation requires that Ψ be nonnegative and real.

We now use this probabilistic interpretation of (16.61) to develop an algorithm for determining the ground state wave function and energy. The general solution of Schrödinger's equation can be written for imaginary time τ as [see (16.10)]
\[
\Psi(x,\tau) = \sum_n c_n\,\phi_n(x)\,e^{-E_n\tau}. \tag{16.62}
\]
For sufficiently large τ, the dominant term in the sum in (16.62) comes from the term representing the eigenvalue of lowest energy. Hence, we have
\[
\Psi(x,\tau\to\infty) = c_0\,\phi_0(x)\,e^{-E_0\tau}. \tag{16.63}
\]
From (16.63) we see that the spatial dependence of Ψ(x,τ→∞) is proportional to the ground state eigenstate φ0(x). We also see that if E0 > 0, then Ψ(x,τ), and hence the population of walkers, will eventually decay to zero.
This problem can be avoided by measuring E0 from an arbitrary reference energy Vref, which is adjusted so that an approximately steady-state distribution of random walkers is obtained. Although we could attempt to fit the τ-dependence of the computed probability distribution of the random walkers to (16.63) and thereby extract E0, it is more convenient to compute E0 directly from the relation
\[
E_0 = \langle V\rangle = \frac{\sum_i n_i\,V(x_i)}{\sum_i n_i}, \tag{16.64}
\]
where n_i is the number of walkers at x_i at time τ. An estimate for E0 can be found by averaging the sum in (16.64) for several values of τ once a steady-state distribution of random walkers has been reached.

To derive (16.64), we rewrite (16.61) and (16.63) by explicitly introducing the reference potential Vref:
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2} - \big[V(x)-V_{\rm ref}\big]\,\Psi(x,\tau), \tag{16.65}
\]
and
\[
\Psi(x,\tau) \approx c_0\,\phi_0(x)\,e^{-(E_0-V_{\rm ref})\tau}. \tag{16.66}
\]
We first integrate (16.65) with respect to x. Because \(\partial\Psi(x,\tau)/\partial x\) vanishes in the limit \(|x|\to\infty\), \(\int(\partial^2\Psi/\partial x^2)\,dx = 0\), and hence
\[
\int\frac{\partial\Psi(x,\tau)}{\partial\tau}\,dx = -\int V(x)\,\Psi(x,\tau)\,dx + V_{\rm ref}\int\Psi(x,\tau)\,dx. \tag{16.67}
\]
If we differentiate (16.66) with respect to τ, we obtain the relation
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = (V_{\rm ref}-E_0)\,\Psi(x,\tau). \tag{16.68}
\]
We then substitute (16.68) for \(\partial\Psi/\partial\tau\) into (16.67) and find
\[
(V_{\rm ref}-E_0)\int\Psi(x,\tau)\,dx = -\int V(x)\,\Psi(x,\tau)\,dx + V_{\rm ref}\int\Psi(x,\tau)\,dx. \tag{16.69}
\]
If we cancel the terms proportional to Vref in (16.69), we find that
\[
E_0\int\Psi(x,\tau)\,dx = \int V(x)\,\Psi(x,\tau)\,dx, \tag{16.70}
\]
or
\[
E_0 = \frac{\int V(x)\,\Psi(x,\tau)\,dx}{\int\Psi(x,\tau)\,dx}. \tag{16.71}
\]
The desired result (16.64) follows by making the connection between Ψ(x)∆x and the density of walkers between x and x + ∆x.

Although the derivation of (16.64) is somewhat involved, the random walk algorithm is straightforward. A simple implementation of the algorithm is as follows:

1. Place a total of N0 walkers at the initial set of positions x_i, where the x_i need not be on a grid.

2. Compute the reference energy \(V_{\rm ref} = \sum_i V_i/N_0\).

3. Randomly move the first walker to the right or left by a fixed step length ∆s. The step length ∆s is related to the time step ∆τ by (∆s)² = 2D∆τ. (D = 1/2 in units such that \(\hbar = m = 1\).)

4. Compute ∆V = V(x) − Vref and a random number r in the unit interval. If ∆V > 0 and r < ∆V∆τ, remove the walker. If ∆V < 0 and r < −∆V∆τ, add another walker at x. Otherwise, leave the walker at x. This procedure is accurate only in the limit ∆τ ≪ 1. A more accurate procedure consists of computing \(P_b = e^{-\Delta V\Delta\tau} - 1 = n + f\), where n is the integer part of P_b and f is its fractional part. We then make n copies of the walker, and if f > r, we make one more copy.

5. Repeat steps 3 and 4 for each of the N0 walkers and compute the mean potential energy (16.71) and the actual number of random walkers. The new reference potential is given by
\[
V_{\rm ref} = \langle V\rangle - \frac{a}{N_0\,\Delta\tau}\,(N-N_0), \tag{16.72}
\]
where N is the new number of random walkers and ⟨V⟩ is their mean potential energy. The average of ⟨V⟩ is an estimate of the ground state energy. The parameter a is adjusted so that the number of random walkers N remains approximately constant.

6. Repeat steps 3–5 until the estimates of the ground state energy ⟨V⟩ have reached a steady-state value with only random fluctuations. Average ⟨V⟩ over many Monte Carlo steps to compute the ground state energy. Do a similar calculation to estimate the distribution of random walkers.

The QMWalk class implements this algorithm for the harmonic oscillator potential. Initially, the walkers are randomly distributed within a distance initialWidth of the origin.
The program also estimates the ground state wave function by accumulating the spatial distribution of the walkers at discrete intervals of position. The input parameters are the desired number of walkers N0, the number of position intervals numberOfBins used to accumulate the ground state wave function, and the step size ds. We also use ds for the interval size in the wave function computation. The program computes the current number of walkers, the estimate of the ground state energy, and the value of Vref. The unnormalized ground state wave function is also plotted.

Listing 16.10: The QMWalk class calculates the ground state of the simple harmonic oscillator using the random walk Monte Carlo algorithm.

package org.opensourcephysics.sip.ch16;

public class QMWalk {
  int numberOfBins = 1000; // number of intervals for the wave function
  double[] x;              // positions of walkers
  double[] phi0;           // estimate of ground state wave function
  double[] xv;             // x values for computing phi0
  int N0;                  // desired number of walkers
  int N;                   // actual number of walkers
  double ds;               // step size
  double dt;               // time interval
  double vave = 0;         // mean potential energy of the walkers
  double vref = 0;         // reference potential
  double eAccum = 0;       // accumulation of energy values
  double xmin;             // minimum x
  int mcs;

  public void initialize() {
    N0 = N;
    x = new double[2*numberOfBins]; // extra room for walkers that are added
    phi0 = new double[numberOfBins];
    xv = new double[numberOfBins];
    xmin = -ds*numberOfBins/2.0;    // minimum location for computing phi0
    double binEdge = xmin;
    for(int i = 0; i<numberOfBins; i++) {
      xv[i] = binEdge;
      binEdge += ds;
    }
    dt = ds*ds;                     // (ds)^2 = 2 D dt with D = 1/2
    mcs = 0;
  }

  void walk() {
    double vsum = 0;
    for(int i = N-1; i>=0; i--) {   // loop backward so walkers added or relabeled
                                    // during this sweep are not processed twice
      if(Math.random()<0.5) {       // move walker
        x[i] += ds;
      } else {
        x[i] -= ds;
      }
      double pot = potential(x[i]);
      double dv = pot-vref;
      vsum += pot;
      if(dv<0) {                    // decide whether to add a walker at x[i]
        if((N<x.length)&&(Math.random()<-dv*dt)) {
          x[N] = x[i];              // copy walker into the next unused slot
          vsum += pot;              // add energy of new walker
          N++;
        }
      } else {                      // decide whether to delete this walker
        if(Math.random()<dv*dt) {
          N--;
          x[i] = x[N];              // relabel last walker to deleted walker index
          vsum -= pot;              // subtract energy of deleted walker
        }
      }
    }
    vave = (N==0) ? 0 : vsum/N;     // if there are no walkers, potential = 0
    vref = vave-(N-N0)/(N0*dt);     // adjust reference energy, (16.72) with a = 1
    mcs++;
  }

  void doMCS() {
    walk();
    eAccum += vave;                 // accumulate the energy estimate
    for(int i = 0; i<N; i++) {      // bin the walker positions to estimate phi0
      int bin = (int) Math.floor((x[i]-xmin)/ds);
      if((bin>=0)&&(bin<numberOfBins)) {
        phi0[bin]++;
      }
    }
  }

  double potential(double x) {
    return 0.5*x*x;                 // simple harmonic oscillator
  }
}

Consider the two-dimensional potential given in (16.74), where r² = x² + y². Modify QMWalkApp by using Cartesian coordinates in two dimensions. For example, add an array to store the y-coordinates of the walkers. What happens if you begin with an initial distribution of walkers that is not cylindrically symmetric?

16.9 Diffusion Quantum Monte Carlo

We now discuss an improvement of the random walk algorithm known as diffusion quantum Monte Carlo. Although some parts of the discussion might be difficult to follow initially, the algorithm is straightforward. Your understanding of the method will be enhanced by writing a program to implement the algorithm and then reading the following derivation again.

To provide some background, we introduce the concept of a Green's function or propagator defined by
\[
\Psi(x,\tau) = \int G(x,x',\tau)\,\Psi(x',0)\,dx'. \tag{16.75}
\]
From the form of (16.75) we see that G(x,x′,τ) "propagates" the wave function from time zero to time τ. If we operate on both sides of (16.75) first with ∂/∂τ and then with (\(\hat H - V_{\rm ref}\)), we can verify that G satisfies the equation
\[
\frac{\partial G}{\partial\tau} = -(\hat H - V_{\rm ref})\,G, \tag{16.76}
\]
which has the same form as the imaginary-time Schrödinger equation (16.65). It is easy to verify that G(x,x′,τ) = G(x′,x,τ).
A formal solution of (16.76) is
\[
G(\tau) = e^{-(\hat H - V_{\rm ref})\tau}, \tag{16.77}
\]
where the meaning of the exponential of an operator is given by its Taylor series expansion. The difficulty with (16.77) is that the kinetic and potential energy operators \(\hat T\) and \(\hat V\) in \(\hat H\) do not commute. For this reason, if we want to write the exponential in (16.77) as a product of two exponentials, we can only approximate the exponential for short times ∆τ. To first order in ∆τ (higher-order terms involve the commutator of \(\hat V\) and \(\hat H\)), we have
\[
G(\Delta\tau) \approx G_{\rm branch}\,G_{\rm diffusion} \tag{16.78}
\]
\[
= e^{-(\hat V - V_{\rm ref})\Delta\tau}\,e^{-\hat T\Delta\tau}, \tag{16.79}
\]
where \(G_{\rm diffusion} \equiv e^{-\hat T\Delta\tau}\) and \(G_{\rm branch} \equiv e^{-(\hat V - V_{\rm ref})\Delta\tau}\) correspond to the two random processes, diffusion and branching. From (16.76) we see that G_diffusion and G_branch satisfy the differential equations
\[
\frac{\partial G_{\rm diffusion}}{\partial\tau} = -\hat T\,G_{\rm diffusion} = \frac{\hbar^2}{2m}\,\frac{\partial^2 G_{\rm diffusion}}{\partial x^2} \tag{16.80}
\]
\[
\frac{\partial G_{\rm branch}}{\partial\tau} = (V_{\rm ref} - \hat V)\,G_{\rm branch}. \tag{16.81}
\]
The solutions to (16.80) and (16.81) that are symmetric in x and x′ are
\[
G_{\rm diffusion}(x,x',\Delta\tau) = (4\pi D\Delta\tau)^{-1/2}\,e^{-(x-x')^2/4D\Delta\tau}, \tag{16.82}
\]
with \(D \equiv \hbar^2/2m\), and
\[
G_{\rm branch}(x,x',\Delta\tau) = e^{-\left(\frac{1}{2}[V(x)+V(x')] - V_{\rm ref}\right)\Delta\tau}. \tag{16.83}
\]
From the form of (16.82) and (16.83), we can see that the diffusion quantum Monte Carlo method is similar to the random walk algorithm discussed in Section 16.8. An implementation of the diffusion quantum Monte Carlo method in one dimension can be summarized as follows:

1. Begin with a set of N0 random walkers. There is no lattice, so the positions of the walkers are continuous. It is advantageous to choose the walkers so that they are in regions of space where the wave function is known to be large.

2. Choose one of the walkers and displace it from x to x′. The new position is chosen from a Gaussian distribution with variance 2D∆τ and zero mean. This change corresponds to the diffusion process given by (16.82).

3. Weight the configuration x′ by
\[
w(x\to x',\Delta\tau) = e^{-\left(\frac{1}{2}[V(x)+V(x')] - V_{\rm ref}\right)\Delta\tau}. \tag{16.84}
\]
One way to do this weighting is to generate duplicate random walkers at x′. For example, if w ≈ 2, we would have two walkers at x′ where previously there had been one. To implement this weighting (branching) correctly, we must make an integer number of copies that is equal on average to the number w. A simple way to do so is to take the integer part of w + r, where r is a uniform random number in the unit interval. The number of copies can be any nonnegative integer, including zero. The latter value corresponds to the removal of a walker.

4. Repeat steps 2 and 3 for all members of the ensemble, thereby creating a new ensemble at a later time ∆τ. One iteration of the ensemble is equivalent to performing the integration
\[
\Psi(x,\tau) = \int G(x,x',\Delta\tau)\,\Psi(x',\tau-\Delta\tau)\,dx'. \tag{16.85}
\]

5. The quantity of interest, Ψ(x,τ), will be independent of the original ensemble Ψ(x,0) if a sufficient number of Monte Carlo steps are taken. As before, we must ensure that N(τ), the number of walkers at time τ, is kept close to the desired number N0.
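The following minimal sketch (ours, not the text's program) implements one iteration of steps 2 and 3 for the entire ensemble, in units hbar = m = 1 so that D = 1/2, with the harmonic oscillator potential as an example.

import java.util.ArrayList;
import java.util.Random;

public class DQMCSweep {
  static double D = 0.5; // D = hbar^2/2m with hbar = m = 1

  static double potential(double x) {
    return 0.5*x*x; // harmonic oscillator as an example
  }

  // One sweep over the ensemble: diffusion (step 2) plus branching (step 3).
  static ArrayList<Double> sweep(ArrayList<Double> walkers, double dtau,
                                 double vref, Random rng) {
    ArrayList<Double> next = new ArrayList<Double>();
    for(double xOld : walkers) {
      double xNew = xOld+Math.sqrt(2*D*dtau)*rng.nextGaussian(); // Gaussian move
      double w = Math.exp(-(0.5*(potential(xOld)+potential(xNew))-vref)*dtau);
      int copies = (int) (w+rng.nextDouble()); // on average w copies; 0 deletes
      for(int c = 0; c<copies; c++) {
        next.add(xNew);
      }
    }
    return next;
  }
}

After each sweep the reference energy vref would be adjusted, as in (16.72), to keep the ensemble size near N0.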
Now we can understand how the simple random walk algorithm discussed in Section 16.8 is an approximation to the diffusion quantum Monte Carlo algorithm. First, the Gaussian distribution gives the exact distribution for the displacement of a random walker in a time ∆τ, in contrast to the fixed step size of the simple random walk algorithm, which gives only the average displacement of a walker. Hence, there are no systematic errors due to the finite step size. Second, if we expand the exponential in (16.83) to first order in ∆τ and set V(x) = V(x′), we obtain the branching rule used previously. (We use the fact that the uniform distribution of r is the same as the distribution of 1 − r.) However, the diffusion quantum Monte Carlo algorithm is not exact because the branching is independent of the position reached by diffusion, which is true only in the limit ∆τ → 0. This limitation is remedied in the Green's function Monte Carlo method, where a short time approximation is not made (see the articles on Green's function Monte Carlo in the references).

One limitation of the two random walk methods we have discussed is that they can become very inefficient. This inefficiency is due in part to the branching process. If the potential becomes large and negative (as it is for the Coulomb potential when an electron approaches a nucleus), the number of copies of a walker can become very large. It is possible to improve the efficiency of these algorithms by introducing an importance sampling method. The idea is to use an initial guess ΨT(x) for the wave function to guide the walkers toward the more important regions of V(x). To implement this idea, we introduce the function f(x,τ) = Ψ(x,τ)ΨT(x). If we calculate the quantity \(\partial f/\partial\tau - D\,\partial^2 f/\partial x^2\) and use (16.65), we can show that f(x,τ) satisfies the differential equation
\[
\frac{\partial f}{\partial\tau} = D\,\frac{\partial^2 f}{\partial x^2} - D\,\frac{\partial\,[f\,F(x)]}{\partial x} - \big[E_L(x) - V_{\rm ref}\big]\,f, \tag{16.86}
\]
where
\[
F(x) = \frac{2}{\Psi_T}\,\frac{\partial\Psi_T}{\partial x}, \tag{16.87}
\]
and the local energy E_L(x) is given by
\[
E_L(x) = \frac{\hat H\Psi_T}{\Psi_T} = V(x) - \frac{D}{\Psi_T}\,\frac{\partial^2\Psi_T}{\partial x^2}. \tag{16.88}
\]
The term in (16.86) containing F corresponds to a drift of the walkers away from regions where |ΨT|² is small (see Problem 7.43). To incorporate the drift term into G_diffusion, we replace (x − x′)² in (16.82) by the term \([x - x' - D\Delta\tau F(x')]^2\), so that the diffusion propagator becomes
\[
G_{\rm diffusion}(x,x',\Delta\tau) = (4\pi D\Delta\tau)^{-1/2}\,e^{-\left(x-x'-D\Delta\tau F(x')\right)^2/4D\Delta\tau}. \tag{16.89}
\]
However, this replacement destroys the symmetry between x and x′. To restore it, we use the Metropolis algorithm for accepting the new position of a walker. The acceptance probability p is given by
\[
p = \frac{|\Psi_T(x')|^2\,G_{\rm diffusion}(x,x',\Delta\tau)}{|\Psi_T(x)|^2\,G_{\rm diffusion}(x',x,\Delta\tau)}. \tag{16.90}
\]
If p > 1, we accept the move; otherwise, we accept the move if r ≤ p. The branching step is achieved by using (16.83) with V(x) + V(x′) replaced by E_L(x) + E_L(x′) and ∆τ replaced by an effective time step. The reason for the use of an effective time step in (16.83) is that some diffusion steps are rejected. The effective time step to be used in (16.83) is found by multiplying ∆τ by the average acceptance probability. It can be shown (see Hammond et al.) that the mean value of the local energy is an unbiased estimator of the ground state energy.

Another possible improvement is to periodically replace branching (which changes the number of walkers) with a weighting of the walkers. At each weighting step, each walker is weighted by G_branch, and the total number of walkers remains constant. After n steps, the kth walker receives a weight
\[
W_k = \prod_{i=1}^{n} G_{\rm branch}^{(i,k)},
\]
where \(G_{\rm branch}^{(i,k)}\) is the branching factor of the kth walker at the ith time step. The contribution of the kth walker to any average quantity is weighted by W_k.
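As an illustration (ours, not from the text), the following sketch implements the drifted proposal (16.89) and the acceptance probability (16.90) for the trial function ΨT = e^(−λx²), for which (16.87) gives F(x) = −4λx. Units are hbar = m = 1.

public class DriftedMove {
  static double D = 0.5, lambda = 0.5, dtau = 0.01;
  static java.util.Random rng = new java.util.Random();

  static double F(double x) {
    return -4*lambda*x;                  // quantum force, eq. (16.87)
  }

  static double psiT2(double x) {
    return Math.exp(-2*lambda*x*x);      // |Psi_T|^2
  }

  // Log of the drifted Gaussian kernel (16.89), up to the common normalization.
  static double logG(double x, double xp) {
    double d = x-xp-D*dtau*F(xp);
    return -d*d/(4*D*dtau);
  }

  // Propose a drifted move from x and accept or reject it via (16.90).
  static double step(double x) {
    double xp = x+D*dtau*F(x)+Math.sqrt(2*D*dtau)*rng.nextGaussian();
    double p = psiT2(xp)/psiT2(x)*Math.exp(logG(x, xp)-logG(xp, x));
    return (p>=1 || rng.nextDouble()<=p) ? xp : x;
  }
}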
Problem 16.29. Diffusion quantum Monte Carlo

(a) Modify QMWalkApp to implement the diffusion quantum Monte Carlo method for the systems considered in Problems 16.26 and 16.27. Begin with N0 = 100 walkers and ∆τ = 0.01. Use at least three values of ∆τ and extrapolate your results to ∆τ → 0. Reasonable results can be obtained by adjusting the reference energy every 20 Monte Carlo steps with a = 0.1.

(b) Write a program to apply the diffusion quantum Monte Carlo method to the hydrogen atom. In this case a configuration is represented by three coordinates.

(c)* Modify your program to include weights in addition to changing walker populations. Redo part (a) and compare your results.

*Problem 16.30. Importance sampling

(a) Derive the partial differential equation (16.86) for f(x,τ).

(b) Modify QMWalkApp to implement the diffusion quantum Monte Carlo method with importance sampling. Consider the harmonic oscillator problem with the trial wave function ΨT = e^(−λx²). Compute the statistical error associated with the ground state energy as a function of λ. How much variance reduction can you achieve relative to the naive diffusion quantum Monte Carlo method? Then consider a form of ΨT that is not identical to the exact ground state. Try the hydrogen atom with ΨT = e^(−λr).

16.10 Path Integral Quantum Monte Carlo

The Monte Carlo methods we have discussed so far are primarily useful for estimating the ground state energy and wave function, although it is also possible to find the first few excited states with some effort. In this section we discuss a Monte Carlo method that is of particular interest for computing the thermal properties of quantum systems.

We recall (see Section 7.10) that classical mechanics can be formulated in terms of the principle of least action. That is, given two points in space-time, a classical particle chooses the path that minimizes the action given by
\[
S = \int_{x_0,0}^{x,t} L\,dt. \tag{16.91}
\]
The Lagrangian L is given by L = T − V. Quantum mechanics also can be formulated in terms of the action (cf. Feynman and Hibbs). The result of this path integral formalism is that the real-time propagator G can be expressed as
\[
G(x,x_0,t) = A\sum_{\rm paths} e^{iS/\hbar}, \tag{16.92}
\]
where A is a normalization factor. The sum in (16.92) is over all paths between (x0,0) and (x,t), not just the path that minimizes the classical action. The presence of the imaginary number i in (16.92) leads to interference effects. As before, the propagator G(x,x0,t) can be interpreted as the probability amplitude for a particle to be at x at time t given that it was at x0 at time zero. G satisfies the equation [see (16.75)]
\[
\Psi(x,t) = \int G(x,x_0,t)\,\Psi(x_0,0)\,dx_0 \qquad (t>0). \tag{16.93}
\]
Because G satisfies the same differential equation as Ψ in both x and x0, G can be expressed as
\[
G(x,x_0,t) = \sum_n \phi_n(x)\,\phi_n(x_0)\,e^{-iE_nt/\hbar}, \tag{16.94}
\]
where the φn are the eigenstates of H. For simplicity, we set \(\hbar = 1\) in the following. As before, we substitute τ = it into (16.94) and obtain
\[
G(x,x_0,\tau) = \sum_n \phi_n(x)\,\phi_n(x_0)\,e^{-\tau E_n}. \tag{16.95}
\]
We first consider the ground state. In the limit τ → ∞, we have
\[
G(x,x,\tau) \to \phi_0(x)^2\,e^{-\tau E_0} \qquad (\tau\to\infty). \tag{16.96}
\]
From the form of (16.96) and (16.92), we see that we need to compute G, and hence S, to estimate the properties of the ground state. To compute S, we convert the integral in (16.91) to a sum. The Lagrangian for a single particle of unit mass in terms of τ becomes
\[
L = -\frac{1}{2}\Big(\frac{dx}{d\tau}\Big)^2 - V(x) = -E. \tag{16.97}
\]
We divide the imaginary time interval τ into N equal steps of size ∆τ and write E as
\[
E(x_j,\tau_j) = \frac{1}{2}\,\frac{(x_{j+1}-x_j)^2}{(\Delta\tau)^2} + V(x_j), \tag{16.98}
\]
where τj = j∆τ, and xj is the corresponding displacement. Because dt = −i dτ and L = −E, the action becomes
\[
S = i\,\Delta\tau\sum_{j=0}^{N-1}E(x_j,\tau_j) = i\,\Delta\tau\sum_{j=0}^{N-1}\Big[\frac{1}{2}\,\frac{(x_{j+1}-x_j)^2}{(\Delta\tau)^2} + V(x_j)\Big], \tag{16.99}
\]
and the probability amplitude for the path becomes
\[
e^{iS} = e^{-\Delta\tau\sum_{j=0}^{N-1}\left[\frac{1}{2}(x_{j+1}-x_j)^2/(\Delta\tau)^2 + V(x_j)\right]}. \tag{16.100}
\]
Hence, the propagator G(x,x0,N∆τ) can be expressed as
\[
G(x,x_0,N\Delta\tau) = A\int dx_1\cdots dx_{N-1}\;e^{-\Delta\tau\sum_{j=0}^{N-1}\left[\frac{1}{2}(x_{j+1}-x_j)^2/(\Delta\tau)^2 + V(x_j)\right]}, \tag{16.101}
\]
where x ≡ xN and A is an unimportant constant. From (16.101) we see that G(x,x0,N∆τ) has been expressed as a multidimensional integral with the displacement variable xj associated with the time τj. The sequence x0, x1, ..., xN defines a possible path, and the integral in (16.101) is over all paths. Because the quantity of interest is G(x,x,N∆τ) [see (16.96)], we adopt the periodic boundary condition xN = x0. The choice of x in the argument of G is arbitrary for finding the ground state energy, and the use of periodic boundary conditions implies that no point in the closed path is unique. It is thus possible (and convenient) to rewrite (16.101) by letting the sum over j go from 1 to N:
\[
G(x_0,x_0,N\Delta\tau) = A\int dx_1\cdots dx_{N-1}\;e^{-\Delta\tau\sum_{j=1}^{N}\left[\frac{1}{2}(x_j-x_{j-1})^2/(\Delta\tau)^2 + V(x_j)\right]}, \tag{16.102}
\]
where we have written x0 instead of x because the xj that is not integrated over is xN = x0.

The result of this analysis is to convert a quantum mechanical problem for a single particle into a statistical mechanics problem for N "atoms" on a ring connected by nearest neighbor "springs" with spring constant 1/(∆τ)². The label j denotes the order of the atoms in the ring. Note that the form of (16.102) is similar to the form of the Boltzmann distribution. Because the partition function for a single quantum mechanical particle contains terms of the form \(e^{-\beta E_n}\), and (16.95) contains terms proportional to \(e^{-\tau E_n}\), we make the correspondence β = τ = N∆τ. We shall see in the following how we can use this identity to simulate a quantum system at a finite temperature.

We can use the Metropolis algorithm to simulate the motion of N "atoms" on a ring. Of course, these atoms are a product of our analysis, just as were the random walkers we introduced in diffusion Monte Carlo, and should not be confused with real particles. A possible path integral algorithm can be summarized as follows:

1. Choose N and ∆τ such that N∆τ ≫ 1 (the zero temperature limit). Also choose δ, the maximum trial change in the displacement of an atom, and mcs, the total number of Monte Carlo steps per atom.

2. Choose an initial configuration for the displacements xj that is close to the approximate shape of the ground state probability amplitude.

3. Choose an atom j at random and make a trial displacement \(x_j \to x_j' = x_j + (2r-1)\delta\), where r is a uniform random number in the unit interval. Compute the change ∆E in the energy E, where
\[
\Delta E = \frac{1}{2}\Big(\frac{x_{j+1}-x_j'}{\Delta\tau}\Big)^2 + \frac{1}{2}\Big(\frac{x_j'-x_{j-1}}{\Delta\tau}\Big)^2 + V(x_j') - \frac{1}{2}\Big(\frac{x_{j+1}-x_j}{\Delta\tau}\Big)^2 - \frac{1}{2}\Big(\frac{x_j-x_{j-1}}{\Delta\tau}\Big)^2 - V(x_j). \tag{16.103}
\]
If ∆E < 0, accept the change; otherwise, compute the probability \(p = e^{-\Delta\tau\,\Delta E}\) and a random number r in the unit interval. If r ≤ p, accept the move; otherwise reject the trial move. (A sketch of this trial move follows the list.)

4. Divide the possible x values into equal size bins of width ∆x. Update P(x); that is, let P(x = xj) → P(x = xj) + 1, where xj is the displacement of the atom chosen in step 3 after step 3 is completed. Do this update even if the trial move was rejected.

5. Repeat steps 3 and 4 until a sufficient number of Monte Carlo steps per atom has been obtained. (Do not take data until the memory of the initial path is lost and the system has reached "equilibrium.") Normalize the probability density P(x) by dividing by the product of N and mcs.
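The following minimal sketch (ours, not the text's program) implements the trial move of step 3 for a ring of N atoms with periodic boundary conditions, using the harmonic oscillator potential as an example.

public class PathMove {
  static java.util.Random rng = new java.util.Random();

  static double V(double x) {
    return 0.5*x*x; // harmonic oscillator as an example
  }

  // Attempt to displace atom j of the ring path x; returns true if accepted.
  static boolean tryMove(double[] x, int j, double delta, double dtau) {
    int N = x.length;
    double xTrial = x[j]+(2*rng.nextDouble()-1)*delta;
    double xm = x[(j-1+N)%N], xp = x[(j+1)%N]; // ring neighbors
    double kTrial = 0.5*((xp-xTrial)*(xp-xTrial)+(xTrial-xm)*(xTrial-xm))/(dtau*dtau);
    double kOld = 0.5*((xp-x[j])*(xp-x[j])+(x[j]-xm)*(x[j]-xm))/(dtau*dtau);
    double dE = kTrial+V(xTrial)-kOld-V(x[j]); // energy change, eq. (16.103)
    if(dE<0 || rng.nextDouble()<=Math.exp(-dtau*dE)) {
      x[j] = xTrial; // accept the trial displacement
      return true;
    }
    return false;    // reject; the old displacement is kept
  }
}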
The ground state energy E0 is given by
\[
E_0 = \sum_x P(x)\,[T(x)+V(x)], \tag{16.104}
\]
where T(x) is the kinetic energy as determined from the virial theorem,
\[
2\,T(x) = x\,\frac{dV}{dx}, \tag{16.105}
\]
which is discussed in many texts (see Griffiths, for example). It is also possible to compute T from averages over (xj − xj−1)², but the virial theorem yields a smaller variance. The ground state wave function φ0(x) is obtained from the normalized probability P(x)∆x by dividing by ∆x and taking the square root.
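The following fragment (ours, not from the text) illustrates the estimator (16.104) for the harmonic oscillator, for which the virial theorem (16.105) gives T(x) = x²/2; the array P holds the accumulated, not yet normalized, histogram of displacements.

public class PathEnergy {
  // Estimate E0 from the histogram P of displacements using (16.104); for
  // V(x) = x^2/2, x dV/dx = x^2, so T(x) = x^2/2 and T(x)+V(x) = x^2.
  static double groundStateEnergy(double[] P, double[] binCenters) {
    double norm = 0, e0 = 0;
    for(int b = 0; b<P.length; b++) {
      double x = binCenters[b];
      e0 += P[b]*x*x; // P(x) [T(x)+V(x)] for the harmonic oscillator
      norm += P[b];
    }
    return e0/norm;   // dividing by the total count normalizes P(x)
  }
}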
We can also find the thermodynamic properties of a particle that is connected to a heat bath at temperature T = 1/β by not taking the β = N∆τ → ∞ limit. To obtain the ground state, which corresponds to the zero temperature limit (β ≫ 1), we have to make N∆τ as large as possible. However, we need ∆τ to be as small as possible to approximate the continuum time limit. Hence, to obtain the ground state we need a large number of time intervals N. For a finite temperature simulation, we can use smaller values of N for the same level of accuracy as the zero temperature simulation.

The path integral method is very flexible and can be generalized to higher dimensions and many mutually interacting particles. For three dimensions, xj is replaced by the three-dimensional displacement rj. Each real particle is represented by a ring of N "atoms" with a spring-like potential connecting the atoms within a ring. Each atom in each ring also interacts with the atoms in the other rings through an interparticle potential. If the quantum system is a fluid where indistinguishability is important, then we must consider the effect of exchange by treating the quantum system as a classical polymer system where the "atoms" represent the monomers of a polymer, and where polymers can split up and reform. Chandler and Wolynes discuss how the quantum mechanical effects due to exchanging identical particles can be associated with the chemical equilibrium of the polymers. They also discuss Bose condensation using path integral techniques.

Problem 16.31. Path integral calculation

(a) Write a program to implement the path integral algorithm for the one-dimensional harmonic oscillator potential with V(x) = x²/2. Use the structure of your Monte Carlo Lennard–Jones program from Chapter 15 as a guide.

(b) Let N∆τ = 15 and consider N = 10, 20, 40, and 80. Equilibrate for at least 2000 Monte Carlo steps per atom and average over at least 5000 mcs. Compare your results with the exact ground state energy E0 = 0.5. Estimate the equilibration time for your calculation. What is a good initial configuration? Improve your results by using larger values of N∆τ.

(c) Find the mean energy E of the harmonic oscillator at the temperature T determined by β = N∆τ. Find E for β = 1, 2, and 3 and compare it with the exact result \(E = \frac{1}{2}\coth(\beta/2)\).

(d) Repeat the above calculations for the Morse potential V(x) = 2(1 − e^(−x))².

16.11 Projects

Many of the techniques described in this chapter can be extended to two-dimensional quantum systems. The Complex2DFrame tool in the frames package is designed to show two-dimensional complex scalar fields such as quantum wave functions. Listing 16.13 in Appendix 16A shows how this class is used to display a two-dimensional Gaussian wave packet with a momentum boost.

Project 16.32. Separable systems in two dimensions

The shooting method is inappropriate for the calculation of eigenstates and eigenvalues in two or more dimensions with arbitrary potential energy functions V(r). However, the special case of separable potentials can be reduced to several one-dimensional problems that can be solved using the numerical methods described in this chapter. Many molecular modeling programs use the Hartree–Fock self-consistent field approximation to model nonseparable systems as a set of one-dimensional problems. Recently, there has been significant progress motivated by a molecular dynamics algorithm developed by Car and Parrinello.

Write a two-dimensional eigenstate class Eigenstate2d that calculates eigenstates and eigenvalues for a separable potential of the form
\[
V(x,y) = V_1(x) + V_2(y). \tag{16.106}
\]
Test this class using the known analytic solutions for the two-dimensional rectangular box and the two-dimensional harmonic oscillator. Use this class to model the evolution of superposition states. Under what conditions are there wave function revivals?

Project 16.33. Excited state wave functions using quantum Monte Carlo

Quantum Monte Carlo methods can be extended to compute excited state wave functions using a Gram–Schmidt procedure to ensure that each excited state is orthogonal to all lower lying states (see Roy et al.). A quantum Monte Carlo method is used to compute the ground state wave function. A trial wave function for the first excited state is then selected, and the ground state component is subtracted from the trial wave function. This subtraction is repeated after every iteration of the Monte Carlo algorithm. Because the excited states decay as \(e^{-(E_j-E_0)\tau}\), the lowest remaining excited state dominates the remaining wave function. After the first excited state is obtained, the second excited state is computed by subtracting both known states from the trial wave function. This process is repeated to obtain additional wave functions.

Implement this procedure to find the first few excited state wave functions for the one-dimensional harmonic oscillator. Then consider the one-dimensional double-well oscillator
\[
V(x) = -\frac{1}{2}kx^2 + a_3x^3 + a_4x^4, \tag{16.107}
\]
with k = 40, a3 = 1, and a4 = 1.

Project 16.34. Quantum Monte Carlo in two dimensions

The procedure described in Project 16.33 can be used to compute two-dimensional wave functions (see Roy et al.).

(a) Test your program using a separable two-dimensional double-well potential.

(b) Find the first few excited states for the two-dimensional double-well potential
\[
V(x,y) = -\frac{1}{2}k_xx^2 - \frac{1}{2}k_yy^2 + \frac{1}{2}\big(a_{xx}x^4 + 2a_{xy}x^2y^2 + a_{yy}y^4\big), \tag{16.108}
\]
with kx = ky = 20 and axx = ayy = axy = 5. Repeat with kx = ky = 20 and axx = ayy = axy = 1.

Project 16.35. Evolution of a wave packet in two dimensions

Both the half-step and split-operator algorithms can be extended to model the evolution of two-dimensional systems with arbitrary potentials V(x,y). (See Numerical Recipes for how the FFT algorithm is extended to more dimensions.) Implement either algorithm and model a wave packet scattering from a central barrier and a wave packet passing through a double slit. A clever way to ensure stability in the half-step algorithm is to use a boolean array to tag grid locations where the solution becomes unstable and to set the wave function to zero at these grid points:

double minV = -2/dt;
double maxVx = 2/dt-2/(dx*dx);
double maxVy = 2/dt-2/(dy*dy);
double maxV = Math.min(maxVx, maxVy);
for(int i = 0, n = potential.length; i<n; i++) {
  for(int j = 0, m = potential[0].length; j<m; j++) {
    if(potential[i][j]>=minV && potential[i][j]<=maxV) {
      stable[i][j] = true;  // stable
    } else {
      stable[i][j] = false; // unstable; wave function set to zero here
    }
  }
}
Project 16.36. Two-particle system

Rubin Landau has studied the time dependence of two particles interacting in one dimension with a potential that depends on their relative separation:
\[
V(x_1,x_2) = V_0\,e^{-(x_1-x_2)^2/2\alpha^2}. \tag{16.109}
\]
Model a scattering experiment for particles having momenta p1 and p2 by assuming the following (unnormalized) initial wave function:
\[
\Psi(x_1,x_2) = e^{ip_1x_1}e^{-(x_1-a)^2/4w^2}\,e^{ip_2x_2}e^{-(x_2+a)^2/4w^2}, \tag{16.110}
\]
where 2a is the initial separation and w is the variance of each particle's position. Do the particles bounce off each other when the interaction is repulsive? What happens when the interaction is attractive?

Figure 16.2: Two representations of complex wave functions: (a) real and imaginary parts, and (b) amplitude and phase. (The actual output is in color.)

Appendix 16A: Visualizing Complex Functions

Complex functions are essential in quantum mechanics, and the frames package contains classes for displaying and analyzing these functions. Listing 16.12 uses a ComplexPlotFrame to display a one-dimensional wave function.

Listing 16.12: The ComplexPlotFrameApp class displays a one-dimensional Gaussian wave packet with a momentum boost.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.frames.ComplexPlotFrame;

public class ComplexPlotFrameApp {
  public static void main(String[] args) {
    ComplexPlotFrame frame = new ComplexPlotFrame("x", "Psi(x)", "Complex function");
    int n = 128;
    double xmin = -Math.PI, xmax = Math.PI;
    double x = xmin, dx = (xmax-xmin)/n;
    double[] xdata = new double[n];
    double[] zdata = new double[2*n]; // real and imaginary values alternate
    int mode = 4; // test function is e^(-x*x/4) e^(i*mode*x) for x = [-pi, pi)
    for(int i = 0; i<n; i++) {
      xdata[i] = x;
      double amplitude = Math.exp(-x*x/4);
      zdata[2*i] = amplitude*Math.cos(mode*x);   // real part
      zdata[2*i+1] = amplitude*Math.sin(mode*x); // imaginary part
      x += dx;
    }
    frame.append(xdata, zdata);
    frame.setVisible(true);
  }
}

References and Suggestions for Further Reading

The ALPS project, <http://alps.comp-phys.org>, has open source simulation programs for strongly correlated quantum mechanical systems and C++ libraries for simplifying the development of such code. Although most of the code is beyond the level of this text, this open source project is another example of software for use in both research and education.

J. B. Anderson, “A random walk simulation of the Schrödinger equation: H3+,” J. Chem. Phys. 63, 1499–1503 (1975); “Quantum chemistry by random walk. H 2P, H3+ D3h 1A1′, H2 3Σu+, H4 1Σg+, Be 1S,” J. Chem. Phys. 65, 4121–4127 (1976); “Quantum chemistry by random walk: Higher accuracy,” J. Chem. Phys. 73, 3897–3899 (1980). These papers describe the random walk method, extensions for improved accuracy, and applications to simple molecules.

G. Baym, Lectures on Quantum Mechanics, Westview Press (1990). A discussion of the Schrödinger equation in imaginary time is given in Chapter 3.

M. A. Belloni, W. Christian, and A. Cox, Physlet Quantum Physics, Prentice Hall (2006). This book contains interactive exercises using Java applets to visualize quantum phenomena.

H. A. Bethe, Intermediate Quantum Mechanics, Westview Press (1997). Applications of quantum mechanics to atomic systems are discussed.

Jay S. Bolemon, “Computer solutions to a realistic ‘one-dimensional’ Schrödinger equation,” Am. J. Phys. 40, 1511 (1972).
Siegmund Brandt and Hans Dieter Dahmen, The Picture Book of Quantum Mechanics, third edition, Springer-Verlag (2001); Siegmund Brandt, Hans Dieter Dahmen, and Tilo Stroh, Interactive Quantum Mechanics, Springer-Verlag (2003). These books show computer generated pictures of quantum wave functions in different contexts.

R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett. 55, 2471 (1985).

David M. Ceperley, “Path integrals in the theory of condensed helium,” Rev. Mod. Phys. 67, 279–355 (1995).

David M. Ceperley and Berni J. Alder, “Quantum Monte Carlo,” Science 231, 555 (1986). A survey of some of the applications of quantum Monte Carlo methods to physics and chemistry.

David Chandler and Peter G. Wolynes, “Exploiting the isomorphism between quantum theory and classical statistical mechanics of polyatomic fluids,” J. Chem. Phys. 74, 4078–4095 (1981). The authors use path integral techniques to look at multiparticle quantum systems.

D. F. Coker and R. O. Watts, “Quantum simulation of systems with nodal surfaces,” Mol. Phys. 58, 1113–1123 (1986).

Jim Doll and David L. Freeman, “Monte Carlo methods in chemistry,” Computing in Science and Engineering 1 (1), 22–32 (1994).

Robert M. Eisberg and Robert Resnick, Quantum Physics, second edition, John Wiley & Sons (1985). See Appendix G for a discussion of the numerical solution of Schrödinger’s equation.

R. P. Feynman, “Simulating physics with computers,” Int. J. Theor. Phys. 21, 467–488 (1982). A provocative discussion of the intrinsic difficulties of simulating quantum systems. See also R. P. Feynman, Feynman Lectures on Computation, Westview Press (1996).

Richard P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw–Hill (1965).

David J. Griffiths, Introduction to Quantum Mechanics, second edition, Prentice Hall (2005). An excellent undergraduate text that discusses the virial theorem in several problems.

B. L. Hammond, W. A. Lester Jr., and P. J. Reynolds, Monte Carlo Methods in Ab Initio Quantum Chemistry, World Scientific (1994). An excellent book on quantum Monte Carlo methods.

Steven E. Koonin and Dawn C. Meredith, Computational Physics, Addison–Wesley (1990). Solutions of the time-dependent Schrödinger equation are discussed in the context of parabolic partial differential equations in Chapter 7. Chapter 8 discusses Green’s function Monte Carlo methods.

Rubin Landau, “Two-particle Schrödinger equation animations of wavepacket-wavepacket scattering,” Am. J. Phys. 68 (12), 1113–1119 (2000).

Michel Le Bellac, Fabrice Mortessagne, and G. George Batrouni, Equilibrium and Non-Equilibrium Statistical Thermodynamics, Cambridge University Press (2004). Chapter 7 of this graduate level text discusses the world line algorithm for bosons and fermions on a lattice.

M. A. Lee and K. E. Schmidt, “Green’s function Monte Carlo,” Computers in Physics 6 (2), 192 (1992). A short and clear explanation of Green’s function Monte Carlo.

P. K. MacKeown, “Evaluation of Feynman path integrals by Monte Carlo methods,” Am. J. Phys. 53, 880–885 (1985). The author discusses projects suitable for an advanced undergraduate course. Also see P. K. MacKeown and D. J. Newman, Computational Techniques in Physics, Adam Hilger (1987).

Jean Potvin, “Computational quantum field theory. Part II: Lattice gauge theory,” Computers in Physics 8, 170 (1994); and “Computational quantum field theory,” Computers in Physics 7, 149 (1993).
William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, second edition, Cambridge University Press (1992). The numerical solution of the time-dependent Schrödinger equation is discussed in Chapter 19.

Peter J. Reynolds, David M. Ceperley, Berni J. Alder, and William A. Lester Jr., “Fixed-node quantum Monte Carlo for molecules,” J. Chem. Phys. 77, 5593–5603 (1982). This paper describes a random walk algorithm for use in molecular applications, including importance sampling and the treatment of Fermi statistics.

P. J. Reynolds, J. Tobochnik, and H. Gould, “Diffusion quantum Monte Carlo,” Computers in Physics 4 (6), 882 (1990).

U. Rothlisberger, “15 years of Car–Parrinello simulations in physics, chemistry and biology,” in Computational Chemistry: Reviews of Current Trends, edited by Jerzy Leszczynski, World Scientific (2001), Vol. 6.

Amlan K. Roy, Neetu Gupta, and B. M. Deb, “Time-dependent quantum mechanical calculation of ground and excited states of anharmonic and double-well oscillators,” Phys. Rev. A 65, 012109-1–7 (2001).

Amlan K. Roy, Ajit J. Thakkar, and B. M. Deb, “Low-lying states of two-dimensional double-well potentials,” J. Phys. A 38, 2189–2199 (2005).

K. E. Schmidt, Parhat Niyaz, A. Vaught, and Michael A. Lee, “Green’s function Monte Carlo method with exact imaginary-time propagation,” Phys. Rev. E 71, 016707-1–17 (2005).

Bernd Thaller, Visual Quantum Mechanics: Selected Topics with Computer-Generated Animations of Quantum-Mechanical Phenomena, Telos (2000); Bernd Thaller, Advanced Visual Quantum Mechanics, Springer (2005).

J. Tobochnik, H. Gould, and K. Mulder, “An introduction to quantum Monte Carlo,” Computers in Physics 4 (4), 431 (1990). An explanation of the path integral method applied to one particle.

P. B. Visscher, “A fast explicit algorithm for the time-dependent Schrödinger equation,” Computers in Physics 5 (6), 596 (1991).

Chapter 17 Visualization and Rigid Body Dynamics

We study affine transformations in order to visualize objects in three dimensions. We then solve Euler's equations of motion for rigid body dynamics using the quaternion representation of rotations.

17.1 Two-Dimensional Transformations

Physicists frequently use transformations to convert from one system of coordinates to another. A very common transformation is an affine transformation, which has the ability to rotate, scale, stretch, skew, and translate an object. Such a transformation maps straight lines to straight lines. Affine transformations are often represented using matrices and are manipulated using the tools of linear algebra, such as matrix multiplication and matrix inversion.

The Java 2D API defines a set of classes designed to create high quality graphics using image composition, image processing, antialiasing, and text layout. Because linear algebra and affine transformations are used extensively in imaging and drawing APIs, we begin our study of two- and three-dimensional visualization techniques by studying the properties of transformations.

It is straightforward to rotate a point (x,y) about the origin by an angle θ (see Figure 17.1) or to scale its distance from the origin by (sx,sy) using matrices:
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \tag{17.1}
\]
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \tag{17.2}
\]
Performing several transformations corresponds to multiplying matrices. However, the translation of the point (x,y) by (dx,dy) is treated as an addition, not a multiplication, and must be written differently:
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} d_x \\ d_y \end{pmatrix} + \begin{pmatrix} x \\ y \end{pmatrix}. \tag{17.3}
\]
Figure 17.1: A two-dimensional rotation of a point (x,y) produces a point with new coordinates (x′,y′) as computed according to (17.1).

This inconsistency in the type of mathematical operation is easily overcome if points are expressed in terms of homogeneous coordinates by adding a third coordinate w. Homogeneous coordinates are used extensively in computer graphics to treat all transformations consistently. Instead of representing a point in two dimensions by a pair of numbers (x,y), each point is represented by a triple (x,y,w). Because two homogeneous coordinates represent the same point if one is a multiple of the other, we usually homogenize the point by dividing the x–y coordinates by w and write the coordinates in the form (x,y,1). (The w coordinate can be used to add perspective; see Foley et al.) Using homogeneous coordinates, an arbitrary affine transformation can be written as
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}. \tag{17.4}
\]
A translation, for example, can be expressed as
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & d_x \\ 0 & 1 & d_y \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}. \tag{17.5}
\]

Exercise 17.1. Homogeneous coordinates

(a) How are the rotation and scaling transformations expressed in matrix notation using homogeneous coordinates? Sketch the transformation matrices for a 30° clockwise rotation and for a scaling along the x-axis by a factor of two, and then write the transformation matrices. Do these matrices commute?

(b) Describe the effect of the affine transformation
\[
\begin{pmatrix} 1 & 0.2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{17.6}
\]

Exercise 17.1 shows that a coordinate transformation can be broken into parts using a block matrix format:
\[
\begin{pmatrix} \mathcal{A} & \mathbf{d}^T \\ \mathbf{0} & 1 \end{pmatrix}. \tag{17.7}
\]
We will use boldface for row vectors such as 0 and d and calligraphic symbols to represent matrices. The translation vector d is transposed to convert it to a column vector. The upper left-hand submatrix \(\mathcal{A}\) produces rotation and scaling, while the vector d = [dx, dy] produces translation.

Homogeneous coordinates have another advantage because they can be used to distinguish between points and vectors. Unlike points, vectors should remain invariant under translation. To transform vectors using the same transformation matrices that we use to transform points, we set w to zero, thereby removing the effect of the last column. Note that the difference between two homogeneous points produces a w equal to zero. The elimination of the effect of translation makes sense because the difference between two points is a displacement vector, and vectors are defined in terms of their components, not their location.
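As a check (ours, not from the text) that the matrix form (17.5) agrees with the Java 2D API introduced below, the following sketch applies the homogeneous translation matrix to a point by hand and compares the result with AffineTransform's translate instance.

import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class TranslateCheck {
  public static void main(String[] args) {
    double dx = 2, dy = -1;
    double[] p = {3, 4, 1}; // homogeneous point (x, y, w = 1)
    double[][] m = {{1, 0, dx}, {0, 1, dy}, {0, 0, 1}}; // eq. (17.5)
    double[] q = new double[3];
    for(int i = 0; i<3; i++) {
      for(int j = 0; j<3; j++) {
        q[i] += m[i][j]*p[j]; // matrix-vector product
      }
    }
    Point2D pt = AffineTransform.getTranslateInstance(dx, dy)
                                .transform(new Point2D.Double(3, 4), null);
    System.out.println(q[0]+", "+q[1]+"  vs  "+pt.getX()+", "+pt.getY());
  }
}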
The AffineTransform class in the java.awt.geom package defines two-dimensional affine transformations. Instances of this class are constructed as

AffineTransform at = new AffineTransform(
    double m00, double m10, double m01,
    double m11, double m02, double m12);

The methods in this class encapsulate most of the matrix arithmetic that is required for two-dimensional visualization. For example, there are methods to calculate a transformation's inverse and to combine transformations using the rules of matrix multiplication. There are also static methods for constructing pure rotations, scalings, and translations that require only one or two parameters.

double theta = Math.PI/6;
AffineTransform at = AffineTransform.getRotateInstance(theta);

A method such as getRotateInstance is known as a convenience method because it simplifies a complicated API. The AffineTransform class can transform geometric objects, images, and even text. The following code fragment shows how this class is used to rotate a point and a rectangle.

Point2D pt = new Point2D.Double(2.0, 3.0);
pt = AffineTransform.getRotateInstance(Math.PI/3).transform(pt, null);

Shape shape = new Rectangle2D.Double(50, 50, 100, 150);
shape = AffineTransform.getRotateInstance(Math.PI/3).createTransformedShape(shape);

The Affine2DApp class in Listing 17.1 demonstrates affine transformations by applying them to a rectangle.

Listing 17.1: The Affine2DApp class.

package org.opensourcephysics.sip.ch17;
import java.awt.*;
import java.awt.geom.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class Affine2DApp extends AbstractCalculation {
  DisplayFrame frame = new DisplayFrame("2D Affine transformation");
  RectShape rect = new RectShape();

  public void calculate() {
    double[][] matrix = new double[3][]; // allocate 3 rows but not the row elements
    matrix[0] = (double[]) control.getObject("row 0"); // set the first row
    matrix[1] = (double[]) control.getObject("row 1"); // set the second row
    matrix[2] = (double[]) control.getObject("row 2"); // set the third row
    rect.transform(matrix);
  }

  public void reset() {
    control.clearMessages();
    control.setValue("row 0", new double[] {1, 0, 0});
    control.setValue("row 1", new double[] {0, 1, 0});
    control.setValue("row 2", new double[] {0, 0, 1});
    rect = new RectShape();
    frame.clearDrawables();
    frame.addDrawable(rect);
    calculate();
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new Affine2DApp());
  }

  class RectShape implements Drawable { // inner class
    Shape shape = new Rectangle2D.Double(50, 50, 100, 100);

    public void draw(DrawingPanel panel, Graphics g) {
      Graphics2D g2 = (Graphics2D) g;
      g2.setPaint(Color.BLUE);
      g2.fill(shape);
      g2.setPaint(Color.RED);
      g2.draw(shape);
    }

    public void transform(double[][] mat) {
      shape = (new AffineTransform(mat[0][0], mat[1][0], mat[0][1],
        mat[1][1], mat[0][2], mat[1][2])).createTransformedShape(shape);
    }
  }
}

Exercise 17.2. Two-dimensional affine transformations

(a) Enter an affine transformation for a 30° clockwise rotation into Affine2DApp. About what point does the rectangle rotate? Why?

(b) Add and test a convenience method named translate to the RectShape class that takes two parameters (dx,dy). Add a custom button to invoke this method.

(c) Add and test a convenience method named rotate to the RectShape class that takes a θ parameter.

(d) An object can be rotated about its center by first translating the object to the center of rotation, performing the rotation, and then translating the object back to its original position. Implement a method that performs a rotation about the center of the rectangle by invoking a sequence of translate and rotate methods. (A sketch of this composition follows the exercise.)

(e) Affine transformations have the property that transformed parallel lines remain parallel. Demonstrate that this property is plausible by transforming a rectangle using arbitrary values for the transformation matrix.
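The following sketch (ours, not the text's solution to Exercise 17.2d) composes translate, rotate, and translate transformations to rotate a rectangle about its center, and compares the result with the anchored rotation convenience method provided by the API.

import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

public class CenterRotation {
  public static void main(String[] args) {
    Shape shape = new Rectangle2D.Double(50, 50, 100, 100);
    double cx = 100, cy = 100; // center of the rectangle
    double theta = Math.PI/6;
    AffineTransform at = AffineTransform.getTranslateInstance(cx, cy);
    at.rotate(theta);          // right-multiplies, so it acts before the translation
    at.translate(-cx, -cy);    // points are first shifted to the origin
    Shape rotated = at.createTransformedShape(shape);
    // equivalent anchored rotation provided by the API:
    Shape check = AffineTransform.getRotateInstance(theta, cx, cy)
                                 .createTransformedShape(shape);
    System.out.println(rotated.getBounds2D());
    System.out.println(check.getBounds2D()); // agrees up to floating point rounding
  }
}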
To facilitate the creation of simple geometric shapes using world coordinates, the Open Source Physics library defines the DrawableShape and InteractiveShape classes in the display package. These classes define convenience methods to create common drawable shapes whose (x,y) location is their geometric center. These shapes can later be transformed without having to instantiate new objects. The following code fragment shows how these classes are used.

// circle of radius 3 in world units located at (-1, 2)
DrawableShape circle = InteractiveShape.createCircle(-1, 2, 3);
circle.transform(AffineTransform.getShearInstance(1, 2));
frame.addDrawable(circle);
// rectangle of width 2 and height 1 centered at (3, 4)
InteractiveShape rect = InteractiveShape.createRectangle(3, 4, 2, 1);
rect.transform(new AffineTransform(2, 1, 0, 1, 0, 0));
frame.addDrawable(rect);

Because the DrawableShape and InteractiveShape classes are written using the Java 2D API, the objects that they define are fundamentally different from the objects that use the awt API introduced in Section 3.3, because Java 2D shapes can be transformed. In addition, the Java 2D API is not restricted to pixel coordinates, nor is it restricted to solid single-pixel lines.¹

¹See Chapter 4 in the Open Source Physics: A User's Guide with Examples for a more complete discussion of the DrawableShape and InteractiveShape classes.

Exercise 17.3. Open Source Physics shape classes

The Open Source Physics shape classes can be manipulated using a wide variety of linear algebra based tools. Modify the Affine2DApp program so that it instantiates and transforms a rectangular DrawableShape into a trapezoid. Test your program by repeating Exercise 17.2.
The extension of homogeneous coordinates to three dimensions is straightforward. We add a w-coordinate to the spatial coordinates to create a homogeneous point (x, y, z, 1). This point is transformed as

\[
\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix}
=
\begin{pmatrix}
m_{00} & m_{01} & m_{02} & m_{03} \\
m_{10} & m_{11} & m_{12} & m_{13} \\
m_{20} & m_{21} & m_{22} & m_{23} \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{pmatrix} R & \mathbf{d} \\ \mathbf{0}^T & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix},
\tag{17.9}
\]

where the 3 × 3 submatrix R produces the rotation and the column vector d produces the translation.

Although rotations about one of the coordinate axes are easy to derive and can be combined using the rules of linear algebra to produce an arbitrary orientation, the general case of rotation about the origin by an angle θ around an arbitrary axis r̂ can be constructed directly. The strategy is to decompose the vector v into components that are parallel and perpendicular to the direction r̂, as shown in Figure 17.2. The parallel part v∥ does not change, while the perpendicular part v⊥ undergoes a two-dimensional rotation in a plane perpendicular to r̂. The parallel part is the projection of v onto r̂,

\[
\mathbf{v}_{\parallel} = (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}, \tag{17.10}
\]

and the perpendicular part is what remains of v after we subtract the parallel part:

\[
\mathbf{v}_{\perp} = \mathbf{v} - (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}. \tag{17.11}
\]

To calculate the rotation of v⊥, we need two perpendicular basis vectors in the plane of rotation. If we use v⊥ as the first basis vector, then we can take the cross product with r̂ to produce a vector w that is guaranteed to be perpendicular to both v⊥ and r̂:

\[
\mathbf{w} = \hat{\mathbf{r}} \times \mathbf{v}_{\perp} = \hat{\mathbf{r}} \times \mathbf{v}. \tag{17.12}
\]

The rotation of v⊥ is now calculated in terms of this new basis:

\[
\mathbf{v}_{\perp}' = R(\mathbf{v}_{\perp}) = \cos\theta\,\mathbf{v}_{\perp} + \sin\theta\,\mathbf{w}. \tag{17.13}
\]

The final result is the sum of this rotated vector and the parallel part that does not change:

\begin{align}
R(\mathbf{v}) &= R(\mathbf{v}_{\perp}) + \mathbf{v}_{\parallel} \tag{17.14a} \\
&= \cos\theta\,\mathbf{v}_{\perp} + \sin\theta\,\mathbf{w} + \mathbf{v}_{\parallel} \tag{17.14b} \\
&= \cos\theta\,[\mathbf{v} - (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}] + \sin\theta\,(\hat{\mathbf{r}} \times \mathbf{v}) + (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}, \tag{17.14c}
\end{align}

or

\[
R(\mathbf{v}) = [1 - \cos\theta]\,(\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}} + \sin\theta\,(\hat{\mathbf{r}} \times \mathbf{v}) + \cos\theta\,\mathbf{v}. \tag{17.15}
\]

Equation (17.15) is known as the Rodrigues formula and provides a way of constructing rotation matrices in terms of the direction of the axis of rotation r̂ = (r_x, r_y, r_z), the cosine of the rotation angle c = cosθ, and the sine of the rotation angle s = sinθ. If we expand the vector products in (17.15), we obtain the matrix

\[
R =
\begin{pmatrix}
t r_x r_x + c & t r_x r_y - s r_z & t r_x r_z + s r_y \\
t r_x r_y + s r_z & t r_y r_y + c & t r_y r_z - s r_x \\
t r_x r_z - s r_y & t r_y r_z + s r_x & t r_z r_z + c
\end{pmatrix},
\tag{17.16}
\]

where t = 1 − cosθ. Homogeneous coordinates are transformed using

\[
\begin{pmatrix} R & \mathbf{0} \\ \mathbf{0}^T & 1 \end{pmatrix}, \tag{17.17}
\]

where the R submatrix is given in (17.16).
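The Rodrigues formula can also be applied directly to a vector without constructing the matrix (17.16). The following sketch (our own code; the class and method names are illustrative) implements (17.15) term by term:

public class RodriguesSketch {
   // rotate v by theta about the unit axis r; both are length-3 arrays
   public static double[] rotate(double[] v, double[] r, double theta) {
      double c = Math.cos(theta), s = Math.sin(theta);
      double dot = v[0]*r[0] + v[1]*r[1] + v[2]*r[2]; // v . r
      double[] w = {r[1]*v[2] - r[2]*v[1],            // w = r x v
                    r[2]*v[0] - r[0]*v[2],
                    r[0]*v[1] - r[1]*v[0]};
      double[] result = new double[3];
      for(int i = 0; i<3; i++) { // Eq. (17.15), component by component
         result[i] = (1-c)*dot*r[i] + s*w[i] + c*v[i];
      }
      return result;
   }

   public static void main(String[] args) {
      // rotate the x-axis by 90 degrees about the z-axis;
      // the result should be the y-axis
      double[] v = rotate(new double[] {1, 0, 0},
                          new double[] {0, 0, 1}, Math.PI/2);
      System.out.printf("(%.3f, %.3f, %.3f)%n", v[0], v[1], v[2]);
   }
}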
The Rotation3D class constructor (see Listing 17.2) computes the rotation matrix. The direct method uses this matrix to transform a point. Note that the point passed to this method as an argument is copied into a temporary vector and that the point's coordinates are then changed. You will define an inverse method that reverses this operation in Exercise 17.5.

Listing 17.2: The Rotation3D class implements three-dimensional rotations using a matrix representation.

package org.opensourcephysics.sip.ch17;

public class Rotation3D {
   // transformation matrix
   private double[][] mat = new double[4][4];

   public Rotation3D(double theta, double[] axis) {
      double norm = Math.sqrt(axis[0]*axis[0]+axis[1]*axis[1]
                             +axis[2]*axis[2]);
      double x = axis[0]/norm, y = axis[1]/norm, z = axis[2]/norm;
      double c = Math.cos(theta), s = Math.sin(theta);
      double t = 1-c;
      // matrix elements not listed are zero; compare Eq. (17.16)
      mat[0][0] = t*x*x+c;
      mat[0][1] = t*x*y-s*z;
      mat[0][2] = t*x*z+s*y;
      mat[1][0] = t*x*y+s*z;
      mat[1][1] = t*y*y+c;
      mat[1][2] = t*y*z-s*x;
      mat[2][0] = t*x*z-s*y;
      mat[2][1] = t*y*z+s*x;
      mat[2][2] = t*z*z+c;
      mat[3][3] = 1;
   }

   public void direct(double[] point) {
      int n = point.length;
      double[] pt = new double[n];
      System.arraycopy(point, 0, pt, 0, n);
      // multiply the copied coordinates by the rotation matrix,
      // storing the result back in the point
      for(int i = 0; i<n; i++) {
         point[i] = 0;
         for(int j = 0; j<n; j++) {
            point[i] += mat[i][j]*pt[j];
         }
      }
   }
}
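As a quick consistency check of Listing 17.2, the following minimal test harness (our own code, not part of the text, and assuming the direct method as completed above) verifies that a 90° rotation about the z-axis carries a point on the x-axis onto the y-axis:

package org.opensourcephysics.sip.ch17;

public class Rotation3DTest {
   public static void main(String[] args) {
      // 90 degree rotation about the z-axis
      Rotation3D rotation = new Rotation3D(Math.PI/2,
                                           new double[] {0, 0, 1});
      double[] point = {1, 0, 0}; // a point on the x-axis
      rotation.direct(point);     // transforms the point in place
      System.out.printf("(%.3f, %.3f, %.3f)%n",
                        point[0], point[1], point[2]);
      // expected output: (0.000, 1.000, 0.000), a point on the y-axis
   }
}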