An Introduction to Computer Simulation Methods: Applications to Physical Systems

Harvey Gould, Jan Tobochnik, and Wolfgang Christian

August 27, 2016

Contents

Preface

1 Introduction
  1.1 Importance of computers in physics
  1.2 The importance of computer simulation
  1.3 Programming languages
  1.4 Object oriented techniques
  1.5 How to use this book

2 Tools for Doing Simulations
  2.1 Introduction
  2.2 Simulating Free Fall
  2.3 Getting Started with Object-Oriented Programming
  2.4 Inheritance
  2.5 The Open Source Physics Library
  2.6 Animation and Simulation
  2.7 Model-View-Controller

3 Simulating Particle Motion
  3.1 Modified Euler algorithms
  3.2 Interfaces
  3.3 Drawing
  3.4 Specifying The State of a System Using Arrays
  3.5 The ODE Interface
  3.6 The ODESolver Interface
  3.7 Effects of Drag Resistance
  3.8 Two-Dimensional Trajectories
  3.9 Decay Processes
  3.10 *Visualizing Three-Dimensional Motion
  3.11 Levels of Simulation

4 Oscillations
  4.1 Simple Harmonic Motion
  4.2 The Motion of a Pendulum
  4.3 Damped Harmonic Oscillator
  4.4 Response to External Forces
  4.5 Electrical Circuit Oscillations
  4.6 Accuracy and Stability
  4.7 Projects

5 Few-Body Problems: The Motion of the Planets
  5.1 Planetary Motion
  5.2 The Equations of Motion
  5.3 Circular and Elliptical Orbits
  5.4 Astronomical Units
  5.5 Log-log and Semilog Plots
  5.6 Simulation of the Orbit
  5.7 Impulsive Forces
  5.8 Velocity Space
  5.9 A Mini-Solar System
  5.10 Two-Body Scattering
  5.11 Three-body problems
  5.12 Projects

6 The Chaotic Motion of Dynamical Systems
  6.1 Introduction
  6.2 A Simple One-Dimensional Map
  6.3 Period Doubling
  6.4 Universal Properties and Self-Similarity
  6.5 Measuring Chaos
  6.6 *Controlling Chaos
  6.7 Higher-Dimensional Models
  6.8 Forced Damped Pendulum
  6.9 *Hamiltonian Chaos
  6.10 Perspective
  6.11 Projects

7 Random Processes
  7.1 Order to Disorder
  7.2 Random Walks
  7.3 Modified Random Walks
  7.4 The Poisson Distribution and Nuclear Decay
  7.5 Problems in Probability
  7.6 Method of Least Squares
  7.7 Applications to Polymers
  7.8 Diffusion-Controlled Chemical Reactions
  7.9 Random Number Sequences
  7.10 Variational Methods
  7.11 Projects
  Appendix 7A: Random Walks and the Diffusion Equation

8 The Dynamics of Many-Particle Systems
  8.1 Introduction
  8.2 The Intermolecular Potential
  8.3 Units
  8.4 The Numerical Algorithm
  8.5 Periodic Boundary Conditions
  8.6 A Molecular Dynamics Program
  8.7 Thermodynamic Quantities
  8.8 Radial Distribution Function
  8.9 Hard Disks
  8.10 Dynamical Properties
  8.11 Extensions
  8.12 Projects

9 Normal Modes and Waves
  9.1 Coupled Oscillators and Normal Modes
  9.2 Numerical Solutions
  9.3 Fourier Series
  9.4 Two-Dimensional Fourier Series
  9.5 Fourier Integrals
  9.6 Power Spectrum
  9.7 Wave Motion
  9.8 Interference
  9.9 Fraunhofer Diffraction
  9.10 Fresnel Diffraction
  Appendix 9A: Complex Fourier Series
  Appendix 9B: Fast Fourier Transform
  Appendix 9C: Plotting Scalar Fields

10 Electrodynamics
  10.1 Static Charges
  10.2 Electric Fields
  10.3 Electric Field Lines
  10.4 Electric Potential
  10.5 Numerical Solutions of Boundary Value Problems
  10.6 Random Walk Solution of Laplace's Equation
  10.7 *Fields Due to Moving Charges
  10.8 *Maxwell's Equations
  10.9 Projects
  Appendix A: Plotting Vector Fields

11 Numerical and Monte Carlo Methods
  11.1 Numerical Integration Methods in One Dimension
  11.2 Simple Monte Carlo Evaluation of Integrals
  11.3 Multidimensional Integrals
  11.4 Monte Carlo Error Analysis
  11.5 Nonuniform Probability Distributions
  11.6 Importance Sampling
  11.7 Metropolis Algorithm
  11.8 *Neutron Transport

12 Percolation
  12.1 Introduction
  12.2 The Percolation Threshold
  12.3 Finding Clusters
  12.4 Critical Exponents and Finite Size Scaling
  12.5 The Renormalization Group
  12.6 Projects

13 Fractals and Kinetic Growth Models
  13.1 The Fractal Dimension
  13.2 Regular Fractals
  13.3 Kinetic Growth Processes
  13.4 Fractals and Chaos
  13.5 Many Dimensions
  13.6 Projects

14 Complex Systems
  14.1 Cellular Automata
  14.2 Self-Organized Critical Phenomena
  14.3 The Hopfield Model and Neural Networks
  14.4 Growing Networks
  14.5 Genetic Algorithms
  14.6 Lattice Gas Models of Fluid Flow
  14.7 Overview and Projects

15 Monte Carlo Simulations of Thermal Systems
  15.1 Introduction
  15.2 The Microcanonical Ensemble
  15.3 The Demon Algorithm
  15.4 The Demon as a Thermometer
  15.5 The Ising Model
  15.6 The Metropolis Algorithm
  15.7 Simulation of the Ising Model
  15.8 The Ising Phase Transition
  15.9 Other Applications of the Ising Model
  15.10 Simulation of Classical Fluids
  15.11 Optimized Monte Carlo Data Analysis
  15.12 *Other Ensembles
  15.13 More Applications
  15.14 Projects

16 Quantum Systems
  16.1 Introduction
  16.2 Review of Quantum Theory
  16.3 Bound State Solutions
  16.4 Time Development of Eigenstate Superpositions
  16.5 The Time-Dependent Schrödinger Equation
  16.6 Fourier Transformations and Momentum Space
  16.7 Variational Methods
  16.8 Random Walk Solutions of the Schrödinger Equation
  16.9 Diffusion Quantum Monte Carlo
  16.10 Path Integral Quantum Monte Carlo
  16.11 Projects
  Appendix A: Visualizing Complex Functions

17 Visualization and Rigid Body Dynamics
  17.1 Two-Dimensional Transformations
  17.2 Three-Dimensional Transformations
  17.3 The Three-Dimensional Open Source Physics Library
  17.4 Dynamics of a Rigid Body
  17.5 Quaternion Arithmetic
  17.6 Quaternion equations of motion
  17.7 Rigid Body Model
  17.8 Motion of a spinning top
  17.9 Projects

18 Seeing in Special and General Relativity
  18.1 Special Relativity
  18.2 General Relativity
  18.3 Dynamics in Polar Coordinates
  18.4 Black Holes and Schwarzschild Coordinates
  18.5 Particle and Light Trajectories
  18.6 Seeing
  18.7 General Relativistic Dynamics
  18.8 *The Kerr Metric
  18.9 Projects

19 Epilogue: The Unity of Physics
  19.1 The Unity of Physics
  19.2 Spiral Galaxies
  19.3 Numbers, Pretty Pictures, and Insight
  19.4 Constrained Dynamics
  19.5 What are Computers Doing to Physics?

Preface

Computer simulations are now an integral part of contemporary basic and applied physics, and computation has become as important as theory and experiment. The ability to compute is now part of the essential repertoire of research scientists.

Since writing the first two editions of our text, more courses devoted to the study of physics using computers have been introduced into the physics curriculum, and many more traditional courses are incorporating numerical examples. We are gratified to see that our text has helped shape these innovations. The purpose of our book includes the following:

1. To provide a means for students to do physics.

2. To give students an opportunity to gain a deeper understanding of the physics they have learned in other courses.

3. To encourage students to "discover" physics in a way similar to how physicists learn in the context of research.

4. To introduce numerical methods and new areas of physics that can be studied with these methods.

5. To give examples of how physics can be applied in a much broader context than is discussed in the traditional physics undergraduate curriculum.

6. To teach object-oriented programming in the context of doing science.

Our overall goal is to encourage students to learn about science through experience and by asking questions. Our objective is always understanding, not the generation of numbers.

The major change in this edition is the use of the Java programming language instead of True Basic, which was used in the first two editions. We chose Java for some of the same reasons we originally chose True Basic. Java is available for all popular operating systems, is platform independent, contains built-in graphics capabilities, is freely available, and has all the features needed to write powerful computer simulations. There is an abundance of free open source tools available for Java programmers, including the Eclipse integrated development environment. Because Java is popular, it continues to evolve, and its speed is now comparable to other languages used in scientific programming. In addition, Java is object oriented, which has become the dominant paradigm in computer science and software engineering, and therefore learning Java is excellent preparation for students with interests in physics and computer science. Java programs can be easily adapted for delivery over the Web. Finally, as for True Basic, the nongraphical parts of our programs can easily be converted to other languages such as C/C++, whose syntax is similar to Java.

When we chose True Basic for our first edition, introductory computer science courses were teaching Pascal. When we continued with True Basic in the second edition, computer science departments were experimenting with teaching C/C++. Finally, we are able to choose a language that is commonly taught and used in many contexts. Thus, it is likely that some of the students reading our text will already know Java and can contribute much to a class that uses our text.
Java provides many powerful libraries for building a graphical user interface and incorporating audio, video, and other media. If we were to discuss these libraries, students would become absorbed in programming tasks that have little or nothing to do with physics. For this reason our text uses the Open Source Physics library, which makes it easy to write programs that are simpler and more graphically oriented than those that we wrote in True Basic. In addition, the Open Source Physics library is useful for other computational physics projects which are not discussed in this text, as well as general programming tasks. This library provides for easy graphical input of parameters, tabular output of data, plots, visualizations and animations, and the numerical solution of ordinary differential equations. It also provides several useful data structures.

The Open Source Physics library was developed by Wolfgang Christian, with the contributions and assistance of many others. The book Open Source Physics: A User's Guide with Examples by Wolfgang Christian is available separately and discusses the Open Source Physics library in much more detail. A CD that comes with the User's Guide contains the source code for the Open Source Physics library, the programs in this book, as well as ready-to-run versions of these programs. The source code and the library can also be downloaded freely from the Open Source Physics website.

The ease of doing visualizations is a new and important aspect of Java and the Open Source Physics library, giving Java an advantage over other languages such as C++ and Fortran, which do not have built-in graphics capabilities. For example, when debugging a program, it is frequently much quicker to detect when the program is not working by looking at a visual representation of the data rather than by scanning the data as lists of numbers. Also, it is easier to choose the appropriate values of the parameters by varying them and visualizing the results. Finally, more insight is likely to be gained by looking at a visualization than at a list of numbers. Because animations and the continuous plotting of data usually cause a program to run more slowly, we have designed our programs so that the graphical output can be turned off or implemented infrequently during a simulation.

Java provides support for interacting with a program during runtime. The Open Source Physics library makes this interaction even easier, so that we can write programs that use a mouse to input data, such as the location of a charge, or toggle the value of a cell in a lattice. We also do not need to input how long a simulation should run and can stop the program at any time to change parameters.

As with our previous editions, we assume no background in computer programming. Much of the text can be understood by students with only a semester each of physics and calculus. Chapter 2 introduces Java and the Open Source Physics library. In Chapter 3 we discuss the concept of interfaces and how to use some of the important interfaces in the Open Source Physics library. Later chapters introduce more Java and Open Source Physics constructs as needed, but essentially all of the chapters after Chapter 3 can be studied independently and in any order. We include many topics that are sometimes considered too advanced for undergraduates, such as random walks, chaos, fractals, percolation, simulations of many particle systems, and topics in the theory of complexity, but we introduce these topics so that very little background is required.
Other chapters discuss optics, electrodynamics, relativity, rigid body motion, and quantum mechanics, which require knowledge of the physics found in the corresponding standard undergraduate courses.

This text is written so that the physics drives the choice of algorithms and the programming syntax that we discuss. We believe that students can learn how to program more quickly with this approach because they have an immediate context, namely doing simulations, in which to hone their skills. In the beginning most of the programming tasks involve modifying the programs in the text. Students should then be given some assignments that require them to write their own programs by following the format of those in the text. The students may later develop their own style as they work on their projects.

Our text is most appropriately used in a project-oriented course that lets students with a wide variety of backgrounds and abilities work at their own pace. The courses that we have taught using this text have a laboratory component. From our experience we believe that active learning, where students are directly grappling with the material in this text, is the most efficient. In a laboratory context students who already know a programming language can help those who do not. Also, students can further contribute to a course by sharing their knowledge from various backgrounds in physics, chemistry, computer science, mathematics, biology, economics, and other subjects.

Although most of our text is at the undergraduate level, many of the topics are considered to be graduate level and thus would be of interest to graduate students. One of us regularly teaches a laboratory-based course on computer simulation with both undergraduate and graduate students. Because the course is project oriented, students can go at their own pace and work on different problems. In this context, graduate and undergraduate students can learn much from each other.

Some instructors who might consider using our text in a graduate-level context might think that our text is not sufficiently rigorous. For example, in the suggested problems we usually do not explicitly ask students to do an extensive data analysis. However, we do discuss how to estimate errors in Chapter 11. We encourage instructors to ask for a careful data analysis on at least one assignment, but we believe that it is more important for students to spend most of their time in an exploratory mode where the focus is on gaining physical insight and obtaining numerical results that are qualitatively correct.

There are four types of suggested student activities. The exercises, which are primarily found in the beginning of the text, are designed to help students learn specific programming techniques. The problems, which are scattered throughout each chapter, are open ended and require students to run, analyze, and modify programs given in the text, or write new, but similar programs. Students will soon learn that the format for most of the programs is very similar. Starred problems require either significantly more background or work and may require the writing of a program from scratch. However, the programs for these problems still follow a similar format. The projects at the end of most of the chapters are usually more time consuming and would be appropriate for term projects or independent student research. Many new problems and projects have been added to this edition, while others have been improved or eliminated.
Instructors and students should view the problem descriptions and questions as starting points for thinking about the system of interest. It is important that students read the problems even if they do not plan to do them. We encourage instructors to ask students to write laboratory reports for at least some of the problems. The Appendix to Chapter 1 provides guidance on what these reports should include. Part of the beauty and fun of doing computer simulations is that one is forced to think about the choice of algorithm, its implementation, the choice of parameters, what to measure, and the results. Do the results make sense? What happens if you change a parameter? What if you change the algorithm? Much physics can be learned in this way.

Although all of the programs discussed in the text can be downloaded freely, most are listed in the text to encourage students to read them carefully. Students might find some useful techniques that they can use elsewhere, and the discussion in the text frequently refers to the listings. A casual perusal of the text might suggest that the text is bereft of figures. One reason that we have not included more figures is that most of the programs in the text have an important visual component in color. Black and white figures pale in comparison. Much of the text is meant to be read while working on the programs. Thus, students can easily see the plots and animations produced by the programs while they are reading the text.

As new technologies become available and the backgrounds and expectations of students change, the question of what is worth knowing needs to be reconsidered. Today, calculators not only do arithmetic and numerical operations, but most can do algebra, calculus, and plotting. Students have lost the sense of number, and most can only do the simplest mathematical manipulations in their head. On the other hand, most students feel comfortable using computers and gathering information off the Web. Because there exist programs and applets that can perform many of the simulations in this text, why should students learn how to write their own programs? We have at least two answers. First, most innovative scientific research involves writing programs that do not fit into the domains of existing software. More importantly, we believe that students obtain a deeper understanding of the physics and the algorithms themselves by writing and modifying their own programs. Just as we need to ensure that students can carry out basic mathematical operations without a calculator so that they understand what these operations mean, we must do the same when it comes to computational physics.

The recommended readings at the end of each chapter have been selected for their pedagogical value rather than for completeness or for historical accuracy. We apologize to our colleagues whose work has been inadvertently omitted, and we would appreciate suggestions for new and additional references.

Because students come with a different skill set than most of their instructors, it is important that instructors realize that certain aspects of this text might be easier for their students than for them. Some instructors might be surprised that much of the code for organizing the simulations is "hidden" in the Open Source Physics library (although the source code is freely available). Some instructors will initially think that Chapter 2 contains too much material. However, from the student's perspective this material is not that difficult to learn.
They are used to downloading files, using various software environments, and learning how to make software do what they want. The difficult parts of the text, where instructor input is most needed, are understanding the physics and the algorithms. Converting algorithms to programs is also difficult for many students, and we spend much time in the text explaining the programs that implement various algorithms.

In some cases instructors will find it difficult to set up an environment to use Java and the Open Source Physics library. Because this task depends on the operating system, we have placed instructions on how to set up an environment for Java and Open Source Physics on the text's website. This website also contains links to updates of the evolving Open Source Physics library as well as other resources for this text, including the source code for the programs in the text.

We acknowledge generous support from the National Science Foundation, which has allowed us to work on many ideas that have found their way into this textbook. We also thank Kipton Barros, Mario Belloni, Doug Brown, Francisco Esquembre, and Joshua Gould for their advice, suggestions, and contributions to the Open Source Physics library and to the text. We thank Anne Cox for suggesting numerous improvements to the narrative and for hosting an Open Source Physics developer's workshop at Eckerd College.

We are especially grateful to Louis Colonna-Romano for drawing almost all of the figures. Lou writes programs in PostScript the way others write programs in Java or Fortran. We are especially thankful to students and faculty at Clark University, Davidson College, and Kalamazoo College who have generously commented on the Open Source Physics project as they class tested early versions of this manuscript. Carlos Ortiz helped prepare the index for this book.

Many individuals reviewed parts of the text, and we thank them for their assistance. They include Lowell M. Boone, Roger Cowley, Shamanthi Fernando, Alejandro L. Garcia, Alexander L. Godunov, Rubin Landau, Donald G. Luttermoser, Cristopher Moore, Anders Sandvik, Ross Spencer, Dietrich Stauffer, Jutta Luettmer-Strathmann, Daniel Suson, Matthias Troyer, Slavomir Tuleja, and Michael T. Vaughn.

We thank all our friends and colleagues for their encouragement and support. We are grateful to our wives, Patti Gould, Andrea Moll Tobochnik, and Barbara Christian, and to our children, Joshua, Emily, and Evan Gould, Steven and Howard Tobochnik, and Katherine, Charlie, and Konrad Christian, for their encouragement and understanding during the course of this work. It takes a village to raise a child and a community to write a textbook.

No book of this length can be free of typos and errors. We encourage readers to email us about errors that they find and suggestions for improvements. Our plan is to continuously revise our book so that the next edition will be more timely.

Harvey Gould
Clark University
Worcester, MA 01610-1477
hgould@clarku.edu

Jan Tobochnik
Kalamazoo College
Kalamazoo, MI 49006-3295
jant@kzoo.edu

Wolfgang Christian
Davidson College
Davidson, NC 28036-6926
wochristian@davidson.edu

Chapter 1
Introduction

The importance of computers in physics and the nature of computer simulation is discussed. The nature of object-oriented programming and various computer languages is also considered.
1.1 Importance of computers in physics

Computation is now an integral part of contemporary science and is having a profound effect on the way we do physics, on the nature of the important questions, and on the physical systems we choose to study. Developments in computer technology are leading to new ways of thinking about physical systems. Asking "How can I formulate this problem on a computer?" has led to the understanding that it is practical and natural to formulate physical laws as rules for a computer rather than only in terms of differential equations.

For the purposes of discussion, we will divide the use of computers in physics into the following categories: numerical analysis, symbolic manipulation, visualization, simulation, and the collection and analysis of data.

Numerical analysis refers to the solution of well-defined mathematical problems to produce numerical (in contrast to symbolic) solutions. For example, we know that the solution of many problems in physics can be reduced to the solution of a set of simultaneous linear equations. Consider the equations

$$2x + 3y = 18$$
$$x - y = 4.$$

It is easy to find the analytical solution x = 6, y = 2 using the method of substitution. Suppose we wish to solve a set of four simultaneous equations. We again can find an analytical solution, perhaps using a more sophisticated method. However, if the number of simultaneous equations becomes much larger, we would need to use a computer to find a solution. In this mode the computer is a tool of numerical analysis. Because it is often necessary to compute multidimensional integrals, manipulate large matrices, or solve nonlinear differential equations, this use of the computer is important in physics.
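To make this first category concrete, the following short Java program (a sketch of our own in the spirit of the text's listings, not one of them; the class name is arbitrary) solves the pair of equations above by elimination. The same procedure, organized systematically as Gaussian elimination, handles systems far too large to solve by hand.

public class LinearSolver {
  public static void main(String[] args) {
    // coefficients of 2x + 3y = 18 and x - y = 4
    double[][] a = {{2, 3}, {1, -1}};
    double[] b = {18, 4};
    double factor = a[1][0]/a[0][0];   // eliminate x from the second equation
    a[1][1] -= factor*a[0][1];
    b[1] -= factor*b[0];
    double y = b[1]/a[1][1];           // back substitution
    double x = (b[0] - a[0][1]*y)/a[0][0];
    System.out.println("x = " + x + ", y = " + y); // prints x = 6.0, y = 2.0
  }
}

For larger systems one would call a library routine rather than hand-code the elimination, but the logic is the same.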
One of the strengths of mathematics is its ability to use the power of abstraction, which allows us to solve many similar problems simultaneously by using symbols. Computers can be used to do much of the symbolic manipulation. As an example, suppose we want to know the solution of the quadratic equation $ax^2 + bx + c = 0$. A symbolic manipulation program can give the solution as $x = (-b \pm \sqrt{b^2 - 4ac})/2a$. In addition, such a program can give the usual numerical solutions for specific values of a, b, and c. Mathematical operations such as differentiation, integration, matrix inversion, and power series expansion can be performed using symbolic manipulation programs. The calculation of Feynman diagrams, which represent multidimensional integrals of importance in quantum electrodynamics, has been a major impetus to the development of computer algebra software that can manipulate and simplify symbolic expressions. Maxima, Maple, and Mathematica are examples of software packages that have symbolic manipulation capabilities as well as many tools for numerical analysis. Matlab and Octave are examples of software packages that are convenient for computations involving matrices and related tasks.

As the computer plays an increasing role in our understanding of physical phenomena, the visual representation of complex numerical results is becoming even more important. The human eye in conjunction with the visual processing capacity of the brain is a very sophisticated device. Our eyes can determine patterns and trends that might not be evident from tables of data and can observe changes with time that can lead to insight into the important mechanisms underlying a system's behavior. The use of graphics can also increase our understanding of the nature of analytical solutions. For example, what does a sine function mean to you? We suspect that your answer is not the series,

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$

but rather a periodic, constant amplitude curve (see Figure 1.1). What is most important is the mental image gained from a visualization of the form of the function.

[Figure 1.1: What is the meaning of the sine function?]

Traditional modes of presenting data include two- and three-dimensional plots, including contour and field line plots. Frequently, more than three variables are needed to understand the behavior of a system, and new methods of using color and texture are being developed to help researchers gain greater insights into their data.

An essential role of science is to develop models of nature. To know whether a model is consistent with observation, we have to understand the behavior of the model and its predictions. One way to do so is to implement the model on a computer. We call such an implementation a computer simulation, or simulation for short. For example, suppose a teacher gives $10 to each student in a class of 100. The teacher, who also begins with $10 in her pocket, chooses a student at random and flips a coin. If the coin is heads, the teacher gives $1 to the student; otherwise, the student gives $1 to the teacher. If either the teacher or the student would go into debt by this transaction, the transaction is not allowed. After many exchanges, what is the probability that a student has s dollars? What is the probability that the teacher has t dollars? Are these two probabilities the same? Although these particular questions can be answered by analytical methods, many problems of this nature cannot be solved in this way (see Problem 1.1).

One way to determine the answers to these questions is to do a classroom experiment. However, such an experiment would be difficult to arrange, and it would be tedious to do a sufficient number of transactions. A more practical way to proceed is to convert the rules of the model into a computer program, simulate many exchanges, and estimate the quantities of interest. Knowing the results might help us gain more insight into the nature of an analytical solution if one exists. We can also modify the rules and ask "what if?" questions. For example, would the probabilities change if the students could exchange money with one another? What would happen if the teacher were allowed to go into debt?

Simulations frequently use the computational tools of numerical analysis and visualization, and occasionally symbolic manipulation. The difference is one of emphasis. Simulations are usually done with a minimum of analysis. Because simulation emphasizes an exploratory mode of learning, we will stress this approach.
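To see how directly the rules of such a model become code, here is a minimal sketch of the teacher-student exchange described above (our own illustration; the class name, the number of exchanges, and the omitted histogram are arbitrary choices):

import java.util.Arrays;
import java.util.Random;

public class MoneyExchangeApp {
  public static void main(String[] args) {
    Random rnd = new Random();
    int[] student = new int[100];
    Arrays.fill(student, 10);                // each student starts with $10
    int teacher = 10;                        // and so does the teacher
    for (int n = 0; n < 1000000; n++) {
      int i = rnd.nextInt(student.length);   // choose a student at random
      if (rnd.nextBoolean()) {               // heads: teacher gives $1
        if (teacher > 0) { teacher--; student[i]++; }
      } else {                               // tails: student gives $1
        if (student[i] > 0) { student[i]--; teacher++; }
      }                                      // forbidden transactions are skipped
    }
    System.out.println("teacher ends with $" + teacher);
  }
}

Estimating the probabilities P(s) and P(t) is then a matter of accumulating histograms over many exchanges.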
Computers are also involved in all phases of a laboratory experiment, from the design of the apparatus to the collection and analysis of data. LabView is an example of a data acquisition program. Some of the roles of the computer in laboratory experiments, such as the varying of parameters and the analysis of data, are similar to those encountered in simulations. However, the tasks involved in real-time control and interactive data analysis are qualitatively different and involve the interfacing of computer hardware to various types of instrumentation. We will not discuss this use of the computer.

1.2 The importance of computer simulation

Why is computation becoming so important in physics? One reason is that most of our analytical tools, such as differential calculus, are best suited to the analysis of linear problems. For example, you probably have analyzed the motion of a particle attached to a spring by assuming a linear restoring force, $F = -kx$, and solving Newton's second law of motion. In this case a small change in the displacement of the particle leads to a small change in the force. However, many natural phenomena are nonlinear, and a small change in a variable might produce a large change in another. Because relatively few nonlinear problems can be solved by analytical methods, the computer gives us a new tool to explore nonlinear phenomena.

Another reason for the importance of computation is the growing interest in systems with many variables or with many degrees of freedom. The money exchange model described in Section 1.1 is a simple example of a system with many variables. A similar problem is given at the end of this chapter.

Computer simulations are sometimes referred to as computer experiments because they share much in common with laboratory experiments. Some of the analogies are shown in Table 1.1. The starting point of a computer simulation is the development of an idealized model of a physical system of interest. We then need to specify a procedure or algorithm for implementing the model on a computer and decide what quantities to measure. The results of a computer simulation can serve as a bridge between laboratory experiments and theoretical calculations. In some cases we can obtain essentially exact results by simulating an idealized model that has no laboratory counterpart. The results of the idealized model can serve as a stimulus to the development of the theory. On the other hand, we sometimes can do simulations of a more realistic model than can be done theoretically, and hence make a more direct comparison with laboratory experiments.

Laboratory Experiment    Computer Simulation
sample                   model
physical apparatus       computer program
calibration              testing of program
measurement              computation
data analysis            data analysis

Table 1.1: Analogies between a computer simulation and a laboratory experiment.

Computation has become a third way of doing physics and complements both theory and experiment. Computer simulations, like laboratory experiments, are not substitutes for thinking, but are tools that we can use to understand natural phenomena. The goal of all our investigations of fundamental phenomena is to seek explanations of natural phenomena that can be stated concisely.

1.3 Programming languages

There is no single best programming language any more than there is a best natural language. Fortran is the oldest of the more popular scientific programming languages and was developed by John Backus and his colleagues at IBM between 1954 and 1957. Fortran is commonly used in scientific applications and continues to evolve. Fortran 90/95/2000 has many modern features that are similar to C/C++.

The Basic programming language was developed in 1965 by John Kemeny and Thomas Kurtz at Dartmouth College as a language for introductory courses in computer science. In 1983 Kemeny and Kurtz extended the language to include platform independent graphics and advanced control structures necessary for structured programming. The programs in the first two editions of our textbook were written in this version of Basic, known as True Basic.

C was developed by Dennis Ritchie at Bell Laboratories around 1972 in parallel with the Unix operating system.
C++ is an extension of C designed by Bjarne Stroustrup at Bell Laboratories in the mid-1980s. C++ is considerably more complex than C and has object-oriented features as well as other extensions. In general, programs written in C/C++ have high performance, but can be difficult to debug. C and C++ are popular choices for developing operating systems and software applications because they provide direct access to memory and other system resources.

Python, like Basic, was designed to be easy to learn and use. Python enthusiasts like to say that C and C++ were written to make life easier for the computer, but Python was designed to be easier for the programmer. Guido van Rossum created Python in the late 1980s and early 1990s. It is an interpreted, object-oriented, general-purpose programming language that is also good for prototyping. Because Python is interpreted, its performance is significantly lower than that of optimized languages such as C or Fortran.

Java is an object-oriented language that was created by James Gosling and others at Sun Microsystems. Since Java was introduced in late 1995, it has rapidly evolved and is the language of choice in most introductory computer science courses. Java borrows much of its syntax from C++ but has a simpler structure. Although the language contains only fifty keywords, the Java platform adds a rich library that enables a Java program to connect to the internet, render images, and perform other high-level tasks.

Most modern languages incorporate object-oriented features. The idea of object-oriented programming is that functions and data are grouped together in an object, rather than treated separately. A program is a structured collection of objects that communicate with each other, causing the internal state within a given object to change. A fundamental goal of object-oriented design is to increase the understandability and reusability of program code by focusing on what an object does and how it is used, rather than on how an object is implemented.

Our choice of Java for this text is motivated in part by its platform independence, flexible standard graphics libraries, good performance, and its no-cost availability. The popularity of Java ensures that the language will continue to evolve, and that programming experience in Java is a valuable and marketable skill. The Java programmer can leverage a vast collection of third-party libraries, including those for numerical calculations and visualization. Java is also relatively simple to learn, especially the subset of Java that we will need to simulate physical systems.

Java can be thought of as a platform in itself, similar to the Macintosh and Windows, because it has an application programming interface (API) that enables cross-platform graphics and user interfaces. Java programs are compiled to a platform-neutral byte code so that they can run on any computer that has a Java Virtual Machine. Despite the high level of abstraction and platform independence, the performance of Java is becoming comparable with native languages. If a project requires more speed, the computationally demanding parts of the program can be converted to C/C++ or Fortran. Readers who wish to use another programming language should find the algorithmic components of the Java program listings in the text easy to convert into a language of their choice.

1.4 Object oriented techniques

If you already know how to program, try reading a program that you wrote several years or even several weeks ago.
Many of us would not be able to follow the logic of our own program and would have to rewrite it. And your program would probably be of little use to a friend who needs to solve a similar problem. If you are learning programming for the first time, it is important to learn good programming habits to minimize this problem. One way is to employ object-oriented techniques such as encapsulation, inheritance, and polymorphism.

Encapsulation refers to the way that an object's essential information is exposed through a well-documented interface, but unnecessary details of the code are hidden. For example, we can model a particle as an object. Whenever a particle moves, it calculates its acceleration from the total force on it. Someone who wishes to use the trajectory of the particle, for example to animate the particle's trajectory, needs to refer only to the interface and does not need to know how the trajectory is calculated.

Inheritance allows a programmer to add capabilities to existing code without having to rewrite it or even know the details of how the code works. For example, you will write programs that show the evolution of planetary systems, quantum mechanical wave functions, and molecular models. Many of these programs will use (extend) code in the Open Source Physics library known as an AbstractSimulation. This code has a timer that periodically executes code in your program and then refreshes the on-screen animation. Using the Open Source Physics library will let you focus your efforts on programming the physics, because it is not necessary to write the code to produce the timer or to refresh the screen. Similarly, we have designed a general purpose graphical user interface (GUI) by extending code written by Sun Microsystems known as a JFrame. Our GUI has the features of a standard user interface such as a menu bar, minimize button, and title, even though we did not write the code to implement these features.

Polymorphism helps us to write reusable code. For example, it is easy to imagine many types of objects that are able to evolve over time. In Chapter 15 we will simulate a system of particles using random numbers rather than forces to move the particles. By using polymorphism, we can write general purpose code to do animations with both types of systems.
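The following fragment illustrates encapsulation and inheritance in miniature (hypothetical classes of our own, not code from the Open Source Physics library):

// Encapsulation: the state is private and reachable only through methods.
class Particle {
  private double x, v;                        // hidden internal state

  public void step(double dt) { x += v*dt; }  // how x is advanced is an internal detail
  public double getX() { return x; }
  public double getV() { return v; }
  public void setV(double v) { this.v = v; }
}

// Inheritance: add damping without rewriting (or even reading) Particle.
class DampedParticle extends Particle {
  private double gamma = 0.1;                 // damping coefficient

  @Override
  public void step(double dt) {
    super.step(dt);                           // reuse the parent's update
    setV(getV()*(1 - gamma*dt));              // then reduce the velocity
  }
}

Polymorphism enters when code written in terms of Particle, such as the loop for (Particle p : particles) p.step(dt);, works unchanged when some elements of the array are DampedParticle objects.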
Science students have a rich context in which to learn programming. The past several decades of doing physics with computers have given us numerous examples that we can use to learn physics, programming, and data analysis. Unlike many programming manuals, the emphasis of this book is on learning by example. We will not discuss all aspects of Java, and this text is not a substitute for a text on Java. Think of how you learned your native language. First you learned by example, and then you learned more systematically.

Although using an object oriented language makes it easier to write well-structured programs, it does not guarantee that your programs will be well written or even correct. The single most important criterion of program quality is readability. If your program is easy to read and follow, it is probably a good program. There are many analogies between a good program and a well-written paper. Few papers and programs come out perfectly on their first draft, regardless of the techniques and rules we use to write them. Rewriting is an important part of programming.

1.5 How to use this book

Most chapters in this text begin with a brief background summary of the nature of a system and the important questions. We then introduce the computer algorithms, new syntax as needed, and discuss a sample program. The programs are meant to be read as text on an equal basis with the discussions and are interspersed throughout the text. It is strongly recommended that all the problems be read, because many concepts are introduced after you have had a chance to think about the result of a simulation.

It is a good idea to maintain a computer-based notebook to record your programs, results, graphical output, and analysis of the data. This practice will help you develop good habits for future research projects, prevent duplication, organize your thoughts, and save you time. After a while you will find that most of your new programs will use parts of your earlier programs. Ideally, you will use your files to write a laboratory report or a paper on your work. Guidelines for writing a laboratory report are given in Appendix 1A.

Many of the problems in the text are open ended and do not lend themselves to simple "back of the book" answers. So how will you know if your results are correct? How will you know when you have done enough? There are no simple answers to either question, but we can give some guidelines. First, you should compare the results of your program to known results whenever possible. The known results might come from an analytical solution that exists in certain limits or from published results. You should also look at your numbers and graphs, and determine if they make sense. Do the numbers have the right sign? Are they the right order of magnitude? Do the trends make sense as you change the parameters? What is the statistical error in the data? What is the systematic error? Some of the problems explicitly ask you to do these checks, but you should make it a habit to do as many as you can whenever possible.

How do you know when you are finished? The main guideline is whether you can tell a coherent story about your system of interest. If you have only a few numbers and do not know their significance, then you need to do more. Let your curiosity lead you to more explorations. Do not let the questions asked in the problems limit what you do. The questions are only starting points, and frequently you will be able to think of your own questions.

The following problem is an example of the kind of problems that will be posed in the following chapters. Note its similarity to the questions posed in Section 1.1. Although most of the simulations that we will do will be on the kind of physical systems that you will encounter in other physics courses, we will consider simulations in related areas, such as traffic flow, small world networks, and economics. Of course, unless you already know how to do simulations, you will have to study the following chapters so that you will be able to do problems like the following.

Problem 1.1. Distribution of money

The distribution of income in a society, $f(m)$, behaves as $f(m) \propto m^{-1-\alpha}$, where m is the income (money) and the exponent α is between 1 and 2. The quantity $f(m)$ can be taken to be the number of people who have an amount of money between m and m + ∆m. This power law behavior of the income distribution is often referred to as Pareto's law or the 80/20 rule (20% of the people have 80% of the income) and was proposed in the late 1800s by Vilfredo Pareto, an economist and sociologist.
In the following, we consider some simple models of a closed economy to determine the relation between the microdynamics and the resulting macroscopic distribution of money.

a. Suppose that N agents (people) can exchange money in pairs. For simplicity, we assume that all the agents are initially assigned the same amount of money $m_0$, and the agents are then allowed to interact. At each time step, a pair of agents i and j with money $m_i$ and $m_j$ is randomly chosen and a transaction takes place. Again for simplicity, let us assume that $m_i \to m_i'$ and $m_j \to m_j'$ by a random reassignment of their total amount of money, $m_i + m_j$, such that

$$m_i' = \epsilon(m_i + m_j) \tag{1.1a}$$
$$m_j' = (1 - \epsilon)(m_i + m_j), \tag{1.1b}$$

where $\epsilon$ is a random number between 0 and 1. Note that this reassignment ensures that the agents have no debt after the transaction; that is, they are always left with an amount $m \geq 0$. Simulate this model and determine the distribution of money among the agents after the system has relaxed to an equilibrium state. Choose N = 100 and $m_0 = 1000$.

b. Now let us ask what happens if the agents save a fraction λ of their money before the transaction. We write

$$m_i' = m_i + \delta m \tag{1.2a}$$
$$m_j' = m_j - \delta m, \tag{1.2b}$$

where

$$\delta m = (1 - \lambda)[\epsilon m_j - (1 - \epsilon)m_i]. \tag{1.2c}$$

Modify your program so that this savings model is implemented. Consider λ = 0.25, 0.50, 0.75, and 0.9. For some of the values of λ, as many as $10^7$ transactions will need to be considered. Does the form of $f(m)$ change for λ > 0?
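One possible shape of the transaction step is sketched below (a sketch of our own, not the text's solution; the class name, method name, and the choice λ = 0.5 in main are arbitrary). Setting λ = 0 in Eq. (1.2c) recovers the random reassignment of part (a).

import java.util.Arrays;
import java.util.Random;

public class MoneyDynamics {
  static Random rnd = new Random();

  // One transaction between two randomly chosen agents, Eqs. (1.1)-(1.2).
  static void transact(double[] m, double lambda) {
    int i = rnd.nextInt(m.length);
    int j = rnd.nextInt(m.length);
    if (i == j) return;                      // need two distinct agents
    double eps = rnd.nextDouble();           // random number between 0 and 1
    double dm = (1 - lambda)*(eps*m[j] - (1 - eps)*m[i]);  // Eq. (1.2c)
    m[i] += dm;                              // Eq. (1.2a)
    m[j] -= dm;                              // Eq. (1.2b)
  }

  public static void main(String[] args) {
    double[] m = new double[100];            // N = 100 agents
    Arrays.fill(m, 1000.0);                  // each starts with m0 = 1000
    for (int t = 0; t < 10000000; t++) {     // 10^7 transactions
      transact(m, 0.5);                      // savings fraction lambda = 0.5
    }
    // histogram m to estimate the equilibrium distribution f(m)
  }
}

Because each transaction conserves $m_i + m_j$, the total amount of money is fixed, which is a useful check as the program runs.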
Appendix 1A: Laboratory reports

Laboratory reports should be written clearly, with proper grammar and correct spelling. Write in a manner that can be understood by another person who has not done the research. In the following, we give a suggested format for your reports.

Introduction. Briefly summarize the nature of the physical system, the basic numerical method or algorithm, and the interesting or relevant questions.

Method. Describe the algorithm and how it is implemented in the program. In some cases this explanation can be given in the program itself. Give a typical listing of your program. Simple modifications of the program can be included in an appendix if necessary. The program should include your name and the date and be annotated in a way that is as self-explanatory as possible. Be sure to discuss any important features of your program.

Verification of program. Confirm that your program is not incorrect by considering special cases and by giving at least one comparison to a hand calculation or known result.

Data. Show the results of some typical runs in graphical or tabular form. Additional runs can be included in an appendix. All runs should be labeled, and all tables and figures must be referred to in the body of the text. Each figure and table should have a caption with complete information, for example, the value of the time step.

Analysis. In general, the analysis of your results will include a determination of qualitative and quantitative relationships between variables and an estimation of numerical accuracy.

Interpretation. Summarize your results and explain them in simple physical terms whenever possible. Specific questions that were raised in the assignment should be addressed here. Also give suggestions for future work or possible extensions. It is not necessary to answer every part of each question in the text.

Critique. Summarize the important physical concepts for which you gained a better understanding and discuss the numerical or computer techniques you learned. Make specific comments on the assignment and suggestions for improvements or alternatives.

Log. Keep a log of the time spent on each assignment and include it with your report.

References and suggestions for further reading

Programming

We list some of our favorite Java programming books here. There are also many useful online tutorials.

Joshua Bloch, Effective Java (Addison–Wesley, 2001). This excellent book is for advanced Java programmers and should be read after you have become familiar with Java.

Rogers Cadenhead and Laura Lemay, Teach Yourself Java in 21 Days, 4th ed. (Sams, 2004). An inexpensive self-study guide that uses a step-by-step tutorial approach to cover the basics.

Stephen J. Chapman, Java for Engineers and Scientists, 2nd ed. (Prentice Hall, 2004).

Wolfgang Christian, Open Source Physics: A User's Guide with Examples (Addison–Wesley, 2006). This guide is a useful supplement to our text.

Bruce Eckel, Thinking in Java, 3rd ed. (Prentice Hall, 2003). This text discusses the finer points of object-oriented programming and is recommended after you have become familiar with Java.

David Flanagan, Java in a Nutshell, 5th ed. (O'Reilly, 2005) and Java Examples in a Nutshell, 3rd ed. (O'Reilly, 2004). A fast-paced Java tutorial for those who already know another programming language.

Brian D. Hahn and Katherine M. Malan, Essential Java for Scientists and Engineers (Butterworth–Heinemann, 2002).
Cay S. Horstmann and Gary Cornell, Core Java 2: Fundamentals and Core Java 2: Advanced Features, both 7th ed. (Prentice Hall, 2005). A two-volume set that covers all aspects of Java programming.

Patrick Niemeyer and Jonathan Knudsen, Learning Java, 2nd ed. (O'Reilly, 2002). A comprehensive introduction to Java that starts with HelloWorld and ends with a discussion of XML. The book contains many examples showing how the core Java API is used. This book is one of our favorites for beginning Java programmers. However, it might be intimidating to someone who does not have some familiarity with computers.

Sherry Shavor, Jim D'Anjou, Pat McCarthy, John Kellerman, and Scott Fairbrother, The Java Developer's Guide to Eclipse (Addison–Wesley Professional, 2003). A good reference for the open source Eclipse development environment. Check for new versions because Eclipse is evolving rapidly.

General References on Physics and Computers

Richard E. Crandall, Projects in Scientific Computation (Springer–Verlag, 1994).

Paul L. DeVries, A First Course in Computational Physics (John Wiley & Sons, 1994).

Alejandro L. Garcia, Numerical Methods for Physics, 2nd ed. (Prentice Hall, 2000). Matlab, C++, and Fortran are used.

Neil Gershenfeld, The Nature of Mathematical Modeling (Cambridge University Press, 1998).

Nicholas J. Giordano and Hisao Nakanishi, Computational Physics, 2nd ed. (Prentice Hall, 2005).

Dieter W. Heermann, Computer Simulation Methods in Theoretical Physics, 2nd ed. (Springer–Verlag, 1990). A discussion of molecular dynamics and Monte Carlo methods directed toward advanced undergraduate and beginning graduate students.

David Landau and Kurt Binder, A Guide to Monte Carlo Simulations in Statistical Physics, 2nd ed. (Cambridge University Press, 2005). The authors emphasize the complementary nature of simulation to theory and experiment.

Rubin H. Landau, A First Course in Scientific Computing (Princeton University Press, 2005).

P. Kevin MacKeown, Stochastic Simulation in Physics (Springer, 1997).

Tao Pang, Computational Physics (Cambridge University Press, 1997).

Franz J. Vesely, Computational Physics, 2nd ed. (Plenum Press, 2002).

Michael M. Woolfson and Geoffrey J. Pert, Introduction to Computer Simulation (Oxford University Press, 1999).

Other References

Ruth Chabay and Bruce Sherwood, Matter & Interactions (John Wiley & Sons, 2002). This two-volume text uses computer models written in VPython to present topics not typically discussed in introductory physics courses.

H. Gould, "Computational physics and the undergraduate curriculum," Computer Physics Communications 127 (1), 6–10 (2000).

Brian Hayes, "g-OLOGY," Am. Scientist 92 (3), 212–216 (2004). Discusses the g-factor of the electron and the importance of algebraic and numerical calculations.

Marco Patriarca, Anirban Chakraborti, and Kimmo Kaski, "Gibbs versus non-Gibbs distributions in money dynamics," Physica A 340, 334–339 (2004). Problem 1.1 is based on this paper.

Douglass E. Post and Lawrence G. Votta, "Computational science demands a new paradigm," Physics Today 58 (1), 35–41 (2005). An interesting article on the future of computational science that raises many interesting questions.

Ross L. Spencer, "Teaching computational physics as a laboratory sequence," Am. J. Phys. 73, 151–153 (2005).

Chapter 2

Tools for Doing Simulations

We introduce some of the core syntax of Java in the context of simulating the motion of falling particles near the Earth's surface.
A simple algorithm for solving first-order differential equations numerically is also discussed.

2.1 Introduction

If you were to take a laboratory-based course in physics, you would soon be introduced to the oscilloscope. You would learn the function of many of the knobs, how to read the display, and how to connect various devices so that you could measure various quantities. If you did not know already, you would learn about voltage, current, impedance, and AC and DC signals. Your goal would be to learn how to use the oscilloscope. In contrast, you would learn only a little about the inner workings of the oscilloscope. The same approach can easily be adopted with an object-oriented language such as Java. If you are new to programming, you will learn how to make Java do what you want, but you will not learn everything about Java. In this chapter, we will present some of the essential syntax of Java and introduce the Open Source Physics library, which will facilitate writing programs with a graphical user interface and visual output such as plots and animations.

One of the ways that science progresses is by making models. If the model is sufficiently detailed, we can determine its behavior and then compare the behavior with experiment. This comparison might lead to verification of the model, changes in the model, and further simulations and experiments. In the context of computer simulation, we usually begin with a set of initial conditions, determine the dynamical behavior of the model numerically, and generate data in the form of tables of numbers, plots, and animations. We begin with a simple example to see how this process works.

Imagine a particle such as a ball near the surface of the Earth subject to a single force, the force of gravity. We assume that air friction is negligible and that the gravitational force is given by

F_g = −mg,  (2.1)

where m is the mass of the ball and g = 9.8 N/kg is the gravitational field (force per unit mass) near the Earth's surface. To make our example as simple as possible, we first assume that there is only vertical motion. We use Newton's second law to find the motion of the ball,

m d²y/dt² = F,  (2.2)

where y is the vertical coordinate defined so that up is positive, t is the time, F is the total force on the ball, and m is the inertial mass [which is the same as the gravitational mass in (2.1)]. If we set F = F_g, (2.1) and (2.2) lead to

d²y/dt² = −g.  (2.3)

Equation (2.3) is a statement of a model for the motion of the ball. In this case the model is in the form of a second-order differential equation.

You are probably familiar with the model summarized in (2.3) and know the analytic solution:

y(t) = y(0) + v(0)t − (1/2)gt²  (2.4a)
v(t) = v(0) − gt.  (2.4b)

Nevertheless, we will determine the motion of a freely falling particle numerically in order to introduce the tools that we will need in a familiar context.

We begin by expressing (2.3) as two first-order differential equations:

dy/dt = v  (2.5a)
dv/dt = −g,  (2.5b)

where v is the vertical velocity of the ball. We next approximate the derivatives by small (finite) differences:

[y(t + ∆t) − y(t)]/∆t = v(t)  (2.6a)
[v(t + ∆t) − v(t)]/∆t = −g.  (2.6b)

Note that in the limit ∆t → 0, (2.6) reduces to (2.5). We can rewrite (2.6) as

y(t + ∆t) = y(t) + v(t)∆t  (2.7a)
v(t + ∆t) = v(t) − g∆t.  (2.7b)

The finite difference approximation we used to obtain (2.7) is an example of the Euler algorithm. Equation (2.7) is an example of a finite difference equation, and ∆t is the time step.
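Before turning to the program, it is worth doing one Euler step by hand. The numbers below are ours, using the initial conditions that appear in Listing 2.1 below: y(0) = 10, v(0) = 0, and ∆t = 0.01. Applying (2.7) once gives

y(0.01) = 10 + (0)(0.01) = 10
v(0.01) = 0 − (9.8)(0.01) = −0.098.

The second step then uses the new velocity, so that y(0.02) = 10 + (−0.098)(0.01) = 9.99902. Note that y does not change at all during the first step, because the Euler algorithm uses the velocity at the beginning of each interval; this observation is the point of Exercise 2.1 below and motivates the improved algorithms considered later.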
Now we are ready to follow y(t) and v(t) in time. We begin with initial values for y and v and then iterate (2.7). If ∆t is sufficiently small, we will obtain a numerical answer that is close to the solution of the original differential equations in (2.5). In this case we know the answer, and we can test our numerical results directly.

Exercise 2.1. A simple example

Consider the first-order differential equation

dy/dx = f(x),  (2.8)

where f(x) is a function of x. The approximate solution as given by the Euler algorithm is

y_{n+1} = y_n + f(x_n)∆x.  (2.9)

Note that the rate of change of y has been approximated by its value at the beginning of the interval, f(x_n).

(a) Suppose that f(x) = 2x and y(x = 0) = 0. The analytic solution is y(x) = x², which we can confirm by taking the derivative of y(x). Convert (2.8) into a finite difference equation using the Euler algorithm. For simplicity, choose ∆x = 0.1. It would be a good idea to first use a calculator or pencil and paper to determine y_n for the first several steps.

(b) Sketch the difference between the exact solution and the approximate solution given by the Euler algorithm. What condition would the rate of change f(x) have to satisfy for the Euler algorithm to give the exact answer?

Problem 2.2. Invent your own numerical algorithm

As we have mentioned, the Euler algorithm evaluates the rate of change of y by its value at the beginning of the interval, f(x_n). The choice of where to approximate the rate of change of y during the interval from x to x + ∆x is arbitrary, although we will learn that some choices are better than others. All that is required is that the finite difference equation reduce to the original differential equation in the limit ∆x → 0. Think of several other algorithms that are consistent with this condition.

2.2 Simulating Free Fall

The source code for the class FirstFallingBallApp shown in Listing 2.1 is defined in a file named FirstFallingBallApp.java. The code consists of a sequence of statements that create variables and define methods. Each statement ends with a semicolon. Each source code file is compiled into byte code that can then be executed. The compiler places the byte code in a file with the same name as the Java source code file with the extension class. For example, the compiler converts FirstFallingBallApp.java into byte code and produces the FirstFallingBallApp.class file. One of the features of Java is that this byte code can be used by any computer that can run Java programs.

A Java application is a class that contains a main method. The following application is an implementation of the Euler algorithm given in (2.7). The program also compares the numerical and analytic results. We will next describe the syntax used in each line of the program.

Listing 2.1: First version of a simulation of a falling particle.

 1  // example of a single line comment statement (ignored by compiler)
 2  package org.opensourcephysics.sip.ch02; // location of file
 3  // beginning of class definition
 4  public class FirstFallingBallApp {
 5    // beginning of method definition
 6    public static void main(String[] args) {
 7      // braces { } used to group statements
 8      // indent statements within a block so that
 9      // they can be easily identified
10      // following statements form the body of the main method
11      // example of declaration and assignment statement
12      double y0 = 10;
13      double v0 = 0;    // initial velocity
14      double t = 0;     // time
15      double dt = 0.01; // time step
16      double y = y0;
17      double v = v0;
18      double g = 9.8;   // gravitational field
19      // beginning of loop; n++ is equivalent to n = n + 1
20      for (int n = 0; n < 100; n++) {
21        // repeat following three statements 100 times
22        y = y + v*dt; // indent statements in loop for clarity
23        v = v - g*dt; // use Euler algorithm
24        t = t + dt;
25      } // end of for loop
26      System.out.println("Results");
27      System.out.println("final time = " + t);
28      // display numerical result
29      System.out.println("y = " + y + " v = " + v);
30      // display analytic result
31      double yAnalytic = y0 + v0*t - 0.5*g*t*t;
32      double vAnalytic = v0 - g*t;
33      System.out.println("analytic y = " + yAnalytic + " v = " + vAnalytic);
34    } // end of method definition
35  } // end of class definition

The first line in Listing 2.1 is an example of a single line comment statement. Comment statements are ignored by the computer but can be very important for the reader. Multiple line comments begin with /* and end with */. Javadoc comments begin with /**, but they have been removed from the code listings in the book to save space. Download the source code from comPADRE to view the complete code with documentation.

The second line in Listing 2.1 declares a package name, which corresponds to the location (directory) of the source and byte code files. According to the package declaration, the file FirstFallingBallApp.java is in the directory org/opensourcephysics/sip/ch02. The package statement must be the first noncomment statement in the source file. For organizational convenience, it is a good idea to put related files in the same package. When executing a Java program, the Java Virtual Machine (the run-time environment) will search a specific set of directories (called the classpath) for the relevant class files. The documentation for your local development environment will describe how to specify the classpath.

The third line in Listing 2.1 declares the class name, FirstFallingBallApp. The Java convention is to begin a class name with an uppercase letter. If a name consists of more than one word, the words are joined together, and each succeeding word begins with an uppercase letter (another Java convention). The keyword public means that this class can be used by any other Java class. Braces are used to delimit a block of code. The left brace {, after the name of the class, begins the body of the class definition, and the corresponding right brace }, on line 35 at the end of the listing, ends the class definition.

The fourth line in Listing 2.1 begins the definition of the main method. A method describes a sequence of actions that use the associated data and can be called (invoked) within the class or by other classes. The main method has a special status in Java. To run a class as a stand-alone program (an application), the class must define the main method. (In contrast, a Java applet runs inside a browser and does not require a main method; instead, it has methods such as init and start.)
The main method is the application's starting point. The argument of the main method will always be the same, and understanding its syntax is not necessary here. Because the code for this book contains hundreds of classes, we will adopt our own convention that classes that define main methods have names that end with App. We sometimes refer to an application that we are about to run as the target class. Familiarize yourself with your Java development environment by doing Exercise 2.3.

Exercise 2.3. Our first application

(a) Enter the listing of FirstFallingBallApp into a source file named FirstFallingBallApp.java. (Java programs can be written using any text editor that supports standard ASCII characters.) Be sure to pay attention to capitalization because Java is case sensitive. In what directory should you place the source file?

(b) Compile and run FirstFallingBallApp. Do the results look reasonable to you? In what directory did the compiler place the byte code?

Digital computers represent numbers in base 2, that is, as sequences of ones and zeros. Each one or zero is called a bit. For example, the number 13 is equivalent to 1101 or (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰). It would be difficult to write a program if we had to write numbers in base 2. Computer languages allow us to reference memory locations using identifiers or variable names. A valid variable name is a series of characters consisting of letters, digits, underscores, and dollar signs ($) that does not begin with a digit nor contain any spaces. Because Java distinguishes between uppercase and lowercase characters, T and t are different variable names. The Java convention is that variable names begin with a lowercase letter, except in special cases, and each succeeding word in a variable name begins with an uppercase letter.

In a purely object-oriented language, all variables would be objects introduced by their class definitions. However, certain variable types are so common that they have a special status and are especially easy to create and access. These types are called primitive data types and represent integer, floating point, boolean, and character variables. An example that illustrates that classes are effectively new programmer-defined types is given in Appendix 2A. An integer variable, a floating point variable, a boolean variable, and a character variable are created and initialized by the following statements:

int n = 10;
double y0 = 10.0;
boolean inert = true;
char c = 'A'; // used for single characters

There are four integer types, byte, short, int, and long, and two floating point types, float and double; the types differ in the range of numbers that they can store. We will almost always use type int because it does not require as much memory as type long, and we will always use type double, the floating point type with greater precision, to minimize roundoff error and to avoid having to provide multiple versions of various algorithms. A variable must be declared before it can be used, and it can be initialized at the same time that its type is declared, as is done in Listing 2.1.

Integer arithmetic is exact, in contrast to floating point arithmetic, which is limited by the maximum number of decimal places that can be stored. Important uses of integers are as counters in loops and as indices of arrays. An example of the latter is on page 38, where we discuss the motion of many balls.
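The difference between exact integer arithmetic and finite precision floating point arithmetic is easy to demonstrate at the keyboard. The following short program is our own illustration (the class name RoundoffApp is ours, not from the text); the values in the comments are what Java actually prints.

public class RoundoffApp { // our own demonstration class
  public static void main(String[] args) {
    int sumInt = 1 + 2 + 3; // integer arithmetic is exact
    System.out.println(sumInt == 6); // displays true
    double sumDouble = 0.1 + 0.2; // floating point arithmetic has roundoff error
    System.out.println(sumDouble); // displays 0.30000000000000004
    System.out.println(sumDouble == 0.3); // displays false
    // compare floating point numbers using a tolerance instead of ==
    System.out.println(Math.abs(sumDouble - 0.3) < 1e-12); // displays true
  }
}

Because of such roundoff, it is usually a mistake to test two floating point numbers for exact equality; test instead whether their difference is smaller than a suitable tolerance.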
A subtle and common error is to use integers in division when a floating point number is needed. For example, suppose we flip a coin 100 times and find 53 heads. What is the percentage of heads? In the following we show an unintended side effect of integer division and several ways of obtaining a floating point number from an integer.

int heads = 53;
int tosses = 100;
double percentage = heads/tosses;   // percentage will equal 0
percentage = (double) heads/tosses; // percentage will equal 0.53
percentage = (1.0*heads)/tosses;    // percentage will equal 0.53

These statements indicate that if at least one number is a double, the result of the division will be a double. The expression (double) heads is called a cast and converts heads to a double. Because a number with a decimal point is treated as a double, we can also do this conversion by first multiplying heads by 1.0, as is done in the last statement.

Note that we have used the assignment operator, which is the equal (=) sign. This operator assigns the value to the memory location that is associated with a variable, such as y0 and t. The following statements illustrate an important difference between the equal sign in mathematics and the assignment operator in most programming languages.

int x = 10;
x = x + 1;

The equal sign replaces a value in memory and is not a statement of equality. The left and right sides of an assignment operator are usually not equal.

A statement is analogous to a complete sentence, and an expression is similar to a phrase. The simplest expressions are identifiers or variables. More interesting expressions can be created by combining variables using operators, such as the following example of the plus (+) operator:

x + 3.0

Lines 12 through 18 of Listing 2.1 declare and initialize variables. If an instance variable is declared but not initialized, for example,

double dt;

then the default value of the variable is 0 for numbers and false for boolean variables. (Local variables declared inside a method have no default value and must be initialized before they are used.) It is a good idea to initialize all variables explicitly and not rely on their default values.

A very useful control structure is the for loop in line 20 of Listing 2.1. Loops are blocks of statements that are executed repeatedly until some condition is satisfied. They typically require the initialization of a counter variable, a test to determine if the counter variable has reached its terminal value, and a rule for changing the counter variable. These three parts of the for loop are contained within parentheses and are separated by semicolons. It is common in Java to iterate from 0 to 99, as is done in Listing 2.1, rather than from 1 to 100. Note the use of the ++ operator in the loop construct rather than the equivalent statement n = n + 1. It is important to indent all the statements within a block so that they can be easily identified. Java ignores these spaces, but they are important visual cues to the structure of the program.

After the program finishes the loop, the result is displayed using the System.out.println method. We will explain the meaning of this syntax later. The parameter passed to this method, which appears between the parentheses, is a String. A String is a sequence of characters and can be created by enclosing text in quotation marks, as shown in the first println statement in Listing 2.1. We displayed our numerical results by using the + operator.
When the + operator is applied to a String and a number, the number is converted to the appropriate String, and the two strings are concatenated (joined). This use is shown in the next three println statements in lines 27, 29, and 33 of Listing 2.1. Note the different outputs produced by the following two statements:

System.out.println(("x = " + 2) + 3);  // displays x = 23
System.out.println("x = " + (2 + 3)); // displays x = 5

The parentheses in the second line force the compiler to treat the enclosed + operator as the addition operator, but both + operators in the first line are treated as concatenation operators.

Exercise 2.4. Exploring FirstFallingBallApp

(a) Run FirstFallingBallApp for various values of the time step ∆t. Do the numerical results become closer to the analytic results as ∆t is made smaller?

(b) Use an acceptable value for ∆t and run the program for various values of the number of iterations. What criteria do you have for acceptable? At approximately what time does the ball hit the ground at y = 0?

(c) What happens if you replace the System.out.println method by the System.out.print method?

(d) What happens if you try to access the value of the counter variable n outside the for loop? The scope of n extends from its declaration to the end of the loop block; n is said to have block scope. If a loop variable is not needed outside the loop, it should be declared in the initialization expression so that its scope is limited.

You might have found that doing Exercise 2.4 was a bit tedious and frustrating. To do Exercise 2.4(a) it would be desirable to change the number of iterations at the same time as the value of ∆t so that we could compare the results for y and v at the same final time. And it is difficult to do Exercise 2.4(b) because we don't know in advance how many iterations are needed to reach the ground. For starters we can improve FirstFallingBallApp by using a while statement instead of the for loop:

while (y > 0) {
  // statements go here
}

In this example the boolean test for the while statement is done at the beginning of each pass through the loop. It is also possible to do the test at the end:

do {
  // statements go here
} while (y > 0);

Exercise 2.5. Using while statements

Modify FirstFallingBallApp so that a while statement is used and the program ends when the ball hits the ground at y = 0. Then repeat Exercise 2.4(b).

Exercise 2.6. Summing a series

(a) Write a program to sum the following series for a given value of N:

S = Σ_{m=1}^{N} 1/m².  (2.10)

The following statements may be useful:

double sum = 0; // sum is equivalent to S in (2.10)
for (int m = 1; m <= N; m++) {
  sum = sum + 1.0/(m*m); // put this statement in the for loop
}

Note that in this case it is more convenient to start the loop from m = 1 instead of m = 0. Also note that we have not followed the Java convention, because we have used the variable name N instead of n so that the Java statements look more like the mathematical equations.

(b) First run your program with N = 10. Then run for larger values of N. Does the series converge as N → ∞? What value of N is needed to obtain S to within two decimal places?

(c) Modify your program to use a while loop so that the summation continues until the added term is less than some value ε. Run your program for ε = 10^-2, 10^-3, and 10^-6.
(d) Instead of using the = operator in the statement

sum = sum + 1.0/(m*m);

use the equivalent shortcut operator:

sum += 1.0/(m*m);

Check that you obtain the same results.

Java provides several shortcut assignment operators that allow you to combine an arithmetic and an assignment operation. Table 2.1 shows the operators that we will use most often.

Operator | Operand | Description | Sample Expression | Result
++, -- | number | increment, decrement | x++; | 8.0 stored in x
+, - | numbers | addition, subtraction | 3.5 + x | 11.5
! | boolean | logical complement | !(x == y) | true
= | any | assignment | y = 3; | 3.0 stored in y
*, /, % | numbers | multiplication, division, modulus | 7/2 | 3
== | any | test for equality | x == y | false
+= | numbers | x += 3; is equivalent to x = x + 3; | x += 3; | 14.5 stored in x
-= | numbers | x -= 2.3; is equivalent to x = x - 2.3; | x -= 2.3; | 12.2 stored in x
*= | numbers | x *= 4; is equivalent to x = 4*x; | x *= 4; | 48.8 stored in x
/= | numbers | x /= 2; is equivalent to x = x/2; | x /= 2; | 24.4 stored in x
%= | numbers | x %= 5; is equivalent to x = x % 5; | x %= 5; | 4.4 stored in x

Table 2.1: Common operators. The result for each row assumes that the statements from previous rows have been executed, with double x = 7, y = 3 declared initially. The mod or modulus operator % computes the remainder after division by an integer.

2.3 Getting Started with Object-Oriented Programming

The first step in making our program more object-oriented is to separate the implementation of the model from the implementation of other programming tasks such as producing output. In general, we will do so by creating two classes. The class that defines the model is shown in Listing 2.2. The FallingBall class first declares several (instance) variables and one constant that can be used by any method in the class. To aid reusability, we need to be very careful about the accessibility of these class variables to other classes. For example, if we were to write private double dt, then the value of dt would be available only to the methods in FallingBall. If we wrote public double dt, then dt would be available to any class in any package that tried to access it. For our purposes we will use the default package protection, which means that the instance variables can be accessed by classes in the same package.

Listing 2.2: FallingBall class.

package org.opensourcephysics.sip.ch02;

public class FallingBall {
  double y, v, t; // instance variables
  double dt;      // default package protection
  final static double g = 9.8;

  public FallingBall() { // constructor
    System.out.println("A new FallingBall object is created.");
  }

  public void step() {
    y = y + v*dt; // Euler algorithm for numerical solution
    v = v - g*dt;
    t = t + dt;
  }

  public double analyticPosition(double y0, double v0) {
    return y0 + v0*t - 0.5*g*t*t;
  }

  public double analyticVelocity(double v0) {
    return v0 - g*t;
  }
}

As we will see, a class is a blueprint for creating objects, not an object itself. Except for the constant g, all the variable declarations in Listing 2.2 are instance variables. Each time an object is created or instantiated from the class, a separate block of memory is set aside for the instance variables. Thus, two objects created from the same class will, in general, have different values of the instance variables. We can ensure that the value of a variable is the same for all objects created from the class by adding the word static to the declaration.
Such a variable is called a class variable and is appropriate for the constant g. In addition, you might not want the quantity referred to by an identifier to change. For example, g is a constant of nature. We can prevent a change by adding the keyword final to the declaration. Thus the statement

final static double g = 9.8;

means that a single copy of the constant g will be created and shared among all the objects instantiated from the class. Without the final qualifier, we could change the value of a class variable in every instantiated object by changing it in any one object. Static variables and methods are accessed from another class using the class name without first creating an instance (see page 25).

Another Java convention is that the names of constants should be in uppercase. But in physics g, the gravitational field, and G, the gravitational constant, have completely different meanings. So we will disregard this convention when doing so makes our programs more readable.

We have used certain words such as double, false, static, and final. These reserved words cannot be used as variable names and are examples of keywords.

In addition to the four instance variables y, v, t, and dt, and one class variable g, the FallingBall class has four methods. The first method is FallingBall, a special method known as the constructor. A constructor must have the same name as the class and does not have an explicit return type. We will see that constructors allocate memory and initialize instance variables when an object is created. The second method is step, a name that we will frequently use to advance a system's coordinates by one time step. The qualifier void means that this method does not return a value. The next two methods, analyticPosition and analyticVelocity, each return a double value and have arguments enclosed by parentheses, the parameter list. The list of parameters and their types must be given explicitly and be separated by commas. The parameters can be primitive data types or class types. When a method is invoked, the argument types must match those given in the definition or be convertible into the types given in the definition, but the arguments need not have the same names. (Convertible means that the given variable can be unambiguously converted into another data type. For example, an integer can always be converted into a double.) For example, we can write

double y0 = 10; // declaration and assignment
int v0 = 0;     // note that v0 is an integer
// v0 becomes a double before the method is called
double y = analyticPosition(y0, v0);
double v = analyticVelocity(v0);

but the following statements are incorrect:

// can't convert a String to a double automatically
double y = analyticPosition(y0, "0");
// method expects only one argument
double v = analyticVelocity(v0, 0);

If a method does not receive any parameters, the parentheses are still required, as in the method step().

The FallingBall class in Listing 2.2 cannot be used in isolation because it does not contain a main method. Thus, we create a target class, which we place in a separate file in the same package. This class will communicate with FallingBall and include the output statements. It is shown in Listing 2.3.

Listing 2.3: FallingBallApp class.
// package statement appears before the beginning of the class definition
package org.opensourcephysics.sip.ch02;

// beginning of class definition
public class FallingBallApp {
  // beginning of method definition
  public static void main(String[] args) {
    // declaration and instantiation
    FallingBall ball = new FallingBall();
    // example of declaration and assignment statement
    double y0 = 10;
    double v0 = 0;
    // note use of dot operator to access instance variables
    ball.t = 0;
    ball.dt = 0.01;
    ball.y = y0;
    ball.v = v0;
    while (ball.y > 0) {
      ball.step();
    }
    System.out.println("Results");
    System.out.println("final time = " + ball.t);
    // displays numerical results
    System.out.println("y = " + ball.y + " v = " + ball.v);
    // displays analytic results
    System.out.println("analytic y = " + ball.analyticPosition(y0, v0));
    System.out.println("analytic v = " + ball.analyticVelocity(v0));
    System.out.println("acceleration = " + FallingBall.g);
  } // end of method definition
} // end of class definition

Note how FallingBall is declared and instantiated by creating an object called ball, and how the instance variables and the methods are accessed. The statement

FallingBall ball = new FallingBall(); // declaration and instantiation

is equivalent to two statements:

FallingBall ball;         // declaration
ball = new FallingBall(); // instantiation

The declaration statement tells the compiler that the variable ball is of type FallingBall. It is analogous to the statement int x for an integer variable. The new operator allocates memory for this object, initializes all the instance variables, and invokes the constructor. We can create two identical balls using the following statements:

FallingBall ball1 = new FallingBall();
FallingBall ball2 = new FallingBall();

The variables and methods of an object are accessed by using the dot operator. For example, the variable t of object ball is accessed by the expression ball.t, and the method step is called as ball.step(). Because the methods analyticPosition and analyticVelocity return values of type double, they can appear in any expression in which a double-valued constant or variable can appear. In the present context the values returned by these two methods will be displayed by the println statements. Note that the static variable g in class FallingBallApp is accessed through the class name.

Exercise 2.7. Use of two classes

(a) Enter the listing of FallingBall into a file named FallingBall.java and FallingBallApp into a file named FallingBallApp.java, and put them in the same directory. Run your program and make sure your results are the same as those found in Exercise 2.5.

(b) Modify FallingBallApp by adding a second instance variable ball2 of the same type as ball. Add the necessary code to initialize ball2, iterate ball2, and display the results for both objects. Write your program so that the only difference between the two balls is the value of ∆t. How much smaller does ∆t have to be to reduce the error in the numerical results by a factor of two for the same final time? What about a factor of four? How does the error depend on ∆t?

(c) Add the statement FallingBall.g = 2.0 to your program from part (b) and use the same value of dt for ball and ball2. What happens when you try to compile the program?
(d) Delete the final qualifier for g in FallingBall, and recompile and run your program. Is there any difference between the results for the two balls? Is there a difference between the results compared to what you found for g = 9.8?

(e) Remove the qualifier static. Now g must be accessed using the object name, ball or ball2, instead of FallingBall. Recompile and run your program again. How do the results for the two balls compare now?

(f) Explain in your own words the meaning of the qualifiers static and final.

It is possible for a class to have more than one constructor. For example, we could have a second constructor defined by

public FallingBall(double dt) {
  // "this.dt" refers to the instance variable that has the
  // same name as the argument
  this.dt = dt;
}

Note the possible confusion between the variable name dt in the argument of the FallingBall constructor and the variable defined near the beginning of the FallingBall class. A variable that is passed to a method as an argument (parameter) or that is defined (created) within a method is known as a local variable. A variable that is defined outside of a method is known as an instance variable. Instance variables are more powerful than local variables because they can be referenced (used) anywhere within an object, and because their values are not lost when the execution of the method is finished. When a variable name conflict occurs, it is necessary to use the keyword this to access the instance variable. Otherwise, the program would access the local variable with the same name (here the argument).

Exercise 2.8. Multiple constructors

(a) Add a second constructor with the argument double dt to FallingBall, but make no other changes. Run your program. Nothing changed because you didn't use the new constructor.

(b) Now modify FallingBallApp to use the new constructor:

// declaration and instantiation
FallingBall ball = new FallingBall(0.01);

What statement in FallingBallApp can now be removed? Run your program and make sure it works. How can you tell that the new constructor was used?

(c) Show that the number of parameters and their types in the argument list determine which constructor is used in FallingBall. For example, show that the statements

double tau = 0.01;
// declaration and instantiation
FallingBall ball = new FallingBall(tau);

are equivalent to the syntax used in part (b).

It is easy to create additional models for other kinds of motion. Cut and paste the code in FallingBall into a new file named SHO.java, and change the code to solve the following two first-order differential equations for a ball attached to a spring:

dx/dt = v  (2.11a)
dv/dt = −(k/m)x,  (2.11b)

where x is the displacement from equilibrium and k is the spring constant. Note that the new class, shown in Listing 2.4, has a structure similar to that of the class shown in Listing 2.2.

Listing 2.4: SHO class.

package org.opensourcephysics.sip.ch02;

public class SHO {
  double x, v, t;
  double dt;
  double k = 1.0;               // spring constant
  double omega0 = Math.sqrt(k); // assume unit mass

  public SHO() { // constructor
    System.out.println("A new harmonic oscillator object is created.");
  }

  public void step() { // modified Euler algorithm
    v = v - k*x*dt;
    x = x + v*dt; // note that the updated v is used
    t = t + dt;
  }
  public double analyticPosition(double y0, double v0) {
    return y0*Math.cos(omega0*t) + v0/omega0*Math.sin(omega0*t);
  }

  public double analyticVelocity(double y0, double v0) {
    return -y0*omega0*Math.sin(omega0*t) + v0*Math.cos(omega0*t);
  }
}

Exercise 2.9. Simple harmonic oscillator

(a) Explain how the implementation of the Euler algorithm in the step method of class SHO differs from what we did previously.

(b) The general form of the analytic solution of (2.11) can be expressed as

y(t) = A cos ω₀t + B sin ω₀t,  (2.12)

where ω₀² = k/m. What is the form of v(t)? Show that (2.12) satisfies (2.11) with A = y(t = 0) and B = v(t = 0)/ω₀. These analytic solutions are used in class SHO.

(c) Write a target class called SHOApp that creates an SHO object and solves (2.11). Start the ball with displacements of x = 1, x = 2, and x = 4. Is the time it takes for the ball to reach x = 0 always the same?

The methods that we have written so far have been nonstatic methods (except for main). As we have seen, these methods cannot be used without first creating or instantiating an object. In contrast, static methods can be used directly without first creating an object. A class that is included in the core Java distribution and that we will use often is the Math class, which provides many common mathematical methods, including trigonometric, logarithmic, exponential, and rounding operations, and predefined constants. Some examples of the use of the Math class follow:

double theta = Math.PI/4;     // the constant pi is defined in the Math class
double u = Math.sin(theta);   // sine of theta
double v = Math.log(0.1);     // natural logarithm of 0.1
double w = Math.pow(10, 0.4); // 10 to the 0.4 power
double x = Math.atan(3.0);    // inverse tangent

Note the use of the dot notation in these statements and the Java convention that constants such as the value of π are written in uppercase letters, that is, Math.PI. Exercise 2.10 asks you to read the Math class documentation to learn about the methods in the Math class. To use these methods we need only to know what mathematical functions they compute; we do not need to know the details of how the methods are implemented.

Exercise 2.10. The Math class

The documentation for Java is a part of most development environments and can also be downloaded from the Java website. Look for the API documentation for the latest standard edition.

(a) Read the documentation of the Math class and describe the difference between the two versions of the arctangent method.

(b) Write a program to verify the output of several of the methods in the Math class.

2.4 Inheritance

The falling ball and the simple harmonic oscillator have important features in common. Both are models of physical systems that represent a physical object as if all its mass were concentrated at a single point. Writing two separate classes by cutting and pasting is straightforward and reasonable because the programs are small and easy to understand. But this approach fails when the code becomes more complex. For example, suppose that you wish to simulate a model of a liquid consisting of particles that interact with one another according to some specified force law. Because such simulations are now standard (see Chapter 8), efficient code for such simulations is available. In principle, it would be desirable to use an already written program, assuming that you understood the nature of such simulations. However, in practice, using someone else's program can require much effort if the code is not organized properly.
Fortunately, this situation is changing as more programmers learn object-oriented techniques and write their programs so that they can be used by others without needing to know the details of the implementation.

For example, suppose that you decided to modify an already existing program by changing to a different force law. You change the code and save it under a new name. Later you discover that you need a different numerical algorithm to advance the particles' positions and velocities. You again change the code and save the file under yet another name. At the same time, the original author discovers a bug in the initialization method and changes her code. Your code is now out of date because it does not contain the bug fix. Although strict documentation and programming standards can minimize these types of difficulties, a better approach is to use object-oriented features such as inheritance. Inheritance avoids duplication of code and makes it easier to debug a number of classes without needing to change each class separately.

We now write a new class that encapsulates the common features of the falling ball and the simple harmonic oscillator. We name this new class Particle. The falling ball and harmonic oscillator classes that we will define later implement their distinguishing features.

Listing 2.5: Particle class.

package org.opensourcephysics.sip.ch02;

abstract public class Particle {
  double y, v, t; // instance variables
  double dt;      // time step

  public Particle() { // constructor
    System.out.println("A new Particle is created.");
  }

  abstract protected void step();

  abstract protected double analyticPosition();

  abstract protected double analyticVelocity();
}

The abstract keyword allows us to define the Particle class without knowing how the step, analyticPosition, and analyticVelocity methods will be implemented. Abstract classes are useful in part because they serve as templates for other classes. An abstract class contains some, but not all, of what a user will need. By making the class abstract, we must express the abstract idea of "particle" explicitly and customize the abstract class to our needs.

By using inheritance we now extend the Particle class (the superclass) to another class (the subclass). The FallingParticle class shown in Listing 2.6 implements the three abstract methods. Note the use of the keyword extends. We also have used a constructor with the initial position and velocity as arguments.

Listing 2.6: FallingParticle class.

package org.opensourcephysics.sip.ch02;

public class FallingParticle extends Particle {
  final static double g = 9.8;   // constant
  private double y0 = 0, v0 = 0; // initial position and velocity

  public FallingParticle(double y, double v) { // constructor
    System.out.println("A new FallingParticle object is created.");
    this.y = y; // instance value set equal to passed value
    this.v = v; // instance value set equal to passed value
    y0 = y;     // no need to use "this" because there is only one y0
    v0 = v;
  }

  public void step() {
    y = y + v*dt; // Euler algorithm
    v = v - g*dt;
    t = t + dt;
  }

  public double analyticPosition() {
    return y0 + v0*t - (g*t*t)/2.0;
  }

  public double analyticVelocity() {
    return v0 - g*t;
  }
}

FallingParticle is a subclass of its superclass Particle.
Because the methods and data of the superclass are available to the subclass (except those that are explicitly labeled private), FallingParticle inherits the variables y, v, t, and dt.¹ We now write a target class to make use of our new abstraction. Note that we create a new FallingParticle but assign it to a variable of type Particle.

Listing 2.7: FallingParticleApp class.

package org.opensourcephysics.sip.ch02;

// beginning of class definition
public class FallingParticleApp {
  // beginning of method definition
  public static void main(String[] args) {
    // declaration and instantiation
    Particle ball = new FallingParticle(10, 0);
    ball.t = 0;
    ball.dt = 0.01;
    while (ball.y > 0) {
      ball.step();
    }
    System.out.println("Results");
    System.out.println("final time = " + ball.t);
    // numerical result
    System.out.println("y = " + ball.y + " v = " + ball.v);
    // analytic result
    System.out.println("y analytic = " + ball.analyticPosition());
  } // end of method definition
} // end of class definition

¹In this case Particle and FallingParticle must be in the same package. If FallingParticle were in a different package, it would be able to access these variables only if they were declared protected or public.

Problem 2.11. Inheritance

(a) Run the FallingParticleApp class. How can you tell that the constructor of the superclass was called?

(b) Rewrite the SHO class so that it is a subclass of Particle. Remove all unnecessary variables and implement the abstract methods.

(c) Write the target class SHOParticleApp to use the new SHOParticle class. Use the analyticPosition and analyticVelocity methods to compare the accuracy of the numerical and analytic answers in both the falling particle and harmonic oscillator models.

(d) Try to instantiate a Particle directly by calling the Particle constructor. Explain what happens when you compile this program.

If you examine the console output in Problem 2.11a, you should find that whenever an object from the subclass is instantiated, the constructor of the superclass is executed as well as the constructor of the subclass. You will also find that an abstract class cannot be instantiated directly; it must be extended first.

Exercise 2.12. Extending classes

(a) Extend the FallingParticle and SHOParticle classes and give them names such as FallingParticleEC and SHOParticleEC, respectively. These subclasses should redefine the step method so that it first calculates the new velocity and then calculates the new position using the new velocity, that is,

public void step() {
  v = v - g*dt; // falling ball
  y = y + v*dt;
  t = t + dt;
}

public void step() {
  v = v - k*x*dt; // harmonic oscillator
  x = x + v*dt;
  t = t + dt;
}

Methods can be redefined (overridden) in a subclass by writing a new method in the subclass definition with the same name and parameter list as in the superclass definition.

(b) Confirm that your new step method is executed instead of the one in the superclass.

(c) The algorithm implemented in the redefined step method is known as the Euler–Cromer algorithm. Compare the accuracy of this algorithm to that of the original Euler algorithm for both the falling particle and the harmonic oscillator. We will explore the Euler–Cromer algorithm in more detail in Problem 3.1.
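To make the mechanics of overriding concrete, here is a minimal sketch of what the FallingParticleEC subclass of Exercise 2.12(a) might look like. The class name comes from the exercise, but the body is ours, so treat it as a hint rather than as the book's implementation. Note that constructors are not inherited in Java, so the subclass must define a constructor that passes the initial values to its superclass with super.

package org.opensourcephysics.sip.ch02;

public class FallingParticleEC extends FallingParticle {

  public FallingParticleEC(double y, double v) {
    super(y, v); // constructors are not inherited; pass the initial values along
  }

  public void step() { // Euler-Cromer: update v first, then use the new v
    v = v - g*dt;
    y = y + v*dt;
    t = t + dt;
  }
}

With this class in place, changing new FallingParticle(10, 0) to new FallingParticleEC(10, 0) in Listing 2.7 is the only modification needed to run the new algorithm, because a FallingParticleEC is still a Particle.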
The falling particle and harmonic oscillator programs are simple, but they demonstrate important object-oriented concepts. However, we typically will not build our models using inheritance, because our focus is on the physics and not on producing a software library, and also because readers will not use our programs in the same order. We will find that our main use of inheritance will be to extend abstract classes in the Open Source Physics library to implement calculations and simulations by customizing a small number of methods.

So far our target classes have included only one method, main. We could have used more than one method, but for the short demonstration and test programs we have written so far, such a practice is unnecessary. When you send a short email to a friend, you are not likely to break up your message into paragraphs. But when you write a paper longer than about half a page, it makes sense to use more than one paragraph. The same sensitivity to the need for structure should be used in programming. Most of the programs in the following chapters will consist of two classes, each of which will have several instance variables and methods.

Figure 2.1: An Open Source Physics control that is used to input parameter values and display results.

2.5 The Open Source Physics Library

For each exercise in this chapter, you have had to change the program, compile it, and then run it. It would be much more convenient to input initial conditions and values for the parameters without having to recompile. However, a discussion of how to make input fields and buttons using Java would distract us from our goal of learning how to simulate physical systems. Moreover, the code we would use for input (and output) would be almost the same in every program. For this reason input and output should be in separate classes so that we can easily use them in all our programs. Our emphasis will be to describe how to use the Open Source Physics library as a tool for writing graphical interfaces, plotting graphs, and doing visualizations. If you are interested, you can read the source code of the many Open Source Physics classes and can modify or subclass them to meet your special needs.

We first introduce the Open Source Physics library in several simple contexts. Download the library from the comPADRE website and include it in your development environment. The following program illustrates how to make a simple plot.

Listing 2.8: An example of a simple plot.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.frames.PlotFrame;

public class PlotFrameApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("x", "sin(x)/x", "Plot example");
    for (int i = 1; i <= 100; i++) {
      double x = i*0.2;
      frame.append(0, x, Math.sin(x)/x);
    }
    frame.setVisible(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

The import statement tells the Java compiler where to find the Open Source Physics classes that are needed. A frame is often referred to as a window and can include a title and a menu bar, as well as objects such as buttons, graphics, and text information. The Open Source Physics frames package defines several frames that contain data visualization and analysis tools. We will use the PlotFrame class to plot x-y data.
The constructor for PlotFrame has three arguments, corresponding to the name of the horizontal axis, the name of the vertical axis, and the title of the plot. To add data to the plot, we use the append method. The first argument of append is an integer that labels a particular set of data points, the second argument is the horizontal (x) value of the data point, and the third argument is the vertical (y) value. The setVisible(true) method makes the frame appear on the screen or brings it to the front. The last statement makes the program exit when the frame is closed. What happens when this statement is not included?

The example from the Open Source Physics library in Listing 2.9 illustrates how to control a calculation with two buttons, determine the value of an input parameter, and display the result in the text message area.

Listing 2.9: An example of a Calculation.

package org.opensourcephysics.sip.ch02;
// gets all classes in the controls package
import org.opensourcephysics.controls.*;

public class CalculationApp extends AbstractCalculation {

  public void calculate() { // does a calculation
    control.println("Calculation button pressed.");
    // string must match the argument of setValue
    double x = control.getDouble("x value");
    control.println("x*x = " + (x*x));
    control.println("random = " + Math.random());
  }

  public void reset() {
    // describes a parameter and sets its value
    control.setValue("x value", 10.0);
  }

  // creates a calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new CalculationApp());
  }
}

AbstractCalculation is an abstract class, which, as we have seen, means that it cannot be instantiated directly and must be extended; in particular, you must write (implement) the calculate method. You can also write an optional reset method, which is called whenever the Reset button is clicked. Finally, we need to create a graphical user interface that will invoke methods when the Calculate and Reset buttons are clicked. This user interface is an object of type CalculationControl:

CalculationControl.createApp(new CalculationApp());

The method createApp is a static method that instantiates an object of type CalculationControl and returns this object. We could have written

CalculationControl control = CalculationControl.createApp(new CalculationApp());

which shows explicitly the returned object, which we have named control. However, because we do not use the object control explicitly in the main method, we do not need to declare an object name for it.

Exercise 2.13. CalculationApp

Compile and run CalculationApp. Describe what the graphical user interface looks like and how it works by clicking the buttons (see Figure 2.1).

The reset method is called automatically when a program is first created and whenever the Reset button is clicked. The purpose of this method is to clear old data and recreate the initial state with the default values of the parameters and instance variables. The default values of the parameters are displayed in the control window so that they can be changed by the user. An example of how to show values in a control follows:

public void reset() {
  // describes a parameter and sets its value
  control.setValue("x value", 10.0);
}
The string appearing in the setValue method must be identical to the one appearing in the getDouble method. If you write your own reset method, it will override the reset method that is already defined in the AbstractCalculation superclass. After the reset method stores the parameters in the control, the user can edit the parameters, and we can later read these parameters in the calculate method:

public void calculate() {
  // string must match argument of setValue
  double x = control.getDouble("x value");
  control.println("x*x = " + (x*x));
}

Exercise 2.14. Changing parameters

(a) Run CalculationApp to see how the control window can be used to change a program's parameters. What happens if the string in the getDouble method does not match the string in the setValue method?

(b) Incorporate the plot statements in Listing 2.8 into a class that extends the AbstractCalculation class and plots the function sin(kx) for various values of the input parameter k.

When you run the modified CalculationApp in Exercise 2.14, you should see a window with two buttons and an input parameter and its default value. Also, there should be a text area below the buttons where messages can appear. When the Calculate button is clicked, the calculate method is executed. The control.getDouble method reads in values from the control window. These values can be changed by the user. Then the calculation is performed and the result displayed in the message area using the control.println method, similar to the way we used System.out.println earlier. If the Reset button is clicked, the message area is cleared and the reset method is called.

We will now use a CalculationControl to change the input parameters for a falling particle. The modified FallingParticleApp is shown in Listing 2.10.

Listing 2.10: FallingParticleCalcApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;

// beginning of class definition
public class FallingParticleCalcApp extends AbstractCalculation {
  public void calculate() {
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // sets initial conditions
    Particle ball = new FallingParticle(y0, v0);
    // reads parameter and sets dt
    ball.dt = control.getDouble("dt");
    while(ball.y>0) {
      ball.step();
    }
    control.println("final time = "+ball.t);
    // displays numerical results
    control.println("y = "+ball.y+" v = "+ball.v);
    // displays analytic position
    control.println("analytic y = "+ball.analyticPosition());
    // displays analytic velocity
    control.println("analytic v = "+ball.analyticVelocity());
  }

  public void reset() {
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // creates a calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticleCalcApp());
  }
} // end of class definition

Exercise 2.15. Input of parameters and initial conditions

(a) Run FallingParticleCalcApp and make sure you understand how the control works. Try inputting different values of the parameters and the initial conditions.
(b) Vary ∆t and find the value of t when y = 0 to two decimal places.

Exercise 2.16. Displaying floating point numbers

Double precision numbers store approximately 16 significant digits, and every digit is included when the number is converted to a string. We can reduce the number of digits that are displayed using the DecimalFormat class in the java.text package. A formatter is created using a pattern, such as #0.00 or #0.00E0, and this format is applied to a number to produce a string.

import java.text.DecimalFormat;

DecimalFormat decimal2 = new DecimalFormat("#0.00");
double x = 1.0/3.0;
System.out.println("x = "+decimal2.format(x)); // displays x = 0.33

(a) Use the DecimalFormat class to modify the output from FallingParticleCalcApp so that it matches the output shown in Figure 2.1.

(b) Modify the output so that results are shown using scientific notation with three decimal places.

(c) The Open Source Physics ControlUtils class in the controls package contains a static method f3 that formats a floating point number using three decimal places. Use this method to format the output from FallingParticleCalcApp.

You probably have found that it is difficult to write a program so that it ends exactly when the falling ball is at y = 0. We could write the program so that ∆t keeps changing near y = 0 so that the last value computed is at y = 0. Another limitation of the programs that we have written so far is that they show the results only at the end of the calculation. We could put println statements inside the while loop, but it would be better to plot the results and have a table of the data. An example is shown in Listing 2.11.

Listing 2.11: FallingParticlePlotApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class FallingParticlePlotApp extends AbstractCalculation {
  PlotFrame plotFrame = new PlotFrame("t", "y", "Falling Ball");

  public void calculate() {
    // data not cleared at beginning of each calculation
    plotFrame.setAutoclear(false);
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // sets initial conditions
    Particle ball = new FallingParticle(y0, v0);
    // gets parameters
    ball.dt = control.getDouble("dt");
    double t = ball.t; // gets value of time from ball object
    while(ball.y>0) {
      ball.step();
      plotFrame.append(0, ball.t, ball.y);
      plotFrame.append(1, ball.t, ball.analyticPosition());
    }
  }

  public void reset() {
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // sets up calculation control structure using this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticlePlotApp());
  }
}

The two data sets, indexed by 0 and 1, correspond to the numerical data and the analytic results, respectively. The default action in the Open Source Physics library is to clear the data and redraw the data frames when the Calculate button is clicked. This automatic clearing of data can be disabled using the setAutoclear method. We have disabled it here to allow the user to compare the results of multiple calculations. Data is automatically cleared when the Reset button is clicked.

Exercise 2.17. Data output

(a) Run FallingParticlePlotApp. Under the Views menu choose DataTable to see a table of data corresponding to the plot. You can copy this data and use it in another program for further analysis.

(b) Your plotted results probably look like one set of data because the numerical and analytic results are similar. Let dt = 0.1 and click the Calculate button. Does the discrepancy between the numerical and analytic results become larger with increasing time? Why?

(c) Run the program for two different values of dt. How do the plot and the table of data differ when the two runs are done without clicking Reset between them, and when Reset is clicked between calculations? Make sure you look at the entire table to see the difference. When is the data cleared? What happens if you eliminate the plotFrame.setAutoclear(false) statement? When is the data cleared now?

(d) Modify your program so that the velocity is shown in a separate window from the position (see the sketch following this exercise).
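One way to approach part (d) is to add a second PlotFrame for the velocity alongside the position plot. The following fragment is a minimal sketch, not the book's solution; the name velocityFrame and its placement inside FallingParticlePlotApp are our assumptions.

// a second frame declared as an instance variable (assumed name)
PlotFrame velocityFrame = new PlotFrame("t", "v", "Falling Ball: velocity");

// inside the while loop of calculate():
velocityFrame.append(0, ball.t, ball.v);                  // numerical velocity
velocityFrame.append(1, ball.t, ball.analyticVelocity()); // analytic velocity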
2.6 Animation and Simulation

The AbstractCalculation class provides a structure for doing a single computation for a fixed amount of time. However, frequently we do not know how long we want to run a program, and it would be desirable if the user could intervene at any time. In addition, we would like to be able to visualize the results of a simulation and do an animation. Doing so involves a programming construct called a thread. Threads enable a program to execute statements independently of each other, as if they were run on separate processors (which would be the case on a multiprocessor computer). We will use one thread to update the model and display the results. The other thread, the event thread, will monitor the keyboard and mouse so that we can stop the computation whenever we desire.

The AbstractSimulation class provides a structure for doing simulations by performing a series of computations (steps) that can be started and stopped by the user using a graphical user interface. You do not need to know anything about threads because their use is "hidden" in the AbstractSimulation class. However, it is good to know that the Open Source Physics library is written so that the graphical user interface does not let us change a program's input parameters while the simulation is running.

Most of the programs in the text will be written by extending the AbstractSimulation class and implementing the doStep method, as shown in Listing 2.12. Just as the AbstractCalculation class uses a graphical user interface of type CalculationControl, the AbstractSimulation class uses one of type SimulationControl. This graphical user interface has three buttons whose labels change depending on the user's actions. As was the case with CalculationControl, the buttons in SimulationControl invoke specific methods.

Listing 2.12: A simple example of the extension of the AbstractSimulation class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.AbstractSimulation;
import org.opensourcephysics.controls.SimulationControl;

public class SimulationApp extends AbstractSimulation {
  int counter = 0;

  public void doStep() { // does a simulation step
    control.println("Counter = "+(counter--));
  }

  public void initialize() {
    counter = control.getInt("counter");
  }

  public void reset() { // invoked when Reset button is pressed
    // allows counter to be changed after initialization
    control.setAdjustableValue("counter", 100);
  }

  public static void main(String[] args) {
    // creates a simulation structure using this class
    SimulationControl.createApp(new SimulationApp());
  }
}
getInt ( "counter" ) ; } public void reset ( ) { / / invoked when r e s e t button i s pressed / / allows dt to be changed a f t e r i n i t i a l i z a t o n control . setAdjustableValue ( "counter" , 100); } public s t a t i c void main ( String [ ] args ) { / / c r e a t e s a simulation s t r u c t u r e using t h i s c l a s s SimulationControl . createApp (new SimulationApp ( ) ) ; } } Exercise 2.18. AbstractSimulation class Run SimulationApp and see how it works by clicking the buttons. Explain the role of the various buttons. How many times per second is the doStep method invoked when the simulation is running? The buttons in the SimulationControl that were used in SimulationApp in Listing 2.12 invoke methods in the AbstractSimulation class. These methods start and stop threads and perform other housekeeping chores. When the user clicks the Initialize button, the simulation’s Initialize method is executed. When the Reset button is clicked, the reset method is executed. If you don’t write your own versions of these two methods, their default versions will CHAPTER 2. TOOLS FOR DOING SIMULATIONS 36 be used. After the Initialize button is clicked, it becomes the Start button. After the Start button is clicked, it is replaced by a Stop button, and the doStep method is invoked continually until the Stop button is clicked. The default is that the frames are redrawn every time doStep is executed. Clicking the Step button will cause the doStep method to be executed once. The New button changes the Start button to an Initialize button, which forces the user to initialize a new simulation before restarting. Later we will learn how to add other buttons that give the user even more control over the simulation. A typical simulation needs to (1) specify the initial state of the system in the initialize method, (2) tell the computer what to execute while the thread is running in the doStep method, and (3) specify what state the system should return to in the reset method. We could modify the falling particle model to use AbstractSimulation, but such a modification would not be very interesting because there is only one particle and all motion takes place in one dimension. Instead, we will define a new class that models a ball moving in two dimensions, and we will allow the ball to bounce off the ground and off of the walls. Listing 2.13: BouncingBall class. package org . opensourcephysics . sip . ch02 ; import org . opensourcephysics . display . Circle ; / / C i r c l e i s a c l a s s that can draw i t s e l f public class BouncingBall extends Circle { final s t a t i c double g = 9 . 8 ; final s t a t i c double WALL = 10; / / i n i t i a l p o s i t i o n and v e l o c i t y private double x , y , vx , vy ; public BouncingBall ( double x , double vx , double y , double vy ) { this . x = x ; / / s e t s instance value equal to passed value this . vx = vx ; / / s e t s instance value equal to passed value this . y = y ; this . vy = vy ; / / s e t s the p o s i t i o n using setXY in C i r c l e s u p e r c l a s s setXY ( x , y ) ; } public void step ( double dt ) { x = x+vx dt ; / / Euler algorithm f o r numerical s o l u t i o n y = y+vy dt ; vy = vy−g dt ; i f ( x>WALL) { vx = −Math . abs ( vx ) ; / / bounce o f f r i g h t wall } else i f ( x<−WALL) { vx = Math . abs ( vx ) ; / / bounce o f f l e f t wall } i f ( y<0) { vy = Math . abs ( vy ) ; / / bounce o f f f l o o r } setXY ( x , y ) ; } } To model the bounce of the ball off a wall, we have added statements such as CHAPTER 2. 
This statement ensures that the ball will move up if y < 0 and is a crude implementation of an elastic collision. (The Math.abs method returns the absolute value of its argument.)

Note our first use of the if statement. The general form of an if statement is as follows:

if(boolean_expression) {
  // code executed if boolean expression is true
} else {
  // code executed if boolean expression is false
}

We can test multiple conditions by chaining if statements:

if(boolean_expression) {
  // code goes here
} else if(boolean_expression) {
  // code goes here
} else {
  // code goes here
}

If the first boolean expression is true, then only the statements within the first brace will be executed. If the first boolean expression is false, then the second boolean expression in the else if expression will be tested, and so forth. If there is an else expression, then the statements after it will be executed if all the other boolean expressions are false. If there is only one statement to execute, the braces are optional.

The BouncingBall class is similar to the FallingBall class except that it extends Circle. We inherit from the Circle class because this class includes a simple method that allows the object to draw itself in an Open Source Physics frame called DisplayFrame, which we will use in BouncingBallApp. In the latter we instantiate BouncingBall and DisplayFrame objects so that the circle will be drawn at its x-y location when the frame is displayed or while a simulation is running. To make the animation more interesting, we will animate the motion of many noninteracting balls with random initial velocities. BouncingBallApp creates an arbitrary number of noninteracting bouncing balls by creating an array of BouncingBall objects.

Listing 2.14: BouncingBallApp class.

package org.opensourcephysics.sip.ch02;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class BouncingBallApp extends AbstractSimulation {
  // declares and instantiates a window to draw balls
  DisplayFrame frame = new DisplayFrame("x", "y", "Bouncing Balls");
  BouncingBall[] ball; // declares an array of BouncingBall objects
  double time, dt;

  public void initialize() {
    // sets boundaries of window in world coordinates
    frame.setPreferredMinMax(-10.0, 10.0, 0, 10);
    time = 0;
    frame.clearDrawables(); // removes old particles
    int n = control.getInt("number of balls");
    int v = control.getInt("speed");
    // instantiates array of n BouncingBall objects
    ball = new BouncingBall[n];
    for(int i = 0; i<n; i++) {
      // ...
    }
    // ...
  }
  // ...
}

The Complex class defines two constructors that are distinguished by their parameter lists. The constructor with two arguments allows us to initialize the values of the instance variables. Notice how the class encapsulates (hides) both the data and the methods that characterize a complex number. That is, we can use the Complex class without any knowledge of how its methods are implemented or how its data is stored. The general features of this class definition are as before. The variables real and imag are the instance variables of the class Complex.
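The Complex class listing itself is not included above. As a guide, here is a minimal sketch consistent with the description: two constructors, instance variables real and imag, add and multiply methods that return new Complex objects, and a toString method. It is not necessarily identical to the book's listing.

public class Complex {
  double real = 0, imag = 0; // instance variables

  public Complex() {} // default constructor

  public Complex(double real, double imag) {
    this.real = real;
    this.imag = imag;
  }

  public Complex add(Complex c) {
    Complex sum = new Complex(); // sum is a local variable
    sum.real = real+c.real;
    sum.imag = imag+c.imag;
    return sum;
  }

  public Complex multiply(Complex c) {
    Complex product = new Complex();
    product.real = real*c.real-imag*c.imag;
    product.imag = real*c.imag+imag*c.real;
    return product;
  }

  public String toString() {
    if(imag>=0) {
      return real+" + i"+Math.abs(imag);
    } else {
      return real+" - i"+Math.abs(imag);
    }
  }
}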
In contrast, the variable sum in the method add is a local variable because it can be accessed only within the method in which it is defined.

The most important new feature of the Complex class is that the add and multiply methods return new Complex objects. One reason we need to return a variable of type Complex is that a method returns (at most) a single value; we cannot return both sum.real and sum.imag. More importantly, we want the sum of two complex numbers to also be of type Complex so that we can add a third complex number to the result. Note also that we have defined add and multiply so that they do not change the values of the instance variables of the numbers to be added, but instead create a new complex number that stores the result.

Exercise 2.22. Complex numbers

Another way to represent complex numbers is by their magnitude and phase, |z|e^{iθ}. If z = a + ib, then

|z| = √(a² + b²)    (2.13a)

and

θ = arctan(b/a).    (2.13b)

(a) Write methods to get the magnitude and phase of a complex number, getMagnitude and getPhase, respectively. Add test code to invoke these methods. Be sure to check the phase in all four quadrants.

(b) Create a new class named ComplexPolar that stores a complex number as a magnitude and phase. Define methods for this class so that it behaves the same as the Complex class. Test this class using the code for ComplexApp.

This example of the Complex class illustrates the nature of objects, their limitations, and the tradeoffs that enter into design choices. Because accessing an object requires more computer time than accessing primitive variables, it is faster to represent a complex number by two doubles corresponding to its real and imaginary parts. Thus N complex data points could be represented by an array of 2N doubles, with the first N values corresponding to the real values. Considerations of computational speed are important only if complex data types are used extensively.

References and Suggestions for Further Reading

By using the Open Source Physics library, we have hidden most of the Java code needed to use threads and have only touched on the graphical capabilities of Java. See Open Source Physics: A User's Guide with Examples for a description of additional details on how threads and the other Open Source Physics tools are implemented and used. The source code for all the programs in the text and the Open Source Physics library can be downloaded.

There are many good books on Java graphics and Java threads. We list a few of our favorites in the following.

David M. Geary, Graphic Java: Vol. 2, Swing, 3rd ed. (Prentice Hall, 1999).

Jonathan Knudsen, Java 2D Graphics (O'Reilly, 1999).

Scott Oaks and Henry Wong, Java Threads, 3rd ed. (O'Reilly, 2004).

Chapter 3

Simulating Particle Motion

We discuss several numerical methods needed to simulate the motion of particles using Newton's laws and introduce interfaces, an important Java construct that makes it possible for unrelated objects to declare that they perform the same methods.

3.1 Modified Euler algorithms

To motivate the need for a general differential equation solver, we first discuss why the simple Euler algorithm is insufficient for many problems. The Euler algorithm assumes that the velocity and acceleration do not change significantly during the time step ∆t. Thus, to achieve an acceptable numerical solution, the time step ∆t must be chosen to be sufficiently small. However, if we make ∆t too small, we run into several problems.
As we do more and more iterations, the round-off error due to the finite precision of any floating point number accumulates, and eventually the numerical results become inaccurate. Also, the greater the number of iterations, the greater the computer time required for the program to finish. In addition to these problems, the Euler algorithm is unstable for many systems, which means that the errors accumulate exponentially, and thus the numerical solution becomes inaccurate very quickly. For these reasons, more accurate and stable numerical algorithms are necessary.

To illustrate why we need algorithms other than the simple Euler algorithm, we make a very simple change in the Euler algorithm and write

v(t + ∆t) = v(t) + a(t)∆t    (3.1a)
y(t + ∆t) = y(t) + v(t + ∆t)∆t    (3.1b)

where a is the acceleration. The only difference between this algorithm and the simple Euler algorithm,

v(t + ∆t) = v(t) + a(t)∆t    (3.2a)
y(t + ∆t) = y(t) + v(t)∆t    (3.2b)

is that the computed velocity at the end of the interval, v(t + ∆t), is used to compute the new position, y(t + ∆t), in (3.1b). As we found in Problem 2.12 and will see in more detail in Problem 3.1, this modified Euler algorithm is significantly better for oscillating systems. We refer to this algorithm as the Euler–Cromer algorithm.

Problem 3.1. Comparing Euler algorithms

(a) Write a class that extends Particle and models a simple harmonic oscillator for which F = -kx. For simplicity, choose units such that k = 1 and m = 1. Determine the numerical error in the position of the simple harmonic oscillator after the particle has evolved for several cycles. Is the original Euler algorithm stable for this system? What happens if you run for longer times?

(b) Repeat part (a) using the Euler–Cromer algorithm. Does this algorithm work better? If so, in what way?

(c) Modify your program so that it computes the total energy, E_sho = v²/2 + x²/2. How well is the total energy conserved for the two algorithms? Also consider the quantity Ẽ = E_sho + (∆t/2)xv. What is the behavior of this quantity for the Euler–Cromer algorithm?

Perhaps it has occurred to you that it would be better to compute the velocity at the middle of the interval rather than at the beginning or at the end. The Euler–Richardson algorithm is based on this idea. This algorithm is particularly useful for velocity-dependent forces, but it does as well as other simple algorithms for forces that do not depend on the velocity. The algorithm consists of using the Euler algorithm to find the intermediate position y_mid and velocity v_mid at time t_mid = t + ∆t/2. We then compute the force F(y_mid, v_mid, t_mid) and the acceleration a_mid at t = t_mid. The new position y_{n+1} and velocity v_{n+1} at time t_{n+1} are found using v_mid and a_mid and the Euler algorithm. We summarize the Euler–Richardson algorithm as

a_n = F(y_n, v_n, t_n)/m    (3.3a)
v_mid = v_n + (1/2)a_n ∆t    (3.3b)
y_mid = y_n + (1/2)v_n ∆t    (3.3c)
a_mid = F(y_mid, v_mid, t_n + (1/2)∆t)/m    (3.3d)

and

v_{n+1} = v_n + a_mid ∆t    (3.4a)
y_{n+1} = y_n + v_mid ∆t.    (Euler–Richardson algorithm) (3.4b)

Although we need to do twice as many computations per time step, the Euler–Richardson algorithm is much faster than the Euler algorithm because we can make the time step larger and still obtain better accuracy than with either the Euler or Euler–Cromer algorithms. A derivation of the Euler–Richardson algorithm is given in Appendix 3A.
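As a concrete illustration of (3.3) and (3.4), here is a minimal sketch of a step method for the falling particle, assuming the instance variables y, v, t, dt, and the constant g from FallingParticle are accessible to the subclass; it is a sketch toward Exercise 3.2, not the book's listing.

public class FallingParticleRichardson extends FallingParticle {
  public FallingParticleRichardson(double y, double v) {
    super(y, v);
  }

  public void step() { // one Euler–Richardson time step
    double a = -g;            // (3.3a): acceleration at the start of the step
    double vmid = v+0.5*a*dt; // (3.3b): velocity at the midpoint
    double amid = -g;         // (3.3d): the force here is independent of y, v, and t
    v = v+amid*dt;            // (3.4a)
    y = y+vmid*dt;            // (3.4b)
    t = t+dt;
  }
}

For free fall the acceleration is constant, so amid equals a; the structure matters when the force depends on the position, the velocity, or the time.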
Exercise 3.2. The Euler–Richardson algorithm

(a) Extend FallingParticle in Listing 2.6 to a new class that implements the Euler–Richardson algorithm. All you need to do is write a new step method.

(b) Use ∆t = 0.08, 0.04, 0.02, and 0.01 and determine the error in the computed position when the particle hits the ground. How do your results compare with the Euler algorithm? How does the error in the velocity depend on ∆t for each algorithm?

(c) Repeat part (b) for the simple harmonic oscillator and compute the error after several cycles.

As we gain more experience simulating various physical systems, we will learn that no single algorithm for solving Newton's equations of motion numerically is superior under all conditions. The Open Source Physics library includes classes that can be used to solve systems of coupled first-order differential equations using different algorithms. To understand how to use this library, we first discuss interfaces and then arrays.

3.2 Interfaces

We have seen how to combine data and methods into a class. A class definition encapsulates this information in one place, thereby simplifying the task of the programmer who needs to modify the class and of the user who needs to understand or use the class. Another tool for data abstraction is known as an interface. An interface specifies methods that an object performs but does not implement these methods. In other words, an interface describes the behavior or functionality of any class that implements it. Because an interface is not tied to a given class, any class can implement any particular interface as long as it defines all the methods specified by the interface. An important reason for interfaces is that a class can inherit from only one superclass, but it can implement more than one interface.

An example of an interface is the Function interface in the numerics package:

public interface Function {
  public double evaluate(double x);
}

The interface contains one method, evaluate, with one argument but no body. Notice that the definition uses the keyword interface rather than the keyword class. We can define a class that encapsulates a quadratic polynomial as follows:

public class QuadraticPolynomial implements Function {
  double a, b, c;

  public QuadraticPolynomial(double a, double b, double c) {
    this.a = a;
    this.b = b;
    this.c = c;
  }

  public double evaluate(double x) {
    return a*x*x + b*x + c;
  }
}

Quadratic polynomials can now be instantiated and used as needed.

Function f = new QuadraticPolynomial(1, 0, 2);
for(int x = 0; x<10; x++) {
  System.out.println("x = " + x + " f(x) = " + f.evaluate(x));
}

By using the Function interface, we can write methods that use this mathematical abstraction. For example, we can program a simple plot as follows:

public void plotFunction(Function f, double xmin, double xmax) {
  PlotFrame frame = new PlotFrame("x", "y", "Function");
  int n = 100; // number of points in plot
  double x = xmin, dx = (xmax-xmin)/(n-1);
  for(int i = 0; i<n; i++) {
    frame.append(0, x, f.evaluate(x));
    x += dx;
  }
  frame.setVisible(true); // displays frame on screen
}

We can also compute a numerical derivative based on the definition of the derivative found in calculus textbooks.

public double derivative(Function f, double x, double dx) {
  return (f.evaluate(x+dx)-f.evaluate(x))/dx;
}

This way of approximating a derivative is not optimal, but that is not the point here. (A better approximation is given in Problem 3.8.) The important point is that the interface enables us to define the abstract concept y = f(x) and to write code that uses this abstraction.
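For instance, the Gaussian of Exercise 3.3 below can be encapsulated in the same way. The following is a minimal sketch that assumes the parameters a and b are set in the constructor; it is one possible approach, not the book's solution.

import org.opensourcephysics.numerics.Function;

public class Gaussian implements Function {
  double a, b;

  public Gaussian(double a, double b) {
    this.a = a;
    this.b = b;
  }

  public double evaluate(double u) {
    return a*Math.exp(-b*u*u); // f(u) = a e^{-b u^2}
  }
}

A Gaussian object can then be passed to plotFunction or derivative just like a QuadraticPolynomial.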
Exercise 3.3. Function interface

(a) Define a class that encapsulates the function f(u) = a e^{-bu²}.

(b) Write a test program that plots f(u) with b = 1 and b = 4. Choose a = 1 for simplicity.

(c) Write a test program that plots the derivatives of the functions used in part (b) without using the analytic expression for the derivative.

Although interfaces are very useful for developing large-scale software projects, you will not need to define interfaces to do the problems in this book. However, you will use several interfaces, including the Function interface, that are defined in the Open Source Physics library. We describe two of the more important interfaces in the following sections.

3.3 Drawing

An interface that we will use often is the Drawable interface:

package org.opensourcephysics.display;
import java.awt.*;

public interface Drawable {
  public void draw(DrawingPanel panel, Graphics g);
}

Notice that this interface contains only one method, draw. Objects that implement this interface are rendered in a DrawingPanel after they have been added to a DisplayFrame. As we saw in Chapter 2, a DisplayFrame consists of components including a title bar, menu, and buttons for minimizing and closing the frame. The DisplayFrame contains a DrawingPanel on which graphical output is displayed. The Graphics class contains methods for drawing simple geometric objects such as lines, rectangles, and ovals on the panel. In Listing 3.1 we define a class that draws a rectangle using pixel-based coordinates.

Listing 3.1: PixelRectangle.

package org.opensourcephysics.sip.ch03;
import java.awt.*; // uses Abstract Window Toolkit
import org.opensourcephysics.display.*;

public class PixelRectangle implements Drawable {
  int left, top;     // position of rectangle in pixels
  int width, height; // size of rectangle in pixels

  PixelRectangle(int left, int top, int width, int height) {
    this.left = left; // location of left edge
    this.top = top;   // location of top edge
    this.width = width;
    this.height = height;
  }

  public void draw(DrawingPanel panel, Graphics g) {
    // this method implements the Drawable interface
    g.setColor(Color.RED);                // sets drawing color to red
    g.fillRect(left, top, width, height); // draws rectangle
  }
}

In the draw method we used fillRect, a primitive method in the Graphics class. This method draws a filled rectangle using pixel coordinates with the origin at the top left corner of the panel. To use PixelRectangle, we instantiate an object and add it to a DisplayFrame as shown in Listing 3.2.

Listing 3.2: Listing of DrawingApp.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class DrawingApp extends AbstractCalculation {
  DisplayFrame frame = new DisplayFrame("x", "y", "Graphics");

  public DrawingApp() {
    frame.setPreferredMinMax(0, 10, 0, 10);
  }

  public void calculate() {
    // gets rectangle location
    int left = control.getInt("xleft");
getInt ( "xleft" ) ; int top = control . getInt ( "ytop" ) ; / / g e t s r e c t a n g l e dimensions int width = control . getInt ( "width" ) ; CHAPTER 3. SIMULATING PARTICLE MOTION 50 int height = control . getInt ( "height" ) ; Drawable rectangle = new PixelRectangle ( left , top , width , height ) ; frame . addDrawable ( rectangle ) ; / / frame i s automatically rendered a f t e r Calculate button / / i s c l i c k e d } public void reset ( ) { / / removes drawables added by the user frame . clearDrawables ( ) ; / / s e t s d e f a u l t input values control . setValue ( "xleft" , 60); control . setValue ( "ytop" , 70); control . setValue ( "width" , 100); control . setValue ( "height" , 150); } / / c r e a t e s a c a l c u l a t i o n c o n t r o l s t r u c t u r e using t h i s c l a s s public s t a t i c void main ( String [ ] args ) { CalculationControl . createApp (new DrawingApp ( ) ) ; } } Note that multiple rectangles are drawn in the order that they are added to the drawing panel. Rectangles or portions of rectangles may be hidden because they are outside the drawing panel. Although it is possible to use pixel-based drawing methods to produce visualizations, creating even a simple graph in such an environment would require much tedious programming. The DrawingPanel object passed to the draw method simplifies this task by defining a system of world coordinates that enable us to specify the location and size of various objects in physical units rather than pixels. In the WorldRectangle class in Listing 3.3, methods from the DrawingPanel class are used to convert pixel coordinates to world coordinates. The range of the world coordinates in the horizontal and vertical directions is defined in the frame.setPreferredMinMax method in DrawingApp. (This method is not needed if pixel coordinates are used.) Listing 3.3: WorldRectangle illustrates the use of world coordinates. package org . opensourcephysics . sip . ch03 ; import java . awt . ; import org . opensourcephysics . display . ; public class WorldRectangle implements Drawable { double left , top ; / / p o s i t i o n of r e c t a n g l e in world c o o r d i n a t e s double width , height ; / / s i z e of r e c t a n g l e in world units public WorldRectangle ( double left , double top , double width , double height ) { this . l e f t = l e f t ; / / l o c a t i o n of l e f t edge this . top = top ; / / l o c a t i o n of top edge this . width = width ; this . height = height ; } public void draw ( DrawingPanel panel , Graphics g ) { / / This method implements the Drawable i n t e r f a c e CHAPTER 3. SIMULATING PARTICLE MOTION 51 g . setColor ( Color .RED) ; / / s e t drawing c o l o r to red / / converts from world to p i x e l c o o r d i n a t e s int l e f t P i x e l s = panel . xToPix ( l e f t ) ; int topPixels = panel . yToPix ( top ) ; int widthPixels = ( int ) ( panel . getXPixPerUnit ( ) width ) ; int heightPixels = ( int ) ( panel . getYPixPerUnit ( ) height ) ; / / draws r e c t a n g l e g . f i l l R e c t ( l e f t P i x e l s , topPixels , widthPixels , heightPixels ) ; } } Exercise 3.4. Simple graphics (a) Run DrawingApp and test how the different inputs change the size and location of the rectangle. Note that the pixel coordinates that are obtained from the control window are not the same as the world coordinates that are displayed. (b) Read the documentation at for the Graphics class, and modify the WorldRectangle class to draw lines, filled ovals, and strings of characters. 
Although simple geometric shapes such as circles and rectangles are often all that are needed to visualize many physical models, Java provides a drawing environment based on the Java 2D Application Programming Interface (API), which can render arbitrary geometric shapes, images, and text using composition and matrix-based transformations. We will use a subset of these features to define the DrawableShape and InteractiveShape classes in the display package of Open Source Physics, which we will introduce in Chapter 9. (See also the Open Source Physics User's Guide.)

So far we have created rectangles using two different classes. Each implementation of a Drawable rectangle defined a different draw method. Notice that in the display frame's definition of addDrawable in DrawingApp, the argument is specified to be the interface Drawable rather than a specific class. Any class that implements Drawable can be an argument of addDrawable. Without the interface construct, we would need to write an addDrawable method for each type of class.

3.4 Specifying The State of a System Using Arrays

Imagine writing the code for the numerical solution of the motion of three particles in three dimensions using the Euler–Richardson algorithm. The resulting code would be tedious to write. In addition, for each problem we would need to write and debug new code to implement the numerical algorithm. The complications become worse for better algorithms, most of which are algebraically more complex. Moreover, the numerical solution of simple first-order differential equations is a well-developed part of numerical analysis, and thus there is little reason to worry about the details of these algorithms now that we know how they work. In Section 3.5 we will introduce an interface for solving the differential equations associated with Newton's equations of motion. Before we do so, we discuss a few features of arrays that we will need.

As we discussed on page 38, ordered lists of data are most easily stored in arrays. For example, if we have an array variable named x, then we can access its first element as x[0], its second element as x[1], and so on. All elements must be of the same data type, but they can be just about anything: primitive data types such as doubles or integers, objects, or even other arrays. The following statements show how arrays of primitive data types are declared and instantiated:

double[] x; // x declared to be an array of doubles
double x[]; // same meaning as double[] x
x = new double[32]; // x array created with 32 elements
// y array declared and created in one statement
double[] y = new double[32];
int[] num = new int[100]; // array of 100 integers
double[] x, y;   // preferred notation
double x[], y[]; // same meaning as double[] x, y
// array of doubles specified by two indices
double[][] sigma = new double[3][3];
// reference to first row of sigma array
double[] row = sigma[0];

We will adopt the syntax double[] x instead of double x[].
The array index starts at zero, and the largest index is one less than the number of elements. Note that Java supports multiple array indices by creating arrays of arrays. Although sigma[0][0] refers to a single value of type double in the sigma object, we can refer to an entire row of values in the sigma object using the syntax sigma[i]. As shown in Chapter 2, arrays can contain objects such as bouncing balls.

// array of two BouncingBall objects
BouncingBall[] ball = new BouncingBall[2];
ball[0] = new BouncingBall(0, 10.0, 0, 5.0);  // creates first ball
ball[1] = new BouncingBall(0, -13.0, 0, 7.0); // creates second ball

The first statement allocates an array of BouncingBall objects, each of which is initialized to null. We need to create each object in the array using the new operator.

The numerical solution of an ordinary differential equation (frequently called an ODE) begins by expressing the equation as several first-order differential equations. If the highest derivative in the ODE is of order n (for example, dⁿx/dtⁿ), then it can be shown that the ODE can be written equivalently as n first-order differential equations. For example, Newton's equation of motion is a second-order differential equation and can be written as two first-order differential equations for the position and velocity in each spatial dimension. In one dimension we can write

dy/dt = v(t)    (3.5a)
dv/dt = a(t) = F(t)/m.    (3.5b)

If we have more than one particle, there are additional first-order differential equations for each particle. It is convenient to have a standard way of handling all these cases. Let us assume that each differential equation is of the form

dx_i/dt = r_i(x_0, x_1, x_2, ..., x_{n-1}, t)    (3.6)

where x_i is a dynamical variable such as a position or a velocity. The rate function r_i can depend on any of the dynamical variables, including the time t. We will store the values of the dynamical variables in the state array and the values of the corresponding rates in the rate array. The following examples show the conventions we use:

// one particle in one dimension:
state[0] // stores x
state[1] // stores v
state[2] // stores t (time)

// one particle in two dimensions:
state[0] // stores x
state[1] // stores vx
state[2] // stores y
state[3] // stores vy
state[4] // stores t

// two particles in one dimension:
state[0] // stores x1
state[1] // stores v1
state[2] // stores x2
state[3] // stores v2
state[4] // stores t

Although the Euler algorithm does not assume any special ordering of the state variables, we adopt the convention that a velocity rate follows every position rate in the state array so that we can efficiently code the more sophisticated numerical algorithms that we discuss in Appendix 3A and in later chapters. To solve problems for which the rate contains an explicit time dependence, such as a driven harmonic oscillator (see Section 4.4), we store the time variable in the last element of the state array. Thus, for one particle in one dimension, the time is stored in state[2]. In this way we can treat all dynamical variables on an equal footing.
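For example, for a projectile moving in two dimensions under gravity, the rates corresponding to the two-dimensional ordering above would be filled in as follows. This fragment is a sketch of the convention only; the arrays state and rate and the constant g are assumed to be defined elsewhere.

rate[0] = state[1]; // dx/dt = vx
rate[1] = 0;        // dvx/dt = 0 (no horizontal force)
rate[2] = state[3]; // dy/dt = vy
rate[3] = -g;       // dvy/dt = -g
rate[4] = 1;        // dt/dt = 1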
Because arrays can be arguments of methods, we need to understand how Java passes variables from the class that calls a method to the method being called. Consider the following method:

public void example(int r, int[] s) {
  r = 20;
  s[0] = 20;
}

What do you expect the output of the following statements to be?

int x = 10;
int[] y = {10}; // array of one element initialized to y[0] = 10
example(x, y);
System.out.println("x = " + x + " y[0] = " + y[0]);

The answer is that the output will be x = 10, y[0] = 20. Java parameters are "passed by value," which means that their values are copied. The method cannot modify the value of the x variable because the method received only a copy of its value. In contrast, when an object or an array is in a method's parameter list, Java passes a copy of the reference to the object or the array. The method can use the reference to read or modify the data in the array or object. For this reason the step method of the ODE solvers, discussed in Section 3.6, does not need to explicitly return an updated state array; it implicitly changes the contents of the state array.

Exercise 3.5. Pass by value

As another example of how Java handles primitive variables differently from arrays and objects, consider the statements

int x = 10;
int y = x;
x = 20;

What is y? Next consider

// declares an array of one element initialized to the value 10
int[] x = {10};
int[] y = x;
x[0] = 20;

What is y[0]?

We are now ready to discuss the classes and interfaces from the Open Source Physics library for solving ordinary differential equations.

3.5 The ODE Interface

To introduce the ODE interface, we again consider the equations of motion for a falling particle. We use a state array ordered as s = (y, v, t), so that the dynamical equations can be written as

ṡ₀ = s₁    (3.7a)
ṡ₁ = -g    (3.7b)
ṡ₂ = 1.    (3.7c)

The ODE interface enables us to encapsulate (3.7) in a class. The interface contains two methods, getState and getRate, as shown in Listing 3.4.

Listing 3.4: The ODE interface.

package org.opensourcephysics.numerics;

public interface ODE {
  public double[] getState();
  public void getRate(double[] state, double[] rate);
}

The getState method returns the state array (s₀, s₁, ..., sₙ). The getRate method evaluates the derivatives using the given state array and stores the result in the rate array (ṡ₀, ṡ₁, ..., ṡₙ). An example of a Java class that implements the ODE interface for a falling particle is shown in Listing 3.5.

Listing 3.5: Example of the implementation of the ODE interface for a falling particle.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.numerics.*;

public class FallingParticleODE implements ODE {
  final static double g = 9.8;
  double[] state = new double[3];

  public FallingParticleODE(double y, double v) {
    state[0] = y;
    state[1] = v;
    state[2] = 0; // initial time
  }

  // required to implement the ODE interface
  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1]; // rate of change of y is v
    rate[1] = -g;
    rate[2] = 1;        // rate of change of time is 1
  }
}

3.6 The ODESolver Interface

There are many possible numerical algorithms for advancing a system of first-order ODEs from an initial state to a final state. The Open Source Physics library defines ODE solvers such as Euler and EulerRichardson, as well as RK4, a fourth-order algorithm that is discussed in Appendix 3A.
You can write additional classes for other algorithms if they are needed. Each of these classes implements the ODESolver interface, which is defined in Listing 3.6.

Listing 3.6: The ODESolver interface. Note the four methods that must be defined.

package org.opensourcephysics.numerics;

public interface ODESolver {
  public void initialize(double stepSize);
  public double step();
  public void setStepSize(double stepSize);
  public double getStepSize();
}

A system of first-order differential equations is now solved by creating an object that implements a particular algorithm and repeatedly invoking the step method for that solver class. The argument of the solver class constructor must be a class that implements the ODE interface. As an example of the use of ODESolver, we again consider the dynamics of a falling particle.

Listing 3.7: A falling particle program that uses an ODESolver.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.numerics.*;

public class FallingParticleODEApp extends AbstractCalculation {
  public void calculate() {
    // gets initial conditions
    double y0 = control.getDouble("Initial y");
    double v0 = control.getDouble("Initial v");
    // creates ball with initial conditions
    FallingParticleODE ball = new FallingParticleODE(y0, v0);
    // note how a particular algorithm is chosen
    ODESolver solver = new Euler(ball);
    // sets time step dt in the solver
    solver.setStepSize(control.getDouble("dt"));
    while(ball.state[0]>0) {
      solver.step();
    }
    control.println("final time = "+ball.state[2]);
    control.println("y = "+ball.state[0]+" v = "+ball.state[1]);
  }

  public void reset() {
    // sets default input values
    control.setValue("Initial y", 10);
    control.setValue("Initial v", 0);
    control.setValue("dt", 0.01);
  }

  // creates a calculation control structure for this class
  public static void main(String[] args) {
    CalculationControl.createApp(new FallingParticleODEApp());
  }
}

The ODE classes are located in the numerics package, and thus we need to import this package, as is done in the third statement of FallingParticleODEApp. We declare and instantiate the variables ball and solver in the calculate method. Note that ball, an instance of FallingParticleODE, is the argument of the Euler constructor. The object ball can be an argument because FallingParticleODE implements the ODE interface.

It would be a good idea to look at the source code of the Euler class in the numerics package. The Euler class gets the state of the system using getState and then sends this state to getRate, which stores the rates in the rate array. The state array is then modified using the rate array according to the Euler algorithm. You don't need to know the details, but you can read the step method of the various classes that implement ODESolver if you are interested in how the different algorithms are programmed.

Because FallingParticleODE appears to be more complicated than FallingParticle, you might ask what we have gained. One answer is that it is now much easier to use a different numerical algorithm.
The only modification we need to make is to change the statement

ODESolver solver = new Euler(ball);

to, for example,

ODESolver solver = new EulerRichardson(ball);

We have separated the physics (in this case a freely falling particle) from the implementation of the numerical method.

Exercise 3.6. ODE solvers

Run FallingParticleODEApp and compare your results with our previous implementation of the Euler algorithm in FallingParticleApp. How easy is it to use a different algorithm?

3.7 Effects of Drag Resistance

We have introduced most of the programming concepts that we will use in the remainder of this text. If you are new to programming, you will likely feel a bit confused at this point by all the new concepts and syntax. However, it is not necessary to understand all the details to continue and begin to write your own programs. A prototypical simulation program is given in Listings 3.8 and 3.9. These classes simulate a projectile on the surface of the Earth with no air friction, including a plot of position versus time and an animation of the projectile moving through the air. In the following, we discuss more realistic models that can be simulated by modifying the projectile classes.

Listing 3.8: A simple projectile simulation that is useful as a template for other simulations.

package org.opensourcephysics.sip.ch03;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Projectile implements Drawable, ODE {
  static final double g = 9.8;
  double[] state = new double[5]; // {x, vx, y, vy, t}
  int pixRadius = 6; // pixel radius for drawing of projectile
  EulerRichardson odeSolver = new EulerRichardson(this);

  public void setStepSize(double dt) {
    odeSolver.setStepSize(dt);
  }

  public void step() {
    odeSolver.step(); // does one time step using selected algorithm
  }

  public void setState(double x, double vx, double y, double vy) {
    state[0] = x;
    state[1] = vx;
    state[2] = y;
    state[3] = vy;
    state[4] = 0;
  }

  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1]; // rate of change of x
    rate[1] = 0;        // rate of change of vx
    rate[2] = state[3]; // rate of change of y
    rate[3] = -g;       // rate of change of vy
    rate[4] = 1;        // dt/dt = 1
  }

  public void draw(DrawingPanel drawingPanel, Graphics g) {
    int xpix = drawingPanel.xToPix(state[0]);
    int ypix = drawingPanel.yToPix(state[2]);
    g.setColor(Color.red);
    g.fillOval(xpix-pixRadius, ypix-pixRadius, 2*pixRadius, 2*pixRadius);
    g.setColor(Color.green);
    int xmin = drawingPanel.xToPix(-100);
    int xmax = drawingPanel.xToPix(100);
    int y0 = drawingPanel.yToPix(0);
    // draws a line to represent the ground
    g.drawLine(xmin, y0, xmax, y0);
  }
}

Listing 3.9: A target class for the projectile motion simulation.

package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
public class ProjectileApp extends AbstractSimulation {
  PlotFrame plotFrame = new PlotFrame("Time", "x,y", "Position versus time");
  Projectile projectile = new Projectile();
  PlotFrame animationFrame = new PlotFrame("x", "y", "Trajectory");

  public ProjectileApp() {
    animationFrame.addDrawable(projectile);
    plotFrame.setXYColumnNames(0, "t", "x");
    plotFrame.setXYColumnNames(1, "t", "y");
  }

  public void initialize() {
    double dt = control.getDouble("dt");
    double x = control.getDouble("initial x");
    double vx = control.getDouble("initial vx");
    double y = control.getDouble("initial y");
    double vy = control.getDouble("initial vy");
    projectile.setState(x, vx, y, vy);
    projectile.setStepSize(dt);
    // estimate of size needed for display
    double size = (vx*vx+vy*vy)/10;
    animationFrame.setPreferredMinMax(-1, size, -1, size);
  }

  public void doStep() {
    // x versus time data added
    plotFrame.append(0, projectile.state[4], projectile.state[0]);
    // y versus time data added
    plotFrame.append(1, projectile.state[4], projectile.state[2]);
    // trajectory data added
    animationFrame.append(0, projectile.state[0], projectile.state[2]);
    projectile.step(); // advances the state by one time step
  }

  public void reset() {
    control.setValue("initial x", 0);
    control.setValue("initial vx", 10);
    control.setValue("initial y", 0);
    control.setValue("initial vy", 10);
    control.setValue("dt", 0.01);
    enableStepsPerDisplay(true);
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new ProjectileApp());
  }
}

The analytic solution for free fall near the Earth's surface, (2.4), is well known, and thus finding a numerical solution is useful only as an introduction to numerical methods. It is not difficult to think of more realistic models of motion near the Earth's surface for which the equations of motion do not have simple analytic solutions. For example, if we take into account the variation of the Earth's gravitational field with distance from the center of the Earth, then the force on a particle is not constant. According to Newton's law of gravitation, the force due to the Earth on a particle of mass m is given by

F = GMm/(R + y)² = GMm/[R²(1 + y/R)²] = mg[1 - 2y/R + ···]    (3.8)

where y is measured from the Earth's surface, R is the radius of the Earth, M is the mass of the Earth, G is the gravitational constant, and g = GM/R².

Problem 3.7. Position-dependent force

Extend FallingParticleODE to simulate the fall of a particle with the position-dependent force law (3.8). Assume that a particle is dropped from a height h with zero initial velocity and compute its impact velocity (speed) when it hits the ground at y = 0. Determine the value of h for which the impact velocity differs by one percent from its value with a constant acceleration g = 9.8 m/s². Take R = 6.37 × 10⁶ m. Make sure that the one percent difference is due to the physics of the force law and not to the accuracy of your algorithm. (A sketch of the modified rate equations follows this problem.)
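A minimal sketch of the getRate modification for Problem 3.7, assuming the state ordering {y, v, t} of FallingParticleODE and an instance variable R for the Earth's radius (our name); it is one possible approach, not the book's solution.

public void getRate(double[] state, double[] rate) {
  double factor = 1+state[0]/R;   // 1 + y/R
  rate[0] = state[1];             // dy/dt = v
  rate[1] = -g/(factor*factor);   // full force law, not the expansion in (3.8)
  rate[2] = 1;                    // dt/dt = 1
}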
For particles near the Earth's surface, a more important modification is to include the drag force due to air resistance. The direction of the drag force F_d(v) is opposite to the velocity of the particle (see Figure 3.1). For a falling body, F_d(v) is upward, as shown in Figure 3.1(b). Hence, the total force F on the falling body can be expressed as

F = -mg + F_d.    (3.9)

Figure 3.1: (a) Coordinate system with y measured positive upward from the ground. (b) The force diagram for upward motion. (c) The force diagram for downward motion.

The velocity dependence of F_d(v) is known theoretically in the limit of very low speeds for small objects. In general, it is necessary to determine the velocity dependence of F_d(v) empirically over a limited range of velocities. One way to obtain the form of F_d(v) is to measure y as a function of t and then compute v(t) by calculating the numerical derivative of y(t). Similarly, we can use v(t) to compute a(t) numerically. From this information it is possible in principle to find the acceleration as a function of v and to extract F_d(v) from (3.9). However, this procedure introduces errors (see Problem 3.8b) because the accuracy of the derivatives will be less than the accuracy of the measured position.

An alternative is to reverse the procedure; that is, assume an explicit form for the v dependence of F_d(v), and use it to solve for y(t). If the calculated values of y(t) are consistent with the experimental values of y(t), then the assumed v dependence of F_d(v) is justified empirically. The two common assumed forms of the velocity dependence of F_d(v) are

F_{1,d}(v) = C₁v    (3.10a)

and

F_{2,d}(v) = C₂v²    (3.10b)

where the parameters C₁ and C₂ depend on the properties of the medium and the shape of the object. In general, (3.10a) and (3.10b) are useful phenomenological expressions that yield approximate results for F_d(v) over a limited range of v.

Because F_d(v) increases as v increases, there is a limiting or terminal velocity (speed) at which the net force on a falling object is zero. This terminal speed can be found from (3.9) and (3.10) by setting F_d = mg and is given by

v_{1,t} = mg/C₁    (linear drag)    (3.11a)
v_{2,t} = (mg/C₂)^{1/2}    (quadratic drag)    (3.11b)

for the linear and quadratic cases, respectively. It is often convenient to express velocities in terms of the terminal velocity. We can use (3.10) and (3.11) to write F_d in the linear and quadratic cases as

F_{1,d} = C₁v_{1,t}(v/v_{1,t}) = mg(v/v_{1,t})    (3.12a)
F_{2,d} = C₂v_{2,t}²(v/v_{2,t})² = mg(v/v_{2,t})².    (3.12b)

Hence, we can write the net force (per unit mass) on a falling object in the convenient forms

F₁(v)/m = -g(1 - v/v_{1,t})    (3.13a)
F₂(v)/m = -g(1 - v²/v_{2,t}²).    (3.13b)

t (s)    Position (m)    t (s)    Position (m)    t (s)    Position (m)
0.2055   0.4188          0.4280   0.3609          0.6498   0.2497
0.2302   0.4164          0.4526   0.3505          0.6744   0.2337
0.2550   0.4128          0.4773   0.3400          0.6990   0.2175
0.2797   0.4082          0.5020   0.3297          0.7236   0.2008
0.3045   0.4026          0.5266   0.3181          0.7482   0.1846
0.3292   0.3958          0.5513   0.3051          0.7728   0.1696
0.3539   0.3878          0.5759   0.2913          0.7974   0.1566
0.3786   0.3802          0.6005   0.2788          0.8220   0.1393
0.4033   0.3708          0.6252   0.2667          0.8466   0.1263

Table 3.1: Results for the vertical fall of a coffee filter. Note that the initial time is not zero. The time difference between measurements is ≈ 0.0247 s. This data is also available in the falling.txt file in the ch03 package.

To determine whether the effects of air resistance are important during the fall of ordinary objects, consider the fall of a pebble of mass m = 10⁻² kg. To a good approximation, the drag force is proportional to v². For a spherical pebble of radius 0.01 m, C₂ is found empirically to be approximately 10⁻² kg/m. From (3.11b) we find the terminal velocity to be about 30 m/s. Because this speed would be achieved by a freely falling body in a vertical fall of approximately 50 m in a time of about 3 s, we expect that the effects of air resistance would be appreciable for comparable times and distances.
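To simulate such a fall, the rate equation for the velocity can be modified to include quadratic drag. The following is a minimal sketch using the form (3.13b) with the state ordering {y, v, t}; the terminal speed variable vt is our name, and (3.13b) as written applies to downward motion, for which the drag force is upward.

public void getRate(double[] state, double[] rate) {
  rate[0] = state[1]; // dy/dt = v
  // quadratic drag, (3.13b); valid for falling motion
  rate[1] = -g*(1-state[1]*state[1]/(vt*vt));
  rate[2] = 1;        // dt/dt = 1
}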
For a spherical pebble of radius 0.01 m, C2 is found empirically to be approximately 10^−4 kg/m. From (3.11b) we find the terminal velocity to be about 30 m/s. Because this speed would be achieved by a freely falling body in a vertical fall of approximately 50 m in a time of about 3 s, we expect that the effects of air resistance would be appreciable for comparable times and distances.

Data often is stored in text files, and it is convenient to be able to read this data into a program for analysis. The ResourceLoader class in the Open Source Physics tools package makes reading these files easy. This class can read many different data types, including images and sound. An example of how to use the ResourceLoader class to read string data is given in DataLoaderApp.

Listing 3.10: Example of the use of the ResourceLoader class to read data into a program.
package org.opensourcephysics.sip.ch03;
import org.opensourcephysics.tools.*;

public class DataLoaderApp {
  public static void main(String[] args) {
    // reads from directory where DataLoaderApp is located
    String fileName = "falling.txt";
    // gets the data file
    Resource res = ResourceLoader.getResource(fileName, DataLoaderApp.class);
    String data = res.getString();
    // split string on newline character
    String[] lines = data.split("\n");
    // extract t-y data from every line (minimal parsing sketch; each line
    // of falling.txt is assumed to hold a time-position pair)
    for(int i = 0, n = lines.length; i < n; i++) {
      String[] numbers = lines[i].trim().split("\\s+");
      double t = Double.parseDouble(numbers[0]);
      double y = Double.parseDouble(numbers[1]);
      System.out.println("t = " + t + "  y = " + y);
    }
  }
}

Figure 3.2: A falling coffee filter does not fall with constant acceleration due to the effects of air resistance. The motion sensor below the filter is connected to a computer which records position data and stores it in a text file.

One way to measure the steady state amplitude called for in Problem 4.8 is to record the maximum displacement each time step:

if(Math.abs(x) > Math.abs(amplitude)) {
  amplitude = Math.abs(x);
  control.println("new amplitude = " + amplitude);
}

(e) Measure the amplitude and phase shift to verify that the steady state behavior of x(t) is given by

x(t) = A(ω) cos(ωt + δ).   (4.19)

The quantity δ is the phase difference between the applied force and the steady state motion. Compute A(ω) and δ(ω) for ω0 = 3, γ = 0.5, and ω = 0, 1.0, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, and 3.4. Choose the initial condition x(t = 0) = 0, v(t = 0) = 0. Repeat the simulation for γ = 3.0, and plot A(ω) and δ(ω) versus ω for the two values of γ. Discuss the qualitative behavior of A(ω) and δ(ω) for the two values of γ. If A(ω) has a maximum, determine the angular frequency ωmax at which the maximum of A occurs. Is the value of ωmax close to the natural angular frequency ω0? Compare ωmax to ω0 and to the frequency of the damped linear oscillator in the absence of an external force.

(f) Compute x(t) and A(ω) for a damped linear oscillator with the amplitude of the external force A0 = 4. How do the steady state results for x(t) and A(ω) compare to the case A0 = 1? Does the transient behavior of x(t) satisfy the same relation as the steady state behavior?

(g) What is the shape of the phase space trajectory for the initial condition x(t = 0) = 1, v(t = 0) = 0? Do you find a different phase space trajectory for other initial conditions?

(h) Why is A(ω = 0) < A(ω) for small ω? Why does A(ω) → 0 for ω ≫ ω0?

(i) Does the mean kinetic energy resonate at the same frequency as does the amplitude? Compute the mean kinetic energy over one cycle once steady state conditions have been reached. Choose ω0 = 3 and γ = 0.5.
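For reference, a minimal sketch of the rate equation for the driven, damped oscillator studied in Problem 4.8 follows. It assumes the ODE interface of Chapter 3 with state {x, v, t} and the equation of motion d^2x/dt^2 = −ω0^2 x − γ dx/dt + A0 cos ωt; the class name and parameter values are illustrative.

import org.opensourcephysics.numerics.ODE;

public class DrivenOscillator implements ODE { // hypothetical class name
  double omega0 = 3, gamma = 0.5; // natural frequency and damping
  double A0 = 1, omega = 2;       // driving amplitude and frequency
  double[] state = {0, 0, 0};     // {x, v, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1];                            // dx/dt = v
    rate[1] = -omega0*omega0*state[0] - gamma*state[1]
              + A0*Math.cos(omega*state[2]);       // dv/dt
    rate[2] = 1;                                   // time rate
  }
}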
In Problem 4.8 we found that the response of the damped harmonic oscillator to an external driving force is linear. For example, if the magnitude of the external force is doubled, then the magnitude of the steady state motion is also doubled. This behavior is a consequence of the linear nature of the equation of motion. When a particle is subject to nonlinear forces, the response can be much more complicated (see Section 6.8).

For many problems, the sinusoidal driving force in (4.18) is not realistic. Another example of an external force can be found by observing someone pushing a child on a swing. Because the force is nonzero for only short intervals of time, this type of force is impulsive. In the following problem, we consider the response of a damped linear oscillator to an impulsive force.

Figure 4.3: A half-wave driving force corresponding to the positive part of a cosine function.

∗Problem 4.9. Response of a damped linear oscillator to nonsinusoidal external forces

(a) Assume a swing can be modeled by a damped linear oscillator. The effect of an impulse is to change the velocity. For simplicity, let the duration of the push equal the time step ∆t. Introduce an integer variable for the number of time steps and use the % operator to ensure that the impulse is nonzero only at the time step associated with the period of the external impulse. Determine the steady state amplitude A(ω) for ω = 1.0, 1.3, 1.4, 1.5, 1.6, 2.5, 3.0, and 3.5. The corresponding period of the impulse is given by T = 2π/ω. Choose ω0 = 3 and γ = 0.5. Are your results consistent with your experience of pushing a swing and with the comparable results of Problem 4.8?

(b) Consider the response to a half-wave external force consisting of the positive part of a cosine function (see Figure 4.3). Compute A(ω) for ω0 = 3 and γ = 0.5. At what values of ω does A(ω) have relative maxima? Is the half-wave cosine driving force equivalent to a sum of cosine functions of different frequencies? For example, does A(ω) have more than one resonance?

(c) Compute the steady state response x(t) to the external force

(1/m)F(t) = 1/π + (1/2) cos t + [2/(3π)] cos 2t − [2/(15π)] cos 4t.   (4.20)

How does a plot of F(t) versus t compare to the half-wave cosine function? Use your results to conjecture a principle of superposition for the solutions to linear equations.

In many of the problems in this chapter, we have asked you to draw a phase space plot for a single oscillator. This plot provides a convenient representation of both the position and velocity. When we study chaotic phenomena, such plots will become almost indispensable (see Chapter 6). Here we will consider an important feature of phase space trajectories for conservative systems. If there are no external forces, the undamped simple harmonic oscillator and undamped pendulum are examples of conservative systems; that is, systems for which the total energy is a constant. In Problems 4.10 and 4.11, we will study two general properties of conservative systems: the nonintersecting nature of their trajectories in phase space and the preservation of area in phase space. These concepts will become more important when we study the properties of conservative systems with more than one degree of freedom.

Figure 4.4: What happens to a given area in phase space for conservative systems?

Problem 4.10. Trajectory of a simple harmonic oscillator in phase space

(a) We explore the phase space behavior of a single harmonic oscillator by simulating N initial conditions simultaneously.
Write a program to simulate N identical simple harmonic oscillators, each of which is represented by a small circle centered at its position and velocity in phase space as shown in Figure 4.4. One way to do so is to adapt the BouncingBallApp class introduced in Section 2.6. Choose N = 16 and consider random initial positions and velocities. Do the phase space trajectories for different initial conditions ever cross? Explain your answer in terms of the uniqueness of trajectories in a deterministic system.

(b) Choose a set of initial conditions that form a rectangle (see Figure 4.4). Does the shape of this area change with time? What happens to the total area in comparison to the original area?

Problem 4.11. Trajectory of a pendulum in phase space

(a) Modify your program from Problem 4.10 so that the phase space trajectories (ω versus θ) of N = 16 pendula with different initial conditions can be compared. Plot several phase space trajectories for different values of the total energy. Are the phase space trajectories closed? Does the shape of the trajectory depend on the total energy?

(b) Choose a set of initial conditions that form a rectangle in phase space and plot the state of each pendulum as a circle. Does the shape of this area change with time? What happens to the total area?

4.5 Electrical Circuit Oscillations

In this section we discuss several electrical analogues of the mechanical systems that we have considered. Although the equations of motion are similar in form, it is convenient to consider electrical circuits separately, because the nature of the questions of interest is somewhat different.

The starting point for electrical circuit theory is Kirchhoff's loop rule, which states that the sum of the voltage drops around a closed path of an electrical circuit is zero. This law is a consequence of conservation of energy, because a voltage drop represents the amount of energy that is lost or gained when a unit charge passes through a circuit element. The relations for the voltage drops across each circuit element are summarized in Table 4.1.

Element     Voltage Drop     Symbol           Units
resistor    VR = IR          resistance R     ohms (Ω)
capacitor   VC = Q/C         capacitance C    farads (F)
inductor    VL = L dI/dt     inductance L     henries (H)

Table 4.1: The voltage drops across the basic electrical circuit elements. Q is the charge (coulombs) on one plate of the capacitor, and I is the current (amperes).

Figure 4.5: A simple series RLC circuit with a voltage source Vs.

Imagine an electrical circuit with an alternating voltage source Vs(t) attached in series to a resistor, inductor, and capacitor (see Figure 4.5). The corresponding loop equation is

VL + VR + VC = Vs(t).   (4.21)

The voltage source term Vs in (4.21) is the emf and is measured in units of volts. If we substitute the relationships shown in Table 4.1, we find

L d^2Q/dt^2 + R dQ/dt + Q/C = Vs(t)   (4.22)

where we have used the definition of current, I = dQ/dt. We see that (4.22) for the series RLC circuit is identical in form to the damped harmonic oscillator (4.17). The analogies between ideal electrical circuits and mechanical systems are summarized in Table 4.2.

Although we are already familiar with (4.22), we first consider the dynamical behavior of an RC circuit described by

R I(t) = R dQ/dt = Vs(t) − Q/C.   (4.23)

Two RC circuits corresponding to (4.23) are shown in Figure 4.6.
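Because (4.23) is a first-order equation, its numerical solution requires only a single rate. The following is a minimal sketch of how such a rate might be written; the class name, state layout {Q, t}, and parameter values are illustrative assumptions, not the book's RCApp program.

import org.opensourcephysics.numerics.ODE;

public class RCCircuit implements ODE { // hypothetical class name
  double R = 1000, C = 1.0e-6;          // illustrative circuit values
  double omega = 2*Math.PI*100;         // source angular frequency
  double[] state = {0, 0};              // {Q, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    double Vs = Math.cos(omega*state[1]); // source voltage Vs(t) = cos(omega t)
    rate[0] = (Vs - state[0]/C)/R;        // (4.23): dQ/dt = (Vs - Q/C)/R
    rate[1] = 1;                          // time rate
  }
}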
Although the loop equation (4.23) is identical regardless of the order of placement of the capacitor and resistor in Figure 4.6, the output voltage measured by the oscilloscope in Figure 4.6 is different. We will see in Problem 4.12 that these circuits act as filters that pass voltage components of certain frequencies while rejecting others.

An advantage of a computer simulation of an electrical circuit is that the measurement of a voltage drop across a circuit element does not affect the properties of the circuit. In fact, digital computers are often used to optimize the design of circuits for special applications.

Electric Circuit            Mechanical System
charge Q                    displacement x
current I = dQ/dt           velocity v = dx/dt
voltage drop                force
inductance L                mass m
inverse capacitance 1/C     spring constant k
resistance R                damping γ

Table 4.2: Analogies between electrical parameters and mechanical parameters.

Figure 4.6: Examples of RC circuits used as low and high pass filters. Which circuit is which?

The RCApp program is not shown here because it is similar to PendulumApp, but this program is available in the Chapter 4 package. The RCApp program simulates an RC circuit with an alternating current (AC) voltage source of the form Vs(t) = cos ωt and plots the time dependence of the charge on the capacitor. You are asked to modify this program in Problem 4.12.

Problem 4.12. Simple filter circuits

(a) Modify the RCApp program to simulate the voltages in an RC filter. Your program should plot the voltage across the resistor VR and the voltage across the source Vs, in addition to the voltage across the capacitor VC. Run this program with R = 1000 Ω and C = 1.0 µF (10^−6 farads). Find the steady state amplitude of the voltage drops across the resistor and across the capacitor as a function of the angular frequency ω of the source voltage Vs = cos ωt. Consider the frequencies f = 10, 50, 100, 160, 200, 500, 1000, 5000, and 10000 Hz. (Remember that ω = 2πf.) Choose ∆t to be no more than 0.0001 s for f = 10 Hz. What is a reasonable value of ∆t for f = 10000 Hz?

(b) The output voltage depends on where the digital oscilloscope is connected. What is the output voltage of the oscilloscope in Figure 4.6a? Plot the ratio of the amplitude of the output voltage to the amplitude of the input voltage as a function of ω. Use a logarithmic scale for ω. What range of frequencies is passed? Does this circuit act as a high pass or a low pass filter? Answer the same questions for the oscilloscope in Figure 4.6b. Use your results to explain the operation of high and low pass filters. Compute the value of the cutoff frequency for which the amplitude of the output voltage drops to 1/√2 (half-power) of the input value. How is the cutoff frequency related to RC?

Figure 4.7: Square wave voltage with period T and unit amplitude.

(c) Plot the voltage drops across the capacitor and resistor as a function of time. The phase difference φ between each voltage drop and the source voltage can be found by finding the time tm between the corresponding maxima of the voltages. Because φ is usually expressed in radians, we have the relation φ/2π = tm/T, where T is the period of the oscillation. What is the phase difference φC between the capacitor and the voltage source and the phase difference φR between the resistor and the voltage source? Do these phase differences depend on ω?
Does the current lead or lag the voltage; that is, do the maxima of VR(t) come before or after the maxima of Vs(t)? What is the phase difference between the capacitor and the resistor? Does the latter difference depend on ω?

(d) Modify your program to find the steady state response of an LR circuit with a source voltage Vs(t) = cos ωt. Let R = 100 Ω and L = 2 × 10^−3 H. Because L/R = 2 × 10^−5 s, it is convenient to measure the time and frequency in units of T0 = L/R. We write t* = t/T0, ω* = ωT0, and rewrite the equation for an LR circuit as

I(t*) + dI(t*)/dt* = (1/R) cos ω*t*.   (4.24)

Because it will be clear from the context, we now simply write t and ω rather than t* and ω*. What is a reasonable value of the step size ∆t? Compute the steady state amplitude of the voltage drops across the inductor and the resistor for the input frequencies f = 10, 20, 30, 35, 50, 100, and 200 Hz. Use these results to explain how an LR circuit can be used as a low pass or a high pass filter. Plot the voltage drops across the inductor and resistor as a function of time and determine the phase differences φR and φL between the resistor and the voltage source and the inductor and the voltage source. Do these phase differences depend on ω? Does the current lead or lag the voltage? What is the phase difference between the inductor and the resistor? Does the latter difference depend on ω?

Problem 4.13. Square wave response of an RC circuit
Modify your program so that the voltage source is a periodic square wave as shown in Figure 4.7. Use a 1.0 µF capacitor and a 3000 Ω resistor. Plot the computed voltage drop across the capacitor as a function of time. Make sure the period of the square wave is long enough so that the capacitor is fully charged during one half-cycle. What is the approximate time dependence of VC(t) while the capacitor is charging (discharging)?

We now consider the steady state behavior of the series RLC circuit shown in Figure 4.5 and represented by (4.22). The response of an electrical circuit is the current rather than the charge on the capacitor. Because we have simulated the analogous mechanical system, we already know much about the behavior of driven RLC circuits. Nonetheless, we will find several interesting features of AC electrical circuits in the following two problems.

Problem 4.14. Response of an RLC circuit

(a) Consider an RLC series circuit with R = 100 Ω, C = 3.0 µF, and L = 2 mH. Modify the simple harmonic oscillator program or the RC filter program to simulate an RLC circuit and compute the voltage drops across the three circuit elements. Assume an AC voltage source of the form V(t) = V0 cos ωt. Plot the current I as a function of time and determine the maximum steady state current Imax for different values of ω. Obtain the resonance curve by plotting Imax(ω) as a function of ω and compute the value of ω at which the resonance curve is a maximum. This value of ω is the resonant frequency.

(b) The sharpness of the resonance curve of an AC circuit is related to the quality factor or Q value. (Q should not be confused with the charge on the capacitor.) The sharper the resonance, the larger the value of Q. Circuits with high Q (and hence a sharp resonance) are useful for tuning circuits in a radio so that only one station is heard at a time. We define Q = ω0/∆ω, where the width ∆ω is the frequency interval between points on the resonance curve Imax(ω) that are √2/2 of Imax at its maximum. Compute Q for the values of R, L, and C given in part (a).
Change the value of R by 10% and compute the corresponding percentage change in Q. What is the corresponding change in Q if L or C is changed by 10%?

(c) Compute the time dependence of the voltage drops across each circuit element for approximately fifteen frequencies ranging from 1/10 to 10 times the resonant frequency. Plot the time dependence of the voltage drops.

(d) The ratio of the amplitude of the sinusoidal source voltage to the amplitude of the current is called the impedance Z of the circuit; that is, Z = Vmax/Imax. This definition of Z is a generalization of the resistance that is defined by the relation V = IR for direct current circuits. Use the plots of part (c) to determine Imax and Vmax for different frequencies and verify that the impedance is given by

Z(ω) = [R^2 + (ωL − 1/ωC)^2]^(1/2).   (4.25)

For what value of ω is Z a minimum? Note that the relation V = IZ holds only for the maximum values of I and V and not for I and V at any time.

(e) Compute the phase difference φR between the voltage drop across the resistor and the voltage source. Consider ω ≪ ω0, ω = ω0, and ω ≫ ω0. Does the current lead or lag the voltage in each case; that is, does the current reach a maximum before or after the voltage? Also compute the phase differences φL and φC and describe their dependence on ω. Do the relative phase differences between VC, VR, and VL depend on ω?

(f) Compute the amplitude of the voltage drops across the inductor and the capacitor at the resonant frequency. How do these voltage drops compare to the voltage drop across the resistor and to the source voltage? Also compare the relative phases of VC and VL at resonance. Explain how an RLC circuit can be used to amplify the input voltage.

4.6 Accuracy and Stability

Now that we have learned how to use numerical methods to find numerical solutions to simple first-order differential equations, we need to develop some practical guidelines to help us estimate the accuracy of the various methods. Because we have replaced a differential equation by a difference equation, our numerical solution is not identically equal to the true solution of the original differential equation, except for special cases. The discrepancy between the two solutions has two causes.

One cause is that computers do not store numbers with infinite precision, but rather to a maximum number of digits that is hardware and software dependent. As we have seen, Java allows the programmer to distinguish between floating point numbers, that is, numbers with decimal points, and integer numbers. Arithmetic with numbers represented by integers is exact, but we cannot solve a differential equation using integer arithmetic. Arithmetic operations involving floating point numbers, such as addition and multiplication, introduce roundoff error. For example, if a computer only stored floating point numbers to two significant figures, the product 2.1 × 3.2 would be stored as 6.7 rather than 6.72. The significance of roundoff errors is that they accumulate as the number of mathematical operations increases. Ideally, we should choose algorithms that do not significantly magnify the roundoff error; for example, we should avoid subtracting numbers that are nearly the same in magnitude.
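A short demonstration of roundoff accumulation follows (an illustrative sketch, not from the book). Because 0.1 has no exact binary representation, repeatedly adding it drifts slightly away from the exact answer.

public class RoundoffApp { // hypothetical class name
  public static void main(String[] args) {
    double sum = 0;
    for(int i = 0; i < 1000000; i++) {
      sum += 0.1; // 0.1 is not exactly representable in binary
    }
    // the accumulated sum is close to, but not exactly, 100000
    System.out.println("sum = " + sum);
  }
}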
The other source of the discrepancy between the true answer and the computed answer is the error associated with the choice of algorithm. This error is called the truncation error. A truncation error would exist even on an idealized computer that stored floating point numbers with infinite precision and hence had no roundoff error. Because the truncation error depends on the choice of algorithm and can be controlled by the programmer, you should be motivated to learn more about numerical analysis and the estimation of truncation errors. However, there is no general prescription for the best algorithm for obtaining numerical solutions of differential equations. We will find in later chapters that the various algorithms have advantages and disadvantages, and the appropriate selection depends on the nature of the solution, which you might not know in advance, and on your objectives. How accurate must the answer be? Over how large an interval do you need the solution? What kind of computer(s) are you using? How much computer time and personal time do you have?

In practice, we usually can determine the accuracy of a numerical solution by reducing the value of ∆t until the numerical solution is unchanged at the desired level of accuracy. Of course, we have to be careful not to make ∆t too small, because too many steps would be required and the computation time and roundoff error would increase.

In addition to accuracy, another important consideration is the stability of an algorithm. As discussed in Appendix 3A, it might happen that the numerical results are very good for short times but diverge from the true solution for longer times. This divergence might occur if small errors in the algorithm are multiplied many times, causing the error to grow geometrically. Such an algorithm is said to be unstable for the particular problem. We consider the accuracy and the stability of the Euler algorithm in Problems 4.15 and 4.16.

Problem 4.15. Accuracy of the Euler algorithm

(a) Use the Euler algorithm to compute the numerical solution of dy/dx = 2x with y = 0 at x = 0 and ∆x = 0.1, 0.05, 0.025, 0.01, and 0.005. Make a table showing the difference between the exact solution and the numerical solution. Is the difference between these solutions a decreasing function of ∆x? That is, if ∆x is decreased by a factor of two, how does the difference change? Plot the difference as a function of ∆x. If your points fall approximately on a straight line, then the difference is proportional to ∆x (for ∆x ≪ 1). The numerical method is called nth order if the difference between the analytic solution and the numerical solution is proportional to (∆x)^n for a fixed value of x. What is the order of the Euler algorithm? (A minimal Euler iteration is sketched after this problem.)

(b) One way to determine the accuracy of a numerical solution is to repeat the calculation with a smaller step size and compare the results. If the two calculations agree to p decimal places, we can reasonably assume that the results are correct to p decimal places. What value of ∆x is necessary for 0.1% accuracy at x = 2? What value of ∆x is necessary for 0.1% accuracy at x = 4?
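The sketch below shows the bare Euler iteration for part (a) with one choice of ∆x; the class name is illustrative. Comparing the printed value with the exact y(2) = 4 exhibits the error that part (a) asks you to tabulate.

public class EulerErrorApp { // hypothetical class name
  public static void main(String[] args) {
    double dx = 0.05;                 // step size; vary as in part (a)
    double x = 0, y = 0;              // initial condition y(0) = 0
    int n = (int) Math.round(2.0/dx); // number of steps to reach x = 2
    for(int i = 0; i < n; i++) {
      y += 2*x*dx;                    // Euler step for dy/dx = 2x
      x += dx;
    }
    System.out.println("y(2) = " + y + ", error = " + (4.0 - y));
  }
}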
Problem 4.16. Stability of the Euler algorithm

(a) Consider the differential equation (4.23) with Q = 0 at t = 0. This equation represents the charging of a capacitor in an RC circuit with a constant applied voltage V. Choose R = 2000 Ω, C = 10^−6 farads, and V = 10 volts. Do you expect Q(t) to increase with t? Does Q(t) increase indefinitely, or does it reach a steady-state value? Use a program to solve (4.23) numerically using the Euler algorithm. What value of ∆t is necessary to obtain three decimal accuracy at t = 0.005?

(b) What is the nature of your numerical solution to (4.23) at t = 0.05 for ∆t = 0.005, 0.0025, and 0.001? Does a small change in ∆t lead to a large change in the computed value of Q? Is the Euler algorithm stable for reasonable values of ∆t?

4.7 Projects

Project 4.17. Chemical oscillations

The kinetics of chemical reactions can be modeled by a system of coupled first-order differential equations. As an example, consider the following reaction:

A + 2B → 3B + C   (4.26)

where A, B, and C represent the concentrations of three different types of molecules. The corresponding rate equations for this reaction are

dA/dt = −kAB^2   (4.27a)
dB/dt = kAB^2   (4.27b)
dC/dt = kAB^2.   (4.27c)

The rate at which the reaction proceeds is determined by the reaction constant k. The terms on the right-hand side of (4.27) are positive if the concentration of the molecule increases in (4.26), as it does for B and C, and negative if the concentration decreases, as it does for A. Note that the term 2B in the reaction (4.26) appears as B^2 in the rate equation (4.27). In (4.27) we have assumed that the reactants are well stirred, so that there are no spatial inhomogeneities. In Section 7.8 we will discuss the effects of spatial inhomogeneities due to molecular diffusion.

Most chemical reactions proceed to equilibrium, where the mean concentrations of all molecules are constant. However, if the concentrations of some molecules are replenished, it is possible to observe oscillations and chaotic behavior (see Chapter 6). To obtain oscillations, it is essential to have a series of chemical reactions such that the products of some reactions are the reactants of others. In the following, we consider a simple set of reactions that can lead to oscillations under certain conditions (see Lefever and Nicolis):

A → X   (4.28a)
B + X → Y + D   (4.28b)
2X + Y → 3X   (4.28c)
X → C.   (4.28d)

If we assume that the reverse reactions are negligible and A and B are held constant by an external source, the corresponding rate equations are

dX/dt = A − (B + 1)X + X^2 Y   (4.29a)
dY/dt = BX − X^2 Y.   (4.29b)

For simplicity, we have chosen the rate constants to be unity.

(a) The steady state solution of (4.29) can be found by setting dX/dt and dY/dt equal to zero. Show that the steady state values for (X,Y) are (A, B/A).

(b) Write a program to solve numerically the rate equations given by (4.29). Your program should input the initial values of X and Y and the fixed concentrations A and B, and plot X versus Y as the reactions evolve. (A sketch of the rate equations appears after this project.)

(c) Systematically vary the initial values of X and Y for given values of A and B. Are their steady state behaviors independent of the initial conditions?

(d) Let the initial value of (X,Y) equal (A + 0.001, B/A) for several different values of A and B; that is, choose initial values close to the steady state values. Classify which initial values result in steady state behavior (stable) and which ones show periodic behavior (unstable). Find the relation between A and B that separates the two types of behavior.
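A minimal sketch of the rate equations (4.29) in the ODE form used throughout this chapter; the class name and the initial values are illustrative assumptions.

import org.opensourcephysics.numerics.ODE;

public class ChemicalOscillator implements ODE { // hypothetical class name
  double A = 1, B = 3;        // fixed concentrations
  double[] state = {1, 1, 0}; // {X, Y, t}

  public double[] getState() { return state; }

  public void getRate(double[] state, double[] rate) {
    double X = state[0], Y = state[1];
    rate[0] = A - (B + 1)*X + X*X*Y; // (4.29a): dX/dt
    rate[1] = B*X - X*X*Y;           // (4.29b): dY/dt
    rate[2] = 1;                     // time rate
  }
}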
Project 4.18. Nerve impulses

In 1952 Hodgkin and Huxley developed a model of nerve impulses to understand the nerve membrane potential of a giant squid nerve cell. The equations they developed are known as the Hodgkin-Huxley equations. The idea is that a membrane can be treated as a capacitor where CV = q, and thus the time rate of change of the membrane potential V is proportional to the current dq/dt flowing through the membrane. This current is due to the pumping of sodium and potassium ions through the membrane, a leakage current, and an external current stimulus. The model is capable of producing single nerve impulses, trains of nerve impulses, and other effects. The model is described by the following first-order differential equations:

C dV/dt = −gK n^4 (V − VK) − gNa m^3 h (V − VNa) − gL (V − VL) + Iext(t)   (4.30a)
dn/dt = αn(1 − n) − βn n   (4.30b)
dm/dt = αm(1 − m) − βm m   (4.30c)
dh/dt = αh(1 − h) − βh h   (4.30d)

where V is the membrane potential in millivolts (mV); n, m, and h are time dependent functions that describe the gates that pump ions into or out of the cell; C is the membrane capacitance per unit area; the gi are the conductances per unit area for potassium, sodium, and the leakage current; the Vi are the equilibrium potentials for each of the currents; and the αj and βj are nonlinear functions of V. We use the notation n, m, and h for the gate functions because this notation is universally used in the literature. These gate functions are empirical attempts to describe how the membrane controls the flow of ions into and out of the nerve cell. Hodgkin and Huxley found the following empirical forms for αj and βj:

αn = 0.01(V + 10)/[e^(1+V/10) − 1]   (4.31a)
βn = 0.125 e^(V/80)   (4.31b)
αm = 0.1(V + 25)/[e^(2.5+V/10) − 1]   (4.31c)
βm = 4 e^(V/18)   (4.31d)
αh = 0.07 e^(V/20)   (4.31e)
βh = 1/[e^(3+V/10) + 1].   (4.31f)

The values of the parameters are C = 1.0 µF/cm^2, gK = 36 mmho/cm^2, gNa = 120 mmho/cm^2, gL = 0.3 mmho/cm^2, VK = 12 mV, VNa = −115 mV, and VL = 10.6 mV. The unit mho represents ohm^−1, and the unit of time is milliseconds (ms). These parameters assume that the resting potential of the nerve cell is zero; however, we now know that the resting potential is about −70 mV.

We can use the ODE solver to solve (4.30) with the state vector {V, n, m, h, t}; the rates are given by the right-hand side of (4.30).
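A sketch of what such an implementation might look like follows; the class name and the initial gate values are illustrative assumptions, and the α and β expressions transcribe (4.31).

import org.opensourcephysics.numerics.ODE;

public class HodgkinHuxley implements ODE { // hypothetical class name
  final static double C = 1.0, gK = 36, gNa = 120, gL = 0.3;
  final static double VK = 12, VNa = -115, VL = 10.6;
  double Iext = 0;                         // external current stimulus
  double[] state = {0, 0.3, 0.05, 0.6, 0}; // {V, n, m, h, t}; illustrative start

  public double[] getState() { return state; }

  public void getRate(double[] s, double[] rate) {
    double V = s[0], n = s[1], m = s[2], h = s[3];
    rate[0] = (-gK*n*n*n*n*(V - VK) - gNa*m*m*m*h*(V - VNa)
               - gL*(V - VL) + Iext)/C;                    // (4.30a)
    rate[1] = an(V)*(1 - n) - 0.125*Math.exp(V/80)*n;      // (4.30b)
    rate[2] = am(V)*(1 - m) - 4*Math.exp(V/18)*m;          // (4.30c)
    rate[3] = 0.07*Math.exp(V/20)*(1 - h)
              - h/(Math.exp(3 + V/10) + 1);                // (4.30d)
    rate[4] = 1;                                           // time rate
  }

  static double an(double V) { return 0.01*(V + 10)/(Math.exp(1 + V/10) - 1); }
  static double am(double V) { return 0.1*(V + 25)/(Math.exp(2.5 + V/10) - 1); }
}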
The following questions ask you to explore the properties of the model.

(a) Write a program to plot n, m, and h as a function of V in the steady state (for which ṅ = ṁ = ḣ = 0). Describe how these gates are operating.

(b) Write a program to simulate the nerve cell membrane potential and plot V(t). You can use a simple Euler algorithm with a time step of 0.01 ms. Describe the behavior of the potential when the external current is 0.

(c) Consider a current that is zero except for a one millisecond interval. Try a current spike amplitude of 7 µA (that is, the external current equals 7 in our units). Describe the resulting nerve impulse V(t). Is there a threshold value for the current below which there is no large spike but only a broad peak?

(d) A constant current should produce a train of spikes. Try different amplitudes for the current and determine if there is a threshold current and how the spacing between spikes depends on the amplitude of the external current.

(e) Consider a situation where there is a steady external current I1 for 20 ms and then the current increases to I2 = I1 + ∆I. There are three types of behavior depending on I2 and ∆I. Describe the behavior for the following four situations: (1) I1 = 2.0 µA, ∆I = 1.5 µA; (2) I1 = 2.0 µA, ∆I = 5.0 µA; (3) I1 = 7.0 µA, ∆I = 1.0 µA; and (4) I1 = 7.0 µA, ∆I = 4.0 µA. Try other values of I1 and ∆I as well. In which cases do you obtain a steady spike train? Which cases produce a single spike? What other behavior do you find?

(f) Once a spike is triggered, it is frequently difficult to trigger another spike. Consider a current pulse at t = 20 ms of 7 µA that lasts for one millisecond. Then give a second current pulse of the same amplitude and duration at t = 25 ms. What happens? What happens if you add a third pulse at 30 ms?

References and Suggestions for Further Reading

F. S. Acton, Numerical Methods That Work (The Mathematical Association of America, 1999), Chapter 5.

G. L. Baker and J. P. Gollub, Chaotic Dynamics: An Introduction, 2nd ed. (Cambridge University Press, 1996). A good introduction to the notion of phase space.

Eugene I. Butikov, "Square-wave excitation of a linear oscillator," Am. J. Phys. 72, 469–476 (2004).

A. Douglas Davis, Classical Mechanics (Saunders College Publishing, 1986). The author gives simple numerical solutions of Newton's equations of motion. Much emphasis is given to the harmonic oscillator problem.

S. Eubank, W. Miner, T. Tajima, and J. Wiley, "Interactive computer simulation and analysis of Newtonian dynamics," Am. J. Phys. 57, 457–463 (1989).

Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics, Vol. 1 (Addison-Wesley, 1963). Chapters 21 and 23–25 are devoted to various aspects of harmonic motion.

A. P. French, Newtonian Mechanics (W. W. Norton & Company, 1971). An introductory level text with a good discussion of oscillatory motion.

M. Gitterman, "Classical harmonic oscillator with multiplicative noise," Physica A 352, 309–334 (2005). The analysis is analytical and at the graduate level. However, it would be straightforward to reproduce most of the results after you learn about random processes in Chapter 7.

A. L. Hodgkin and A. F. Huxley, "A quantitative description of ion currents and its applications to conduction and excitation in nerve membranes," J. Physiol. (Lond.) 117, 500–544 (1952).

Charles Kittel, Walter D. Knight, and Malvin A. Ruderman, Mechanics, 2nd ed., revised by A. Carl Helmholz and Burton J. Moyer (McGraw-Hill, 1973).

R. Lefever and G. Nicolis, "Chemical instabilities and sustained oscillations," J. Theor. Biol. 30, 267 (1971).

Jerry B. Marion and Stephen T. Thornton, Classical Dynamics, 5th ed. (Harcourt, 2004). Excellent discussion of linear and nonlinear oscillators.

M. F. McInerney, "Computer-aided experiments with the damped harmonic oscillator," Am. J. Phys. 53, 991–996 (1985).

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). Chapter 16 discusses the integration of ordinary differential equations.

Scott Hamilton, An Analog Electronics Companion (Cambridge University Press, 2003). A good discussion of the physics and mathematics of basic circuit design, including an extensive introduction to circuit simulation using the PSpice simulation program.

S. C. Zilio, "Measurement and analysis of large-angle pendulum motion," Am. J. Phys. 50, 450–452 (1982).

Chapter 5
Few-Body Problems: The Motion of the Planets

We apply Newton's laws of motion to planetary motion and other systems of a few particles and explore some of the counterintuitive consequences of Newton's laws.

5.1 Planetary Motion

Planetary motion is of special significance because it played an important role in the conceptual history of the mechanical view of the universe. Few theories have affected Western civilization as much as Newton's laws of motion and the law of gravitation, which together relate the motion of the heavens to the motion of terrestrial bodies.
Much of our knowledge of planetary motion is summarized by Kepler's three laws, which can be stated as:

1. Each planet moves in an elliptical orbit with the sun located at one of the foci of the ellipse.

2. The speed of a planet increases as its distance from the sun decreases, such that the line from the sun to the planet sweeps out equal areas in equal times.

3. The ratio T^2/a^3 is the same for all planets that orbit the sun, where T is the period of the planet and a is the semimajor axis of the ellipse.

Kepler obtained these laws by a careful analysis of the observational data collected over many years by Tycho Brahe. Kepler's first and third laws describe the shape of the orbit rather than the time dependence of the position and velocity of a planet. Because it is not possible to obtain this time dependence in terms of elementary functions, we will obtain the numerical solution of the equations of motion of planets and satellites in orbit. In addition, we will consider the effects of perturbing forces on the orbit and problems that challenge our intuitive understanding of Newton's laws of motion.

5.2 The Equations of Motion

The motion of the Sun and Earth is an example of a two-body problem. We can reduce this problem to a one-body problem in one of two ways. The easiest way is to use the fact that the mass of the Sun is much greater than the mass of the Earth. Hence we can assume that, to a good approximation, the Sun is stationary and is a convenient choice of the origin of our coordinate system. If you are familiar with the concept of a reduced mass, you know that the reduction to a one-body problem is more general. That is, the motion of two objects of mass m and M, whose total potential energy is a function only of their relative separation, can be reduced to an equivalent one-body problem for the motion of an object of reduced mass µ given by

µ = Mm/(m + M).   (5.1)

Because the mass of the Earth, m = 5.99 × 10^24 kg, is so much smaller than the mass of the Sun, M = 1.99 × 10^30 kg, we find that for most practical purposes the reduced mass of the Sun and the Earth is that of the Earth alone. In the following, we consider the problem of a single particle of mass m moving about a fixed center of force, which we take as the origin of the coordinate system.

Newton's universal law of gravitation states that a particle of mass M attracts another particle of mass m with a force given by

F = −(GMm/r^2) r̂ = −(GMm/r^3) r   (5.2)

where the vector r is directed from M to m (see Figure 5.1). The negative sign in (5.2) implies that the gravitational force is attractive; that is, it tends to decrease the separation r. The gravitational constant G is determined experimentally to be

G = 6.67 × 10^−11 m^3/(kg · s^2).   (5.3)

The force law (5.2) applies to the motion of the center of mass for objects of negligible spatial extent. Newton delayed publication of his law of gravitation for twenty years while he invented integral calculus and showed that (5.2) also applies to any uniform sphere or spherical shell of matter if the distance r is measured from the center of each mass.

The gravitational force has two general properties: its magnitude depends only on the separation of the particles, and its direction is along the line joining the particles. Such a force is called a central force.
The assumption of a central force implies that the orbit of the Earth is restricted to a plane (x-y) and that the angular momentum L is conserved and lies in the third (z) direction. We write Lz in the form

Lz = (r × mv)z = m(x vy − y vx)   (5.4)

where we have used the cross-product definition L = r × p and p = mv. An additional constraint on the motion is that the total energy E is conserved and is given by

E = (1/2)mv^2 − GMm/r.   (5.5)

Figure 5.1: An object of mass m moves under the influence of a central force F. Note that cos θ = x/r and sin θ = y/r, which provide useful relations for writing the equations of motion in component form suitable for numerical solutions.

If we fix the coordinate system at the mass M, the equation of motion of the particle of mass m is

m d^2r/dt^2 = −(GMm/r^3) r.   (5.6)

It is convenient to write the force in Cartesian coordinates (see Figure 5.1):

Fx = −(GMm/r^2) cos θ = −(GMm/r^3) x   (5.7a)
Fy = −(GMm/r^2) sin θ = −(GMm/r^3) y.   (5.7b)

Hence, the equations of motion in Cartesian coordinates are

d^2x/dt^2 = −(GM/r^3) x   (5.8a)
d^2y/dt^2 = −(GM/r^3) y   (5.8b)

where r^2 = x^2 + y^2. Equations (5.8a) and (5.8b) are examples of coupled differential equations because each equation contains both x and y.

5.3 Circular and Elliptical Orbits

Because many planetary orbits are nearly circular, it is useful to obtain the condition for a circular orbit. The magnitude of the acceleration a is related to the radius r of the circular orbit by

a = v^2/r   (5.9)

where v is the speed of the object. The acceleration is always directed toward the center and is due to the gravitational force. Hence, we have

mv^2/r = GMm/r^2   (5.10)

and

v = (GM/r)^(1/2).   (5.11)

The relation (5.11) between the radius and the speed is the general condition for a circular orbit. We can also find the dependence of the period T on the radius of a circular orbit using the relation

T = 2πr/v   (5.12)

in combination with (5.11) to obtain

T^2 = (4π^2/GM) r^3.   (5.13)

The relation (5.13) is a special case of Kepler's third law with the radius r corresponding to the semimajor axis of an ellipse.

A simple geometrical characterization of an elliptical orbit is shown in Figure 5.2.

Figure 5.2: The characterization of an ellipse in terms of the semimajor axis a and the eccentricity e. The semiminor axis b is the distance OB. The origin O in Cartesian coordinates is at the center of the ellipse.

The two foci of an ellipse, F1 and F2, have the property that for any point P, the distance F1P + F2P is a constant. In general, an ellipse has two perpendicular axes of unequal length. The longer axis is the major axis; half of this axis is the semimajor axis a. The shorter axis is the minor axis; the semiminor axis b is half of this distance. It is common to specify an elliptical orbit by a and by the eccentricity e, where e is the ratio of the distance between the foci to the length of the major axis. Because F1P + F2P = 2a, it is easy to show that

e = (1 − b^2/a^2)^(1/2)   (5.14)

with 0 < e < 1. (Choose the point P at x = 0, y = b.) A special case is b = a, for which the ellipse reduces to a circle and e = 0.
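Before choosing more convenient units, it is worth checking (5.11) and (5.12) numerically for the Earth's nearly circular orbit. The fragment below is an illustrative sketch using the masses quoted in Section 5.2 and the Earth-Sun distance quoted in Section 5.4; it reproduces the Earth's orbital speed and a period of about one year.

public class CircularOrbitCheck { // hypothetical class name
  public static void main(String[] args) {
    double G = 6.67e-11;         // m^3/(kg s^2), from (5.3)
    double M = 1.99e30;          // mass of the Sun (kg)
    double r = 1.496e11;         // Earth-Sun distance (m)
    double v = Math.sqrt(G*M/r); // (5.11): about 3.0e4 m/s
    double T = 2*Math.PI*r/v;    // (5.12): about 3.2e7 s, one year
    System.out.println("v = " + v + " m/s, T = " + T + " s");
  }
}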
5.4 Astronomical Units

It is convenient to choose a system of units in which the magnitude of the product GM is not too large and not too small. To describe the Earth's orbit, the convention is to choose the length of the Earth's semimajor axis as the unit of length. This unit of length is called the astronomical unit (AU) and is

1 AU = 1.496 × 10^11 m.   (5.15)

The unit of time is taken to be one year, or 3.15 × 10^7 s. In these units, the period of the Earth is T = 1 year and its semimajor axis is a = 1 AU. Hence, from (5.13),

GM = 4π^2 a^3/T^2 = 4π^2 AU^3/yr^2   (astronomical units).   (5.16)

As an example of the use of astronomical units, a program distance of 1.5 would correspond to 1.5 × (1.496 × 10^11) = 2.244 × 10^11 m.

5.5 Log-log and Semilog Plots

The values of T and a for our solar system are given in Table 5.1. We first analyze these values and determine if T and a satisfy a simple mathematical relationship.

Suppose we wish to determine whether two variables y and x satisfy a functional relationship, y = f(x). To simplify the analysis, we ignore possible errors in the measurements of y and x. The simplest relation between y and x is linear; that is, y = mx + b. The existence of such a relation can be seen by plotting y versus x and finding if the plot is linear. From Table 5.1 we see that T is not a linear function of a. For example, an increase in T from 0.24 to 1, a factor of approximately 4, corresponds to an increase in a by a factor of only about 2.5.

For many problems, it is reasonable to assume an exponential relation

y = C e^(rx)   (5.17)

or a power law relation

y = C x^n   (5.18)

where C, r, and n are unknown parameters. If we assume the exponential form (5.17), we can take the natural logarithm of both sides to find

ln y = ln C + rx.   (5.19)

Hence, if (5.17) is applicable, a plot of ln y versus x would yield a straight line with slope r and intercept ln C. The natural logarithm of both sides of the power law relation (5.18) yields

ln y = ln C + n ln x.   (5.20)

If (5.18) applies, a plot of ln y versus ln x yields the exponent n (the slope), which is the usual quantity of physical interest if a power law dependence holds.

Planet     T (Earth years)   a (AU)
Mercury    0.241             0.387
Venus      0.615             0.723
Earth      1.0               1.0
Mars       1.88              1.523
Jupiter    11.86             5.202
Saturn     29.5              9.539
Uranus     84.0              19.18
Neptune    165               30.06
Pluto      248               39.44

Table 5.1: The period T and semimajor axis a of the planets. The unit of length is the astronomical unit (AU). The unit of time is one (Earth) year.

We illustrate a simple analysis of the data in Table 5.1. Because we expect that the relation between T and a has the power law form T = Ca^n, we plot ln T versus ln a (see Figure 5.3). A visual inspection of the plot indicates that a linear relationship between ln T and ln a is reasonable and that the slope is approximately 1.50, in agreement with Kepler's third law. In Chapter 7 we will discuss the least squares method for fitting a straight line through a number of data points. With a little practice, you can do a visual analysis that is nearly as good.

The PlotFrame class contains the axes and titles needed to produce linear, log-log, and semilog plots. It also contains the methods needed to display data in a table format. This table can be displayed programmatically or by right-clicking (control-clicking) at runtime. Listing 5.1 shows a short program that produces a log-log plot of the orbital period of the planets versus the semimajor axis. The arrays a and period contain the semimajor axis of the planets and their periods, respectively.
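Before plotting, a quick two-point estimate of the exponent (an illustrative check using the Mercury and Pluto rows of Table 5.1; the class name is hypothetical) already gives a slope close to 1.5:

public class SlopeEstimate { // hypothetical helper
  public static void main(String[] args) {
    // two-point estimate of n in T = C a^n from the Mercury and Pluto data
    double n = Math.log(248/0.241)/Math.log(39.44/0.387);
    System.out.println("n = " + n); // approximately 1.5
  }
}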
Setting the log scale option causes the PlotFrame to transform the data as it is being plotted and causes the axis to change how labels are rendered. Note that the plot automatically adjusts itself to fit the data because the autoscale option is true by default. Also, the grid and the tick-labels change as the window is resized.

Listing 5.1: A simple program that produces a log-log plot to demonstrate Kepler's third law.
package org.opensourcephysics.sip.ch05;
import org.opensourcephysics.frames.PlotFrame;

public class SecondLawPlotApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("ln(a)", "ln(T)", "Kepler's third law");
    frame.setLogScale(true, true);
    frame.setConnected(false);
    double[] period = {0.241, 0.615, 1.0, 1.88, 11.86, 29.50, 84.0, 165, 248};
    double[] a = {0.387, 0.723, 1.0, 1.523, 5.202, 9.539, 19.18, 30.06, 39.44};
    frame.append(0, a, period);
    frame.setVisible(true);
    // defines titles of table columns
    frame.setXYColumnNames(0, "a (AU)", "T (years)");
    // shows data table; can also be done from frame menu
    frame.showDataTable(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

Figure 5.3: Plot of ln T versus ln a using the data in Table 5.1. Verify that the slope is 1.50.

x      y1(x)    y2(x)   y3(x)
0      0.00     0.00    2.00
0.5    0.75     1.59    5.44
1.0    3.00     2.00    14.78
1.5    6.75     2.29    40.17
2.0    12.00    2.52    109.20
2.5    18.75    2.71    296.83

Table 5.2: Determine the functional forms of y(x) for the three sets of data. There are no measurement errors, but there are roundoff errors.

Exercise 5.1. Simple functional forms

(a) Run SecondLawPlotApp and convince yourself that you understand the syntax.

(b) Modify SecondLawPlotApp so that the three sets of data shown in Table 5.2 are plotted. Generate linear, semilog, and log-log plots to determine the functional form of y(x) that best fits each data set.

5.6 Simulation of the Orbit

We now develop a program to simulate the Earth's orbit about the Sun. The PlanetApp class shown in Listing 5.2 organizes the startup process and creates the visualization. Because this class extends AbstractSimulation, it is sufficient to know that the superclass invokes the doStep method periodically when the thread is running or once each time the Step button is clicked. The preferred scale and the aspect ratio for the plot frame are set in the constructor. The statement frame.setSquareAspect(true) ensures that a unit of distance will equal the same number of pixels in both the horizontal and vertical directions. The statement planet.initialize(new double[]{x, vx, y, vy, 0}) in the initialize method creates an array on the fly as the argument to another method.

Listing 5.2: PlanetApp.
package org.opensourcephysics.sip.ch05;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class PlanetApp extends AbstractSimulation {
  PlotFrame frame = new PlotFrame("x (AU)", "y (AU)", "Planet Simulation");
  Planet planet = new Planet();

  public PlanetApp() {
    frame.addDrawable(planet);
    frame.setPreferredMinMax(-5, 5, -5, 5);
    frame.setSquareAspect(true);
  }

  public void doStep() {
    for(int i = 0; i < 5; i++) { // do 5 steps between screen draws
      planet.doStep();           // advances time
    }
    frame.setMessage("t = " + decimalFormat.format(planet.state[4]));
  }

  public void initialize() {
    planet.odeSolver.setStepSize(control.getDouble("dt"));
    double x = control.getDouble("x");
    double vx = control.getDouble("vx");
    double y = control.getDouble("y");
    double vy = control.getDouble("vy");
    // create an array on the fly as the argument to another method
    planet.initialize(new double[] {x, vx, y, vy, 0});
    frame.setMessage("t = 0");
  }

  public void reset() {
    control.setValue("x", 1);
    control.setValue("vx", 0);
    control.setValue("y", 0);
    control.setValue("vy", 6.28);
    control.setValue("dt", 0.01);
    initialize();
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new PlanetApp());
  }
}

The Planet class in Listing 5.3 defines the physics and instantiates the numerical method. The latter is the Euler algorithm, which will be replaced in Problem 5.2. Note how the argument to the initialize method is used. The System.arraycopy(array1, index1, array2, index2, length) method in the core Java API copies blocks of memory, such as arrays, and is optimized for particular operating systems. This method copies length elements of array1 starting at index1 into array2 starting at index2. In most applications index1 and index2 will be set equal to 0.

Listing 5.3: Class that models the rate equation for a planet acted on by an inverse square law force.
package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Planet implements Drawable, ODE {
  // GM in units of (AU)^3/(yr)^2
  final static double GM = 4*Math.PI*Math.PI;
  Circle circle = new Circle();
  Trail trail = new Trail();
  double[] state = new double[5];    // {x, vx, y, vy, t}
  Euler odeSolver = new Euler(this); // creates numerical method

  public void doStep() {
    odeSolver.step();                   // advances time
    trail.addPoint(state[0], state[2]); // x, y
  }

  void initialize(double[] initState) {
    System.arraycopy(initState, 0, state, 0, initState.length);
    // reinitializes the solver in case the solver accesses data
    // from previous steps
    odeSolver.initialize(odeSolver.getStepSize());
    trail.clear();
  }

  public void getRate(double[] state, double[] rate) {
    // state[]: x, vx, y, vy, t
    double r2 = (state[0]*state[0]) + (state[2]*state[2]); // r squared
    double r3 = r2*Math.sqrt(r2);                          // r cubed
    rate[0] = state[1];          // x rate
    rate[1] = (-GM*state[0])/r3; // vx rate
    rate[2] = state[3];          // y rate
    rate[3] = (-GM*state[2])/r3; // vy rate
    rate[4] = 1;                 // time rate
  }

  public double[] getState() {
    return state;
  }

  public void draw(DrawingPanel panel, Graphics g) {
    circle.setXY(state[0], state[2]);
    circle.draw(panel, g);
    trail.draw(panel, g);
  }
}
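Problem 5.2 will ask you to compare differential equation solvers. Because Planet talks to its solver only through the methods of the ODESolver interface, swapping algorithms is essentially a one-line change; the following is a sketch, assuming the RK4 class in the org.opensourcephysics.numerics package:

// in Planet, replace
//   Euler odeSolver = new Euler(this);
// with, for example,
ODESolver odeSolver = new RK4(this); // fourth-order Runge-Kutta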
The Planet class implements the Drawable interface and defines the draw method as described in Section 3.3. In this case we did not use graphics primitives such as fillOval to perform the drawing. Instead, the method calls the methods circle.draw and trail.draw to draw the planet and its trajectory, respectively. Invoking a method in another object that has the desired functionality is known as forwarding or delegating the method. One advantage of forwarding is that we can change the implementation of the drawing within the Planet class at any time and still be assured that the planet object is drawable. We could, for example, replace the circle by an image of the Earth. Note that we have created a composite object by combining the properties of the simpler circle and trail objects. These techniques of encapsulation and composition are common in object oriented programming.

Problem 5.2. Verification of Planet and PlanetApp for circular orbits

(a) Verify Planet and PlanetApp by considering the special case of a circular orbit. For example, choose (in astronomical units) x(t = 0) = 1, y(t = 0) = 0, and vx(t = 0) = 0. Use the relation (5.11) to find the value of vy(t = 0) that yields a circular orbit. How small a value of ∆t is needed so that a circular orbit is repeated over many periods? Your answer will depend on your choice of differential equation solver. Find the largest value of ∆t that yields an orbit that repeats for many revolutions using the Euler, Euler-Cromer, Verlet, and RK4 algorithms. Is it possible to choose a smaller value of ∆t, or are some algorithms, such as the Euler method, simply not stable for this dynamical system?

(b) Write a method to compute the total energy [see (5.5)] and compute it at regular intervals as the system evolves. (It is sufficient to calculate the energy per unit mass, E/m.) For a given value of ∆t, which algorithm conserves the total energy best? Is it possible to choose a value of ∆t that conserves the energy exactly? What is the significance of the negative sign for the total energy?

(c) Write a separate method to determine the numerical value of the period. (See Problem 3.9c for a discussion of a similar condition.) Choose different sets of values of x(t = 0) and vy(t = 0), consistent with the condition for a circular orbit. For each orbit determine the radius and the period and verify Kepler's third law.

Problem 5.3. Verification of Kepler's second and third laws

(a) Set y(t = 0) = 0 and vx(t = 0) = 0 and find by trial and error several values of x(t = 0) and vy(t = 0) that yield elliptical orbits of a convenient size. Choose a suitable algorithm and plot the speed of the planet as the orbit evolves. Where is the speed a maximum (minimum)?

(b) Use the same initial conditions as in part (a) and compute the total energy, angular momentum, semimajor and semiminor axes, eccentricity, and period for each orbit. Plot your data for the dependence of the period T on the semimajor axis a and verify Kepler's third law. Given the ratio of T^2/a^3 that you found, determine the numerical value of this ratio in SI units for our solar system.

(c) The force center is at (x, y) = (0, 0) and is one focus. Find the second focus by symmetry. Compute the sum of the distances from each point on the orbit to the two foci and verify that the orbit is an ellipse.

(d) According to Kepler's second law, the orbiting object sweeps out equal areas in equal times. If we use an algorithm with a fixed time step ∆t, it is sufficient to compute the area of the triangle swept in each time step.
This area equals one-half the base of the triangle times its height, or (1/2)∆t |r × v| = (1/2)∆t (x vy − y vx). Is this area a constant? This constant corresponds to what physical quantity?

(e)∗ Show that algorithms with a fixed value of ∆t break down if the "planet" is too close to the sun. What is the cause of the failure of the method? What advantage might there be to using a variable time step? What are the possible disadvantages? (See Project 5.19 for an example where a variable time step is very useful.)

Problem 5.4. Noninverse square forces

(a) Consider the dynamical effects of a small change in the attractive inverse-square force law; for example, let the magnitude of the force equal Cm/r^(2+δ), where δ ≪ 1. For simplicity, take the numerical value of the constant C to be 4π^2 as before. Consider the initial conditions x(t = 0) = 1, y(t = 0) = 0, vx(t = 0) = 0, and vy(t = 0) = 5. Choose δ = 0.05 and determine the nature of the orbit. Does the orbit of the planet retrace itself? Verify that your result is not due to your choice of ∆t. Does the planet spiral away from or toward the sun? The path of the planet can be described as an elliptical orbit that slowly rotates or precesses in the same sense as the motion of the planet. A convenient measure of the precession is the angle between successive orientations of the semimajor axis of the ellipse. This angle is the rate of precession per revolution. Estimate the magnitude of this angle for your choice of δ. What is the effect of decreasing the semimajor axis for fixed δ? What is the effect of changing δ for fixed semimajor axis?

(b) Einstein's theory of gravitation (the general theory of relativity) predicts a correction to the force on a planet that varies as 1/r^4 due to a weak gravitational field. The result is that the equation of motion for the trajectory of a particle can be written as

d^2r/dt^2 = −(GM/r^2)[1 + α (GM/c^2)^2 (1/r^2)] r̂   (5.21)

where the parameter α is dimensionless. Take GM = 4π^2 and assume α = 10^−3. Determine the nature of the orbit for this potential. (For our solar system, the constant α is a maximum for the planet Mercury, but is much smaller than 10^−3.)

(c) Suppose that the attractive gravitational force law depends on the inverse cube of the distance, Cm/r^3. What are the units of C? For simplicity, take the numerical value of C to be 4π^2. Consider the initial condition x(t = 0) = 1, y(t = 0) = 0, vx(t = 0) = 0 and determine analytically the value of vy(t = 0) required for a circular orbit. How small a value of ∆t is needed so that the simulation yields a circular orbit over several periods? How does this value of ∆t compare with the value needed for the inverse-square force law?

Figure 5.4: (a) An impulse applied in the tangential direction. (b) An impulse applied in the radial direction.

(d) Vary vy(t = 0) by approximately 2% from the circular orbit condition that you determined in part (c). What is the nature of the new orbit? What is the sign of the total energy? Is the orbit bound? Is it closed? Are all bound orbits closed?

Problem 5.5. Effect of drag resistance on a satellite orbit

Consider a satellite in orbit about the Earth. In this case it is convenient to measure distances in terms of the radius of the Earth, R = 6.37 × 10^6 m, and the time in terms of hours. Because the force on the satellite is proportional to Gm, where m = 5.99 × 10^24 kg is the mass of the Earth, we need to evaluate the product Gm in Earth units (EU).
Problem 5.5. Effect of drag resistance on a satellite orbit
Consider a satellite in orbit about the Earth. In this case it is convenient to measure distances in terms of the radius of the Earth, R = 6.37 × 10⁶ m, and time in terms of hours. Because the force on the satellite is proportional to Gm, where m = 5.99 × 10²⁴ kg is the mass of the Earth, we need to evaluate the product Gm in Earth units (EU). In these units the value of Gm is given by

\[ Gm = 6.67\times10^{-11}\,\frac{\text{m}^3}{\text{kg}\cdot\text{s}^2}\,\left(\frac{1\,\text{EU}}{6.37\times10^{6}\,\text{m}}\right)^{3}\left(3.6\times10^{3}\,\text{s/h}\right)^{2}\left(5.99\times10^{24}\,\text{kg}\right) = 20.0\,\text{EU}^3/\text{h}^2 \quad \text{(Earth units)}. \tag{5.22} \]

Modify the Planet class to incorporate the effects of drag resistance on the motion of an orbiting Earth satellite. Assume that the drag force is proportional to the square of the speed of the satellite. To be able to observe the effects of air resistance in a reasonable time, take the magnitude of the drag force to be approximately one-tenth of the magnitude of the gravitational force. Choose initial conditions such that a circular orbit would be obtained in the absence of drag resistance, and allow at least one revolution before "switching on" the drag resistance. Describe the qualitative change of the orbit due to drag resistance. How do the total energy and the speed of the satellite change with time?

5.7 Impulsive Forces

What happens to the orbit of an Earth satellite when it is hit by space debris? We now discuss the modifications we need to make in Planet and PlanetApp so that we can apply an impulsive force (a kick) with a mouse click. If we apply a vertical kick when the position of the satellite is as shown in Figure 5.4a, the impulse is tangential to the orbit. A radial kick can be applied when the satellite is as shown in Figure 5.4b.

User actions, such as mouse clicks or keyboard entries, are passed from the operating system to Java event listeners. Although this standard Java framework is straightforward, we have simplified it to respond to mouse actions within the Open Source Physics panels and frames (see the Open Source Physics User's Guide for an extensive discussion of interactive drawing panels). In order for an Open Source Physics program to respond to mouse actions, the program implements the InteractiveMouseHandler interface and then registers its ability to process mouse actions with the PlotFrame. This procedure is demonstrated in the following test program. You can copy the handleMouseAction code into your program and replace the print statements with useful methods. Other mouse actions, such as MOUSE_CLICKED, MOUSE_MOVED, and MOUSE_ENTERED, are defined in the InteractivePanel class.

Listing 5.4: InteractiveMouseHandler interface test program.

package org.opensourcephysics.sip.ch05;
import java.awt.event.*;
import javax.swing.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class MouseApp implements InteractiveMouseHandler {
  PlotFrame frame = new PlotFrame("x", "y", "Interactive Handler");

  public MouseApp() {
    frame.setInteractiveMouseHandler(this);
    frame.setVisible(true);
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
  }

  public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
    switch(panel.getMouseAction()) {
      case InteractivePanel.MOUSE_DRAGGED :
        panel.setMessage("Dragged");
        break;
      case InteractivePanel.MOUSE_PRESSED :
        panel.setMessage("Pressed");
        break;
      case InteractivePanel.MOUSE_RELEASED :
        panel.setMessage(null);
        break;
    }
  }

  public static void main(String[] args) {
    new MouseApp();
  }
}

The switch statement is used in Listing 5.4 instead of a chain of if statements. The panel's getMouseAction method returns an integer. If this integer matches one of the named constants following a case label, then the statements following that constant are executed until a break statement is encountered. If a case does not include a break, execution continues with the next case. The equivalent of the else clause of an if statement is the default label, followed by the statements that are executed if none of the explicit cases occurs.
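If we want the mouse press itself to deliver the kick, the handler can change the satellite's velocity directly instead of printing a message. The following sketch is ours, not the text's implementation: it assumes the application keeps a reference planet to a Planet-like object whose state array is ordered {x, vx, y, vy, t}.

public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
  if (panel.getMouseAction() == InteractivePanel.MOUSE_PRESSED) {
    planet.state[3] += 0.1;      // impulse per unit mass added to vy (illustrative size)
    panel.setMessage("kicked");  // confirm the kick in the panel's message box
  }
}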
We now challenge your intuitive understanding of Newton's laws of motion by considering several perturbations of the motion of an orbiting object. Modify your planet program to simulate the effects of the perturbations in Problem 5.6. In each case answer the questions before doing the simulation.

Problem 5.6. Tangential and radial perturbations

(a) Suppose that a small tangential "kick" or impulsive force is applied to a satellite in a circular orbit about the Earth (see Figure 5.4a). Choose Earth units so that the numerical value of the product Gm is given by (5.22). Apply the impulsive force by stopping the program after the satellite has made several revolutions and clicking the mouse to apply the force. Recall that the impulse changes the momentum in the desired direction. In what direction does the orbit change? Is the orbit stable; for example, does a small impulse lead to a small change in the orbit? Does the orbit retrace itself indefinitely if no further perturbations are applied? Describe the shape of the perturbed orbit.

(b) How does the change in the orbit depend on the strength of the kick and its duration?

(c) Determine if the angular momentum and the total energy are changed by the perturbation.

(d) Apply a radial kick to the satellite as in Figure 5.4b and answer the same questions as in parts (a)–(c).

(e) Determine the stability of the inverse-cube force law (see Problem 5.4) to radial and tangential perturbations.

Mouse actions are not the only way to affect the simulation. We can also add custom buttons to the control. These buttons are added when the program is instantiated in the main method.

public static void main(String[] args) {
  // OSPControl is a superclass of SimulationControl
  OSPControl control = SimulationControl.createApp(new PlanetApp());
  control.addButton("doRadialKick", "Kick!", "Perform a radial kick");
}

Note that SimulationControl (and CalculationControl) extend the OSPControl superclass, where the addButton method is defined, and therefore support this method. We assign the value returned by the static createApp method to a variable of type OSPControl to highlight the object-oriented structure of the Open Source Physics library. The first parameter of the addButton method specifies the method that will be invoked when the button is clicked, the second parameter specifies the text label that will appear on the button, and the third parameter specifies the tool tip that will appear when the mouse hovers over the button. Custom buttons can be used for just about anything, but the corresponding method must be defined.

Exercise 5.7. Custom buttons
Use a custom button in Problem 5.6 rather than a mouse click to apply an impulsive force to the planet.

Figure 5.5: The orbit of a particle in velocity space. The vector w points from the origin in velocity space to the center of the circular orbit. The vector u points from the center of the orbit to the point (vx, vy).

5.8 Velocity Space

In Problem 5.6 your intuition might have been incorrect.
For example, you might have thought that the orbit would elongate in the direction of the kick. In fact the orbit does elongate but in a direction perpendicular to the kick. Do not worry; you are in good company! Few students have a good qualitative understanding of Newton’s law of motion, even after taking an introductory course in physics. A qualitative way of stating Newton’s second law is Forces act on the trajectories of particles by changing velocity, not position. If we fail to take into account this property of Newton’s second law, we will encounter physical situations that appear counterintuitive. Because force acts to change velocity, it is reasonable to consider both velocity and position on an equal basis. In fact position and momentum are treated in such a manner in advanced formulations of classical mechanics and in quantum mechanics. In Problem 5.8 we explore some of the properties of orbits in velocity space in the context of the bound motion of a particle in an inverse-square force. Modify your program so that the path in velocity space of the Earth is plotted. That is, plot the point (vx,vy) the same way you plotted the point (x,y). The path in velocity space is a series of successive values of the object’s velocity vector. If the position space orbit is an ellipse, what is the shape of the orbit in velocity space? Problem 5.8. Properties of velocity space orbits (a) Modify your program to display the orbit in position space and in velocity space at the same time. Verify that the velocity space orbit is a circle, even if the orbit in position space CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 123 is an ellipse. Does the center of this circle coincide with the origin (vx,vy) = (0,0) in velocity space? Choose the same initial conditions that you considered in Problems 5.2 and 5.3. (b)∗ Let u denote the radius vector of a point on the velocity circle and w denote the vector from the origin in velocity space to the center of the velocity circle (see Figure 5.5). Then the velocity of the particle can be written as v = u + w. (5.23) Compute u and verify that its magnitude is given by u = GMm/L (5.24) where L is the magnitude of the angular momentum. Note that L is proportional to m so that it is not necessary to know the magnitude of m. (c)∗ Verify that at each moment in time, the planet’s position vector r is perpendicular to u. Explain why this relation holds. Problem 5.9. Effect of impulses in velocity space How does the velocity space orbit change when an impulsive kick is applied in the tangential or in the radial direction? How do the magnitude and direction of w change? From the observed change in the velocity orbit and the above considerations, explain the observed change of the orbit in position space. 5.9 A Mini-Solar System So far our study of planetary orbits has been restricted to two-body central forces. However, the solar system is not a two-body system, because the planets exert gravitational forces on one another. Although the interplanetary forces are small in magnitude in comparison to the gravitational force of the sun, they can produce measurable effects. For example, the existence of Neptune was conjectured on the basis of a discrepancy between the experimentally measured orbit of Uranus and the predicted orbit calculated from the known forces. The presence of other planets implies that the total force on a given planet is not a central force. 
Furthermore, because the orbits of the planets are not exactly in the same plane, an analysis of the solar system must be extended to three dimensions if accurate calculations are required. However, for simplicity, we will consider a model of a two-dimensional solar system with two planets in orbit about a fixed sun.

The equations of motion of two planets of mass m1 and mass m2 can be written in vector form as (see Figure 5.6)

\[ m_1 \frac{d^2\mathbf{r}_1}{dt^2} = -\frac{GM m_1}{r_1^{3}}\,\mathbf{r}_1 + \frac{G m_1 m_2}{r_{21}^{3}}\,\mathbf{r}_{21} \tag{5.25a} \]
\[ m_2 \frac{d^2\mathbf{r}_2}{dt^2} = -\frac{GM m_2}{r_2^{3}}\,\mathbf{r}_2 - \frac{G m_1 m_2}{r_{21}^{3}}\,\mathbf{r}_{21} , \tag{5.25b} \]

where r1 and r2 are directed from the sun to planets 1 and 2, respectively, and r21 = r2 − r1 is the vector from planet 1 to planet 2.

Figure 5.6: The coordinate system used in (5.25). Planets of mass m1 and m2 orbit a sun of mass M.

It is convenient to divide (5.25a) by m1 and (5.25b) by m2 and to write the equations of motion as

\[ \frac{d^2\mathbf{r}_1}{dt^2} = -\frac{GM}{r_1^{3}}\,\mathbf{r}_1 + \frac{G m_2}{r_{21}^{3}}\,\mathbf{r}_{21} \tag{5.26a} \]
\[ \frac{d^2\mathbf{r}_2}{dt^2} = -\frac{GM}{r_2^{3}}\,\mathbf{r}_2 - \frac{G m_1}{r_{21}^{3}}\,\mathbf{r}_{21} . \tag{5.26b} \]

A numerical solution of (5.26) can be obtained by a straightforward extension of the Planet class, as shown in Listing 5.5. To simplify the drawing of the particle trajectories, the Planet2 class defines an inner class, Mass, which extends Circle and contains a Trail. Whenever a planet moves, a point is added to the trail so that its location and path are shown on the plot. Inner classes are an organizational convenience that save us the trouble of having to create another file, which in this case would be named Mass.java. When we compile the Planet2 class, we will produce a bytecode file named Planet2$Mass.class in addition to the file Planet2.class. Inner classes are most effective as short helper classes that work in conjunction with the containing class, because they have access to all the data (including private variables) in the containing class.

Listing 5.5: A class that implements the rate equation for two interacting planets acted on by an inverse-square law force.

package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class Planet2 implements Drawable, ODE {
  // GM in units of (AU)^3/(yr)^2
  final static double GM = 4*Math.PI*Math.PI;
  final static double GM1 = 0.04*GM;
  final static double GM2 = 0.001*GM;
  double[] state = new double[9];
  ODESolver odeSolver = new RK45MultiStep(this);
  Mass mass1 = new Mass(), mass2 = new Mass();

  public void doStep() {
    odeSolver.step();
    mass1.setXY(state[0], state[2]);
    mass2.setXY(state[4], state[6]);
  }

  public void draw(DrawingPanel panel, Graphics g) {
    mass1.draw(panel, g);
    mass2.draw(panel, g);
  }

  void initialize(double[] initState) {
    System.arraycopy(initState, 0, state, 0, initState.length);
    mass1.clear(); // clears data from the old trail
    mass2.clear();
    mass1.setXY(state[0], state[2]);
    mass2.setXY(state[4], state[6]);
  }

  public void getRate(double[] state, double[] rate) {
    // state[]: x1, vx1, y1, vy1, x2, vx2, y2, vy2, t
    double r1Squared = (state[0]*state[0]) + (state[2]*state[2]);
    double r1Cubed = r1Squared*Math.sqrt(r1Squared);
    double r2Squared = (state[4]*state[4]) + (state[6]*state[6]);
    double r2Cubed = r2Squared*Math.sqrt(r2Squared);
    double dx = state[4] - state[0]; // x12 separation
    double dy = state[6] - state[2]; // y12 separation
    double dr2 = (dx*dx) + (dy*dy);  // r12 squared
    double dr3 = Math.sqrt(dr2)*dr2; // r12 cubed
    rate[0] = state[1]; // x1 rate
    rate[2] = state[3]; // y1 rate
    rate[4] = state[5]; // x2 rate
    rate[6] = state[7]; // y2 rate
    rate[1] = ((-GM*state[0])/r1Cubed) + ((GM1*dx)/dr3); // vx1 rate
    rate[3] = ((-GM*state[2])/r1Cubed) + ((GM1*dy)/dr3); // vy1 rate
    rate[5] = ((-GM*state[4])/r2Cubed) - ((GM2*dx)/dr3); // vx2 rate
    rate[7] = ((-GM*state[6])/r2Cubed) - ((GM2*dy)/dr3); // vy2 rate
    rate[8] = 1; // time rate
  }

  public double[] getState() {
    return state;
  }

  class Mass extends Circle {
    Trail trail = new Trail();

    public void draw(DrawingPanel panel, Graphics g) {
      trail.draw(panel, g);
      super.draw(panel, g);
    }

    void clear() {
      trail.clear();
    }

    public void setXY(double x, double y) {
      super.setXY(x, y);
      trail.addPoint(x, y);
    }
  }
}

The target application, Planet2App, extends AbstractSimulation in the usual way. Because it is almost identical to Listing 5.2, it is not shown here. The complete program is available in the ch05 package.

Problem 5.10. Planetary perturbations
Use Planet2App with the initial conditions given in the program. For illustrative purposes, we have adopted the numerical values m1/M = 10⁻³ and m2/M = 4 × 10⁻², and hence GM1 = (m2/M)GM = 0.04GM and GM2 = (m1/M)GM = 0.001GM. What would be the shape of the orbits and the periods of the two planets if they did not mutually interact? What is the qualitative effect of their mutual interaction? Describe the shape of the two orbits. Why is one planet affected more by their mutual interaction than the other? Are the angular momentum and the total energy of planet one conserved? Are the total energy and total angular momentum of the two planets conserved? A related but more time consuming problem is given in Project 5.18.

Problem 5.11. Double stars
Another interesting dynamical system consists of one planet orbiting about two fixed stars of equal mass. In this case there are no closed orbits, but the orbits can be classified as either stable or unstable. Stable orbits may be open loops that encircle both stars, figure eights, or orbits that encircle only one star. Unstable orbits will eventually collide with one of the stars. Modify Planet2 to simulate the double-star system, with the first star located at (−1, 0) and the second star of equal mass located at (1, 0). Place the planet at (0.1, 1) and systematically vary the x and y components of the velocity to obtain different types of orbits. Then try other initial positions.
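For Problem 5.11 the sun's single attraction is replaced by the attraction of the two fixed stars. A minimal sketch of the modified rate equation follows; the state layout {x, vx, y, vy, t} and the value GM = 4π² for each star are our assumptions, consistent with the units used in this section.

public void getRate(double[] state, double[] rate) {
  // state[]: x, vx, y, vy, t for the orbiting planet
  double GM = 4*Math.PI*Math.PI;
  double dx1 = state[0] + 1, dy1 = state[2]; // displacement from the star at (-1, 0)
  double dx2 = state[0] - 1, dy2 = state[2]; // displacement from the star at (+1, 0)
  double r1Cubed = Math.pow(dx1*dx1 + dy1*dy1, 1.5);
  double r2Cubed = Math.pow(dx2*dx2 + dy2*dy2, 1.5);
  rate[0] = state[1];
  rate[2] = state[3];
  rate[1] = -GM*dx1/r1Cubed - GM*dx2/r2Cubed; // both stars attract the planet
  rate[3] = -GM*dy1/r1Cubed - GM*dy2/r2Cubed;
  rate[4] = 1; // time
}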
5.10 Two-Body Scattering

Much of our understanding of the structure of matter comes from scattering experiments. In this section we explore one of the more difficult concepts in the theory of scattering, the differential cross section. A typical scattering experiment involves a beam of many incident particles, all with the same kinetic energy. The coordinate system is shown in Figure 5.7. The incident particles come from the left with an initial velocity v in the +x direction. We take the center of the beam and the center of the target to be on the x-axis. The impact parameter b is the perpendicular distance from the initial trajectory to a parallel line through the center of the target (see Figure 5.7). We assume that the width of the beam is larger than the size of the target. The target contains many scattering centers, but for calculational purposes we may consider scattering off only one particle if the target is sufficiently thin. When an incident particle comes close to the target, it is deflected. In a typical experiment, the scattered particles are counted in a detector that is far from the target. The final velocity of the scattered particles is v′, and the angle between v and v′ is the scattering angle θ.

Figure 5.7: The coordinate system used to define the differential scattering cross section. Particles passing through the beam area 2πb db are scattered into the solid angle dΩ.

Let us assume that the scattering is elastic and that the target is much more massive than the beam particles, so that the target can be considered fixed. (The latter condition can be relaxed by using center of mass coordinates.) We also assume that no incident particle is scattered more than once. These considerations imply that the initial speed and final speed of the incident particles are equal. The functional dependence of θ on b depends on the force on the beam particles due to the target.

In a typical experiment, the number of particles in an angular region between θ and θ + dθ is detected for many values of θ. These detectors measure the number of particles scattered into the solid angle dΩ = sinθ dθ dφ centered about θ. The differential cross section σ(θ) is defined by the relation

\[ \frac{dN}{N} = n\,\sigma(\theta)\,d\Omega , \tag{5.27} \]

where dN is the number of particles scattered into the solid angle dΩ centered about θ and the azimuthal angle φ, N is the total number of particles in the beam, and n is the target density, defined as the number of targets per unit area. The interpretation of (5.27) is that the fraction of particles scattered into the solid angle dΩ is proportional to dΩ and to the density of the target. From (5.27) we see that σ(θ) can be interpreted as the effective area of a target particle for the scattering of an incident particle into the element of solid angle dΩ. Particles that are not scattered are ignored. Another way of thinking about σ(θ) is that it is the ratio of the area b db dφ to the solid angle dΩ = sinθ dθ dφ, where b db dφ is the infinitesimal cross-sectional area of the beam that scatters into the solid angle defined by θ to θ + dθ and φ to φ + dφ. The alternative notation for the differential cross section, dσ/dΩ, comes from this interpretation.

To do an analytic calculation of σ(θ), we write

\[ \sigma(\theta) = \frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| . \tag{5.28} \]

We see from (5.28) that the analytic calculation of σ(θ) requires b as a function of θ, or more precisely, how b changes to give scattering through an infinitesimally larger angle θ + dθ. As a check on (5.28), consider elastic scattering from a hard sphere of radius R, for which geometry gives b = R cos(θ/2); then (5.28) yields σ(θ) = R²/4, independent of θ.

In a scattering experiment, particles enter from the left (see Figure 5.7) with random values of the impact parameter b and azimuthal angle φ, and the number of particles scattered into the various detectors is measured. In our simulation, we know the value of b, and we can integrate CHAPTER 5.
FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 128 Newton’s equations of motion to find the angle at which the incident particle is scattered. Hence, in contrast to the analytic calculation, a simulation naturally yields θ as a function of b. Because the differential cross section is usually independent of φ, we need to consider beam particles only at φ = 0. We have to take into account the fact that in a real beam, there are more particles at some values of b than at others. That is, the number of particles in a real beam is proportional to 2πb∆b, the area of the ring between b and b+∆b, where we have integrated over the values of φ to obtain the factor of 2π. Here ∆b is the interval between the values of b used in the program. Because there is only one target in the beam, the target density is n = 1/(πR2). The scattering program requires the Scatter, ScatterAnalysis, and ScatterApp classes. The ScatterApp class in Listing 5.6 organizes the startup process and creates the visualizations. As usual, it extends AbstractSimulation by overriding the doStep method. However, in this case a single step is not a time step. A step calculates a trajectory and scattering angle for the given impact parameter. After a trajectory is calculated, the impact parameter is incremented and the panel is repainted. If necessary, you can eliminate this visualization to increase the computational speed. If the new impact parameter exceeds the beam radius bmax, the animation is stopped and the accumulated data is analyzed. Note that the calculateTrajectory method returns true if the calculation succeeded and that an error message is printed if the calculation fails. Including a failsafe mechanism to stop a computation is good programming practice. Listing 5.6: A program that calculates the scattering trajectories and computes the differential cross section. public class ScatterApp extends AbstractSimulation { PlotFrame frame = new PlotFrame ( "x" , "y" , "Trajectories" ) ; ScatterAnalysis analysis = new ScatterAnalysis ( ) ; Scatter t r a j e c t o r y = new Scatter ( ) ; double vx ; / / speed of the i n c i d e n t p a r t i c l e double b , db ; / / impact parameter and increment double bmax ; / / maximum impact parameter / Constructs ScatterApp . / public ScatterApp ( ) { frame . setPreferredMinMax ( −5 , 5 , −5, 5 ) ; frame . setSquareAspect ( true ) ; } public void doStep ( ) { i f ( t r a j e c t o r y . calculateTrajectory ( frame , b , vx ) ) { analysis . d e t e c t P a r t i c l e (b , t r a j e c t o r y . getAngle ( ) ) ; } else { control . println ( "Trajectory did not converge at b = "+b ) ; } frame . setMessage ( "b = "+decimalFormat . format (b ) ) ; b += db ; / / i n c r e a s e s the impact parameter frame . repaint ( ) ; i f (b>bmax) { control . calculationDone ( "Maximum impact parameter reached" ) ; analysis . plotCrossSection (b ) ; } } CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 129 public void i n i t i a l i z e ( ) { vx = control . getDouble ( "vx" ) ; bmax = control . getDouble ( "bmax" ) ; db = control . getDouble ( "db" ) ; b = db/2; / / s t a r t s b at average value of f i r s t i n t e r v a l 0−>db / / b w i l l increment to 3 db /2 , 5 db /2 , 7 db /2 , . . . frame . setMessage ( "b = 0" ) ; frame . clearDrawables ( ) ; / / removes old t r a j e c t o r i e s analysis . clear ( ) ; } public void reset ( ) { control . setValue ( "vx" , 3 ) ; control . setValue ( "bmax" , 0 . 2 5 ) ; control . setValue ( "db" , 0 . 
0 1 ) ; i n i t i a l i z e ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl . createApp (new ScatterApp ( ) ) ; } } The Scatter class shown in Listing 5.7 calculates the trajectories by expressing the equation of motion as a rate equation. The most important method is calculateTrajectory, which calculates a trajectory by stepping the differential equation solver and adding the resulting data to a trail to display the path. Because the beam source is far away, we stop the calculation when the distance of the scattered particle from the target exceeds the initial distance. Note the use of the ternary ?: operator. This very efficient and compact operator uses three expressions. The first expression evaluates to a boolean. If this expression is true, then the statement after the ? is executed. If this expression is false, then the statement after the : is executed. However, because some potentials may trap particles for long periods of time, we also stop the calculation after a predetermined number of time steps. Listing 5.7: A class that models particle scattering using a central force law. package org . opensourcephysics . sip . ch05 ; import java . awt . ; import org . opensourcephysics . display . ; import org . opensourcephysics . frames . ; import org . opensourcephysics . numerics . ; public class Scatter implements ODE { double [ ] s t a t e = new double [ 5 ] ; RK4 odeSolver = new RK4( this ) ; public Scatter ( ) { odeSolver . setStepSize ( 0 . 0 5 ) ; } boolean calculateTrajectory ( PlotFrame frame , double b , double vx ) { s t a t e [ 0 ] = −5.0; / / x s t a t e [ 1 ] = vx ; / / vx s t a t e [ 2 ] = b ; / / y CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 130 s t a t e [ 3 ] = 0; / / vy s t a t e [ 4 ] = 0; / / time Trail t r a i l = new Trail ( ) ; t r a i l . color = Color . red ; frame . addDrawable ( t r a i l ) ; double r2 = ( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ; double count = 0; while ( ( count <=1000)&&((2 r2 ) >(( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ) ) ) { t r a i l . addPoint ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; odeSolver . step ( ) ; count ++; } return count <1000; } private double force ( double r ) { / / Coulomb f o r c e law return ( r==0) ? 0 : (1/ r / r ) ; / / returns 0 i f r = 0 } public void getRate ( double [ ] state , double [ ] rate ) { double r = Math . sqrt ( ( s t a t e [ 0 ] s t a t e [ 0 ] ) + ( s t a t e [ 2 ] s t a t e [ 2 ] ) ) ; double f = force ( r ) ; rate [ 0] = s t a t e [ 1 ] ; rate [ 1] = ( f s t a t e [ 0 ] ) / r ; rate [ 2] = s t a t e [ 3 ] ; rate [ 3] = ( f s t a t e [ 2 ] ) / r ; rate [ 4] = 1; } public double [ ] getState ( ) { return s t a t e ; } double getAngle ( ) { return Math . atan2 ( s t a t e [ 3 ] , s t a t e [ 1 ] ) ; / / / Math . PI ; xx } } The ScatterAnalysis class performs the data analysis. This class creates an array of bins to sort and accumulate the trajectories according to the scattering angle. The values of the scattering angle between 0◦ and 180◦ are divided into bins of width dtheta. To compute the number of particles coming from a ring of radius b, we accumulate the value of b associated with each bin or “detector” and write bins[index] += b (see the detectParticle method), because the number of particles in a ring of radius b is proportional to b. 
The total number of scattered particles is computed in the same way: totalN += b. You might want to increase the number of bins and the range of angles for better resolution.

Listing 5.8: The ScatterAnalysis class accumulates the scattering data and plots the differential cross section.

public class ScatterAnalysis {
  int numberOfBins = 18;
  PlotFrame frame = new PlotFrame("angle", "sigma", "differential cross section");
  double[] bins = new double[numberOfBins];
  double dtheta = Math.PI/(numberOfBins);
  double totalN = 0; // total number of scattered particles

  void clear() {
    for (int i = 0; i < numberOfBins; i++) {
      bins[i] = 0;
    }
    totalN = 0;
  }
  // the detectParticle and plotCrossSection methods described above are omitted here

Because we do not count the beam particles that are not scattered, we set the beam radius equal to a. For forces that are not identically zero, we need to choose a minimum angle for θ such that particles whose scattering angle is less than this minimum are not counted as scattered (see Problem 5.14).

Problem 5.13. Scattering from a model hydrogen atom

(a) Consider a model of the hydrogen atom for which a positively charged nucleus of charge +e is surrounded by a uniformly distributed negative charge of equal magnitude. The spherically symmetric negative charge distribution is contained within a sphere of radius a. It is straightforward to show that the force between a positron of charge +e and this model hydrogen atom is given by

\[ f(r) = \begin{cases} 1/r^2 - r/a^3 & r \le a \\ 0 & r > a . \end{cases} \tag{5.30} \]

We have chosen units such that e²/(4πε₀) = 1, and the mass of the positron is unity. What is the ionization energy in these units? Modify the Scatter class to incorporate this force. Is the force on the positron from the model hydrogen atom purely repulsive? Choose a = 1 and set the beam radius bmax = 1. Use E = 0.125 and ∆t = 0.01. Compute the trajectories for b = 0.25, 0.5, and 0.75 and describe the qualitative nature of the trajectories.

(b) Determine the cross section for E = 0.125. Choose nine bins so that the angular width of a detector is 20°, and let db = 0.1, 0.01, and 0.002. How does the accuracy of your results depend on the number of bins? Determine the differential cross section for different energies and explain its qualitative energy dependence.

(c) What is the value of σT for E = 0.125? Does σT depend on E? The total cross section has units of area, but a point charge does not have an area. To what area does it refer? What would you expect the total cross section to be for scattering from a hard sphere?

(d) Change the sign of the force so that it corresponds to electron scattering. How do the trajectories change? Discuss the change in σ(θ).
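As a sketch of how (5.30) might be incorporated, one can replace the force method of the Scatter class (Listing 5.7); here a = 1 is hard-coded, matching part (a) of Problem 5.13.

// force on a positron from the model hydrogen atom of (5.30), with a = 1
private double force(double r) {
  if (r == 0) return 0;              // avoid division by zero at the origin
  return (r <= 1) ? 1/(r*r) - r : 0; // the neutral atom exerts no force for r > a
}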
Problem 5.14. Rutherford scattering

(a) One of the most famous scattering experiments was performed by Geiger and Marsden, who scattered a beam of alpha particles on a thin gold foil. Based on these experiments, Rutherford deduced that the positive charge of the atom is concentrated in a small region at the center of the atom rather than distributed uniformly over the entire atom. Use a 1/r² force in the class Scatter, compute the trajectories for b = 0.25, 0.5, and 0.75, and describe the trajectories. Choose E = 5 and ∆t = 0.01. The default value of x0, the initial x-coordinate of the beam, is x0 = −5. Is this value reasonable?

(b) For E = 5 determine the cross section with numberOfBins = 18. Choose the beam width bmax = 2. Then vary db (or numberOfBins) and compare the accuracy of your results to the analytic result, for which σ(θ) varies as [sin(θ/2)]⁻⁴. How do your computed results compare with this dependence on θ? If necessary, decrease db. Are your results better or worse at small angles, intermediate angles, or large angles near 180°? Explain.

(c) Because the Coulomb force is long range, there is scattering at all impact parameters. Increase the beam radius and determine if your results for σ(θ) change. What happens to the total cross section as you increase the beam width?

(d) Compute σ(θ) for different values of E and estimate the dependence of σ(θ) on E.

Problem 5.15. Scattering by other potentials

(a) A simple phenomenological form for the effective interaction between electrons in metals is the screened Coulomb (or Thomas–Fermi) potential given by

\[ V(r) = \frac{e^2}{4\pi\epsilon_0 r}\,e^{-r/a} . \tag{5.31} \]

The range of the interaction a depends on the density and temperature of the electrons. The form (5.31) is known as the Yukawa potential in the context of the interaction between nuclear particles and as the Debye potential in the context of classical plasmas. Choose units such that a = 1 and e²/(4πε₀) = 1. Recall that the force is given by f(r) = −dV/dr; in these units, f(r) = (1/r² + 1/r)e^(−r). Incorporate this force law into the class Scatter and compute the dependence of σ(θ) on the energy of the incident particle. Choose the beam width equal to 3. Compare your results for σ(θ) with your results from the Coulomb potential.

(b) Modify the force law in Scatter so that f(r) = 24(2/r¹³ − 1/r⁷). This form of f(r) is used to describe the interactions between simple molecules (see Chapter 8). Describe some typical trajectories and compute the differential cross section for several different energies. Let bmax = 2. What is the total cross section? How do your results change if you vary bmax? Choose a small angle as the minimum scattering angle. How sensitive is the total cross section to this minimum angle? Does the differential cross section vary for any other angles besides the smallest scattering angle?

5.11 Three-body problems

Poincaré showed that it is impossible to obtain an analytic solution for the unrestricted motion of three or more objects interacting under the influence of gravity. However, solutions are known for a few special cases, and it is instructive to study their properties. The ThreeBody class computes the trajectories of three particles of equal mass moving in a plane and interacting under the influence of gravity. Both the physics and the drawing are implemented in the ThreeBody class shown in Listing 5.9. Note that the getRate and computeForce methods compute trajectories for an arbitrary number of masses, and that computeForce uses the arraycopy method to quickly zero the force array. To simplify the drawing of the particle trajectories, the ThreeBody class uses an inner class that extends Circle and contains a Trail.

Listing 5.9: A class that models the dynamics of the three-body problem.

package org.opensourcephysics.sip.ch05;
import java.awt.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*
; public class ThreeBody implements Drawable , ODE { int n = 3; / / number of i n t e r a c t i n g bodies / / s t a t e= { x1 , vx1 , y1 , vy1 , x2 , vx2 , y2 , vy2 , x3 , vx3 , y3 , vy3 , t } double [ ] s t a t e = new double [4 n+1]; double [ ] force = new double [2 n] double [ ] zeros = new double [2 n ] ; ODESolver odeSolver = new RK45MultiStep ( this ) ; Mass mass1 = new Mass ( ) , mass2 = new Mass ( ) , mass3 = new Mass ( ) ; CHAPTER 5. FEW-BODY PROBLEMS: THE MOTION OF THE PLANETS 134 public void draw ( DrawingPanel panel , Graphics g ) { mass1 . draw ( panel , g ) ; mass2 . draw ( panel , g ) ; mass3 . draw ( panel , g ) ; } public void doStep ( ) { odeSolver . step ( ) ; mass1 . setXY ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; mass2 . setXY ( s t a t e [ 4 ] , s t a t e [ 6 ] ) ; mass3 . setXY ( s t a t e [ 8 ] , s t a t e [ 1 0 ] ) ; } void i n i t i a l i z e ( double [ ] i n i t S t a t e ) { / / c o p i e s i n i t S t a t e to s t a t e System . arraycopy ( i ni t St a te , 0 , state , 0 , 13); mass1 . clear ( ) ; mass2 . clear ( ) ; mass3 . clear ( ) ; mass1 . setXY ( s t a t e [ 0 ] , s t a t e [ 2 ] ) ; mass2 . setXY ( s t a t e [ 4 ] , s t a t e [ 6 ] ) ; mass3 . setXY ( s t a t e [ 8 ] , s t a t e [ 1 0 ] ) ; } void computeForce ( double [ ] s t a t e ) { / / s e t s f o r c e array elements to 0 System . arraycopy ( zeros , 0 , force , 0 , force . length ) ; for ( int i = 0; i 0.25. Show that for the suggested values of r, the iterated values of x do not change after an initial transient; that is, the long time dynamical behavior is period 1. In Appendix 6A we show that for r < 3/4 and for x0 in the interval 0 < x0 < 1, the trajectories approach the stable attractor at x = 1 − 1/4r. The set of initial points that iterate to the attractor is called the basin of the attractor. For the logistic map, the interval 0 < x < 1 is the basin of attraction of the attractor x = 1 − 1/4r. (c) Explore the dynamical properties of (6.5) for r = 0.752, 0.76, 0.8, and 0.862. For r = 0.752 and 0.862, approximately 1000 iterations are necessary to obtain convergent results. Show that if r is greater than 0.75, x oscillates between two values after an initial transient behavior. That is, instead of a stable cycle of period 1 corresponding to one fixed point, the system has a stable cycle of period 2. The value of r at which the single fixed point x∗ splits or bifurcates into two values x1 ∗ and x2 ∗ is r = b1 = 3/4. The pair of x values, x1 ∗ and x2 ∗, form a stable attractor of period 2. (d) What are the stable attractors of (6.5) for r = 0.863 and 0.88? What is the corresponding period? What are the stable attractors and corresponding periods for r = 0.89, 0.891, and 0.8922? Another way to determine the behavior of (6.5) is to plot the values of x as a function of r (see Figure 6.2). The iterated values of x are plotted after the initial transient behavior is discarded. Such a plot is generated by BifurcateApp. For each value of r, the first ntransient values of x are computed but not plotted. Then the next nplot values of x are plotted with the first half with the first half in one color and the second half in another. This process is repeated for a new value of r until the desired range of r values is reached. The magnitude of nplot should be at least as large as the longest period that you wish to observe. BifurcateApp extends AbstractSimulation rather than AbstractCalculation because the calculations can be time consuming. 
For this reason you might want to stop them before they are finished and reset some of the parameters. CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 146 0.0 0.5 1.0 iteratedvaluesofx 0.7 0.8 0.9 1.0 r Figure 6.2: Bifurcation diagram of the logistic map. For each value of r, the iterated values of xn are plotted after the first 1000 iterations are discarded. Note the transition from periodic to chaotic behavior and the narrow windows of periodic behavior within the region of chaos. Listing 6.2: The BifurcateApp program generates a bifurcation plot of the logistic map package org . opensourcephysics . sip . ch06 ; import org . opensourcephysics . controls . ; import org . opensourcephysics . frames . ; public class BifurcateApp extends AbstractSimulation { double r ; / / c o n t r o l parameter double dr ; / / incremental change of r , suggest dr <= 0.01 int ntransient ; / / number of i t e r a t i o n s not p l o t t e d int nplot ; / / number of i t e r a t i o n s p l o t t e d PlotFrame plotFrame = new PlotFrame ( "r" , "x" , "Bifurcation diagram" ) ; public BifurcateApp ( ) { / / small s i z e g i v e s b e t t e r r e s o l u t i o n plotFrame . setMarkerSize (0 , 0 ) ; plotFrame . setMarkerSize (1 , 0 ) ; } public void i n i t i a l i z e ( ) { plotFrame . clearData ( ) ; r = control . getDouble ( "initial r" ) ; dr = control . getDouble ( "dr" ) ; ntransient = control . getInt ( "ntransient" ) ; CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 147 nplot = control . getInt ( "nplot" ) ; } public void doStep ( ) { i f ( r <1.0) { double x = 0 . 5 ; for ( int i = 0; i r∞. Problem 6.3. Chaotic behavior (a) For r > r∞, two initial conditions that are very close to one another can yield very different trajectories after a few iterations. As an example, choose r = 0.91 and consider x0 = 0.5 and 0.5001. How many iterations are necessary for the iterated values of x to differ by more than ten percent? What happens for r = 0.88 for the same choice of seeds? (b) The accuracy of floating point numbers retained on a digital computer is finite. To test the effect of the finite accuracy of your computer, choose r = 0.91 and x0 = 0.5 and compute the trajectory for 200 iterations. Then modify your program so that after each iteration, the operation x = x/10 is followed by x = 10*x. This combination of operations truncates the last digit that your computer retains. Compute the trajectory again and compare your results. Do you find the same discrepancy for r < r∞? (c) What are the dynamical properties for r = 0.958? Can you find other windows of periodic behavior in the interval r∞ < r < 1? 6.3 Period Doubling The results of the numerical experiments that we did in Section 6.2 probably have convinced you that the dynamical properties of a simple, nonlinear deterministic system can be quite complicated. To gain more insight into how the dynamical behavior depends on r, we introduce a simple graphical method for iterating (6.5). In Figure 6.3 we show a graph of f (x) versus x for r = 0.7. A diagonal line corresponding to y = x intersects the curve y = f (x) at the two fixed points x∗ = 0 and x∗ = 9/14 ≈ 0.642857 [see (6.6b)]. If x0 is not a fixed point, we can find the trajectory in the following way. Draw a vertical line from (x = x0,y = 0) to the intersection with the curve y = f (x) at (x0,y0 = f (x0)). Next draw a horizontal line from (x0,y0) to the intersection with the diagonal line at (y0,y0). 
On this diagonal line, y = x, and hence the value of x at this intersection is the first iteration x1 = y0. The second iteration x2 can be found in the same way. From the point (x1, y0), draw a vertical line to the intersection with the curve y = f(x). Keep y fixed at y = y1 = f(x1), and draw a horizontal line until it intersects the diagonal line; the value of x at this intersection is x2. Further iterations can be found by repeating this process.

This graphical method is illustrated in Figure 6.3 for r = 0.7 and x0 = 0.9. If we begin with any x0 (except x0 = 0 and x0 = 1), the iterations converge to the fixed point x∗ ≈ 0.643. It would be a good idea to repeat the procedure shown in Figure 6.3 by hand. For r = 0.7, this fixed point is stable (an attractor of period 1). In contrast, no matter how close x0 is to the fixed point at x = 0, the iterates diverge away from it, and this fixed point is unstable.

How can we explain the qualitative difference between the fixed point at x = 0 and the one at x∗ = 0.642857 for r = 0.7? The local slope of the curve y = f(x) determines the distance moved horizontally each time f is iterated. A slope whose magnitude is greater than unity (a curve steeper than 45°) moves the iterate farther from its initial value. Hence, the criterion for the stability of a fixed point is that the magnitude of the slope of f(x) at the fixed point be less than unity. That is, if |df(x)/dx| at x = x∗ is less than unity, then x∗ is stable; conversely, if |df(x)/dx| at x = x∗ is greater than unity, then x∗ is unstable.

Figure 6.3: Graphical representation of the iteration of the logistic map (6.5) with r = 0.7 and x0 = 0.9. Note that the graphical solution converges to the fixed point x∗ ≈ 0.643.

An inspection of f(x) in Figure 6.3 shows that x = 0 is unstable because the slope of f(x) at x = 0 is greater than unity. In contrast, the magnitude of the slope of f(x) at x = x∗ ≈ 0.643 is less than unity, and this fixed point is stable. In Appendix 6A we show that

x∗ = 0 is stable for 0 < r < 1/4, (6.6a)

and

x∗ = 1 − 1/(4r) is stable for 1/4 < r < 3/4. (6.6b)

These results are easy to check directly: for the logistic map, df/dx = 4r(1 − 2x), so the slope at x∗ = 1 − 1/(4r) is 2 − 4r, whose magnitude is less than unity precisely for 1/4 < r < 3/4. Thus for 0 < r < 3/4, the behavior after many iterations is known.

What happens if r is greater than 3/4? We found in Section 6.2 that if r is slightly greater than 3/4, the fixed point of f becomes unstable and bifurcates into a cycle of period 2. Now x returns to the same value after every second iteration, and the fixed points of f(f(x)) are the stable attractors of f(x). In the following, we write f^(2)(x) = f(f(x)) and f^(n)(x) for the nth iterate of f(x). (Do not confuse f^(n)(x) with the nth derivative of f(x).) For example, the second iterate f^(2)(x) is given by the fourth-order polynomial

\[ f^{(2)}(x) = 4r\bigl[4rx(1-x)\bigr]\bigl[1 - 4rx(1-x)\bigr] = 16r^{2}x\bigl[-4rx^{3} + 8rx^{2} - (1+4r)x + 1\bigr]. \tag{6.7} \]

What happens if we increase r still further? Eventually the magnitude of the slope at the fixed points of f^(2)(x) exceeds unity, and the fixed points of f^(2)(x) become unstable. Now

Figure 6.4: Example of the calculation of f(0.4,0.8,3) using the recursive function defined in GraphicalSolutionApp. The number in each box is the value of the variable iterate. The computer executes code from left to right, and each box represents a copy of the function in the computer's memory.
The input values x = 0.4 and r = 0.8, which are the same in each copy, are not shown. The arrows indicate when a copy is finished and its value is returned to one of the other copies. Notice that the first copy of the function f (3) is the last one to finish. The value of f(x,r,3) = 0.7842. the cycle of f is period 4, and the fixed points of the fourth iterate f (4)(x) = f (2) f (2)(x) = f f f (f (x) are stable. These fixed points also eventually become unstable, and we are led to the phenomena of period doubling that we observed in Problem 6.2. GraphicalSolutionApp implements the graphical analysis of the iterations of f (x). The nth-order iterates are defined in f(x,r,iterate), a recursive method. (The parameter iterate is 1, 2, and 4 for the functions f (x), f (2)(x), and f (4)(x), respectively.) Recursion is an idea that is simple once you understand it, but it can be difficult to grasp initially. Although the method calls itself, the rules for method calls remain the same. Imagine that a recursive method is called. The computer then starts to execute the code in the method, but comes to another call of the same method as itself. At this point the computer stops executing the code of the original method, and makes an exact copy of the method with possibly different input parameters, and starts executing the code in the copy. There are now two possibilities. One is that the computer comes to the end of the copy without another recursive call. In that case the computer deletes the copy of the method and continues executing the code in the original method. The other possibility is that a recursive call is made in the copy, and a third copy is made of the method, and the code in the third copy is now executed. This process continues until the code in all the copies is executed. Every recursive method must have a possibility of reaching the end of the method; otherwise, the program will eventually crash. To understand the method f(x,r,iterate), suppose we want to compute f(0.4,0.8,3). First we write f(0.4,0.8,3) as in Figure 6.4a. Follow the statements within the method until another call to f(0.4,0.8,iterate) occurs. In this case, the call is to f(0.4,0.8,iterate-1) which equals f(0.4,0.8,2). Write f(0.4,0.8,2) above f(0.4,0.8,3) (see Figure 6.4b). When you come to the end of the definition of the method, write down the value of f that is actually returned, and remove the method from the stack by crossing it out (see Figure 6.4d). This returned value for f equals y if iterate > 1, or it is the output of the method for iterate = 1. Continue deleting copies of f as they are finished, until there are no copies left on the paper. The final value of f is the value returned by the computer. Write a short program that defines CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 151 f(x,r,iterate) and prints the value of f(0.4,0.8,3). Is the answer the same as your hand calculation? Listing 6.3: GraphicalSolutionApp displays the graphical solution of the logistic map trajec- tory package org . opensourcephysics . sip . ch06 ; import org . opensourcephysics . controls . ; import org . opensourcephysics . frames . PlotFrame ; public class GraphicalSolutionApp extends AbstractSimulation { PlotFrame plotFrame = new PlotFrame ( "iterations" , "x" , "graphical solution" ) ; double r ; / / c o n t r o l parameter int i t e r a t e ; / / i t e r a t e of f ( x ) double x , y ; double x0 , y0 ; public GraphicalSolutionApp ( ) { plotFrame . setPreferredMinMax (0 , 1 , 0 , 1 ) ; plotFrame . 
setConnected ( true ) ; plotFrame . setXPointsLinked ( true ) ; / / second argument i n d i c a t e s no marker plotFrame . setMarkerShape (2 , 0 ) ; } public void reset ( ) { control . setValue ( "r" , 0 . 8 9 ) ; control . setValue ( "x" , 0 . 2 ) ; plotFrame . setMarkerShape (0 , 0 ) ; control . setAdjustableValue ( "iterate" , 1 ) ; } public void i n i t i a l i z e ( ) { r = control . getDouble ( "r" ) ; x = control . getDouble ( "x" ) ; i t e r a t e = control . getInt ( "iterate" ) ; x0 = x ; y0 = 0; clear ( ) ; } public void startRunning ( ) { i f ( i t e r a t e != control . getInt ( "iterate" ) ) { i t e r a t e = control . getInt ( "iterate" ) ; clear ( ) ; } r = control . getDouble ( "r" ) ; } public void doStep ( ) { y = f ( x , r , i t e r a t e ) ; plotFrame . append (1 , x0 , y0 ) ; plotFrame . append (1 , x0 , y ) ; CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 152 plotFrame . append (1 , y , y ) ; x = x0 = y0 = y ; control . setValue ( "x" , x ) ; } void drawFunction ( ) { int nplot = 200; / / # of points at which function computed double delta = 1.0/ nplot ; double x = 0; double y = 0; for ( int i = 0; i <=nplot ; i ++) { y = f ( x , r , i t e r a t e ) ; plotFrame . append (0 , x , y ) ; x += delta ; } } void drawLine ( ) { / / draws l i n e y = x for ( double x = 0; x<1;x += 0.001) { plotFrame . append (2 , x , x ) ; } } public double f ( double x , double r , int i t e r a t e ) { i f ( iterate >1) { double y = f ( x , r , iterate −1); return 4 r y (1 −y ) ; } else { return 4 r x (1 −x ) ; } } public void clear ( ) { plotFrame . clearData ( ) ; drawFunction ( ) ; drawLine ( ) ; plotFrame . repaint ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl control = SimulationControl . createApp ( new GraphicalSolutionApp ( ) ) ; control . addButton ( "clear" , "Clear" , "Clears the trajectory." ) ; } } Problem 6.4. Qualitative properties of the fixed points (a) Use GraphicalSolutionApp to show graphically that there is a single stable fixed point of f (x) for r < 3/4. It would be instructive to modify the program so that the value of the slope df /dx|x=xn is shown as you step each iteration. At what value of r does the absolute value of this slope exceed unity? Let b1 denote the value of r at which the fixed point of f (x) bifurcates and becomes unstable. Verify that b1 = 0.75. CHAPTER 6. THE CHAOTIC MOTION OF DYNAMICAL SYSTEMS 153 (b) Describe the trajectory of f (x) for r = 0.785. Is the fixed point given by x = 1−1/4r stable or unstable? What is the nature of the trajectory if x0 = 1−1/4r? What is the period of f (x) for all other choices of x0? What are the values of the two-point attractor? (c) The function f (x) is symmetrical about x = 1/2 where f (x) is a maximum. What are the qualitative features of the second iterate f (2)(x) for r = 0.785? Is f (2)(x) symmetrical about x = 1/2? For what value of x does f (2)(x) have a minimum? Iterate xn+1 = f (2)(xn) for r = 0.785 and find its two fixed points x1 ∗ and x2 ∗. (Try x0 = 0.1 and x0 = 0.3.) Are the fixed points of f (2)(x) stable or unstable for this value of r? How do these values of x1 ∗ and x2 ∗ compare with the values of the two-point attractor of f (x)? Verify that the slopes of f (2)(x) at x1 ∗ and x2 ∗ are equal. (d) Verify the following properties of the fixed points of f (2)(x). As r is increased, the fixed points of f (2)(x) move apart, and the slope of f (2)(x) at its fixed points decreases. What is the value of r = s2 at which one of the two fixed points of f (2) equals 1/2? 
What is the value of the other fixed point? What is the slope of f^(2)(x) at x = 1/2? What is the slope at the other fixed point? As r is increased further, the slopes at the fixed points become negative. Finally, at r = b2 ≈ 0.8623, the slopes at the two fixed points of f^(2)(x) equal −1, and the two fixed points of f^(2) become unstable. (The exact value is b2 = (1 + √6)/4.)

(e) Show that for r slightly greater than b2, for example, r = 0.87, there are four stable fixed points of f^(4)(x). What is the value of r = s3 when one of the fixed points equals 1/2? What are the values of the three other fixed points at r = s3?

(f) Determine the value of r = b3 at which the four fixed points of f^(4) become unstable.

(g) Choose r = s3 and determine the number of iterations that are necessary for the trajectory to converge to period 4 behavior. How does this number of iterations change when neighboring values of r are considered? Choose several values of x0 so that your results do not depend on the initial conditions.

Problem 6.5. Periodic windows in the chaotic regime

(a) If you look closely at the bifurcation diagram in Figure 6.2, you will see that the range of chaotic behavior for r > r∞ is interrupted by intervals of periodic behavior. Magnify your bifurcation diagram so that you can look at the interval 0.957107 ≤ r ≤ 0.960375, where a periodic trajectory of period 3 occurs. (Period 3 behavior starts at r = (1 + √8)/4.) What happens to the trajectory for slightly larger r, for example, r = 0.9604?

(b) Plot f^(3)(x) versus x at r = 0.96, a value of r in the period 3 window. Draw the line y = x and determine the intersections with f^(3)(x). The stable fixed points satisfy the condition x∗ = f^(3)(x∗). Because f^(3)(x) is an eighth-order polynomial, there are eight solutions (including x = 0). Find the intersections of f^(3)(x) with y = x and identify the three stable fixed points. What are the slopes of f^(3)(x) at these points? Then decrease r to r = 0.957107, the (approximate) value of r below which the system is chaotic. Draw the line y = x and determine the number of intersections with f^(3)(x). Note that at this value of r, the curve y = f^(3)(x) is tangent to the diagonal line at the three stable fixed points. For this reason, this type of transition is called a tangent bifurcation. Note that there is also an unstable fixed point at x ≈ 0.76.

(c) Plot xn+1 = f^(3)(xn) versus n for r = 0.9571, a value of r just below the onset of period 3 behavior. How would you describe the behavior of the trajectory? This type of chaotic motion is an example of intermittency; that is, nearly periodic behavior interrupted by occasional irregular bursts.

(d) To understand the mechanism for the intermittent behavior, we need to "zoom in" on the values of x near the stable fixed points that you found in part (c). To do so, change the arguments of the setPreferredMinMax method. You will see a narrow channel between the diagonal line y = x and the plot of f^(3)(x) near each fixed point. The trajectory can require many iterations to squeeze through the channel, and we see apparent period 3 behavior during this time. Eventually, the trajectory escapes from the channel and bounces around until it again enters a channel at some unpredictable later time.
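A bare-bones iteration loop is all that time-series explorations like Problem 6.5(c) require. The following minimal sketch (the class name is ours, not one from the text) iterates the logistic map (6.5) and prints the trajectory; with r = 0.9571 the intermittent behavior is visible in the output.

public class LogisticMapApp {
  public static void main(String[] args) {
    double r = 0.9571; // just below the period 3 window
    double x = 0.5;    // seed
    for (int n = 0; n < 500; n++) {
      x = 4*r*x*(1 - x); // the logistic map (6.5)
      System.out.println(n + "\t" + x);
    }
  }
}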
6.4 Universal Properties and Self-Similarity

In Sections 6.2 and 6.3 we found that the trajectory of the logistic map has remarkable properties as a function of the control parameter r. In particular, we found a sequence of period doublings accumulating in a chaotic trajectory of infinite period at r = r∞. For most values of r > r∞, the trajectory is very sensitive to the initial conditions. We also found "windows" of period 3, 6, 12, ... embedded in the range of chaotic behavior. How typical is this type of behavior? In the following, we will find further numerical evidence that the general behavior of the logistic map is independent of the details of the form (6.5) of f(x).

You might have noticed that the range of r between successive bifurcations becomes smaller as the period increases (see Table 6.1). For example, b2 − b1 = 0.112398, b3 − b2 = 0.023624, and b4 − b3 = 0.00508. A good guess is that the decrease in bk − bk−1 is geometric; that is, the ratio (bk − bk−1)/(bk+1 − bk) is a constant. You can check that this ratio is not exactly constant, but converges to a constant with increasing k.

k    b_k
1    0.750000
2    0.862372
3    0.886023
4    0.891102
5    0.892190
6    0.892423
7    0.892473
8    0.892484

Table 6.1: Values of the control parameter r = bk for the onset of the kth bifurcation. Six decimal places are shown.

This behavior suggests that the sequence of values of bk has a limit and follows a geometric progression:

\[ b_k \approx r_\infty - C\delta^{-k} , \tag{6.8} \]

where δ is known as the Feigenbaum number and C is a constant. From (6.8) it is easy to show that δ is given by the ratio

\[ \delta = \lim_{k\to\infty} \frac{b_k - b_{k-1}}{b_{k+1} - b_k} . \tag{6.9} \]

Figure 6.5: The first few bifurcations of the logistic equation showing the scaling of the maximum distance Mk between the asymptotic values of x describing the bifurcation.
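Before plotting δk in Problem 6.6 below, it may help to see how little code (6.9) requires. The following sketch (the class name is ours) tabulates the successive ratios directly from the entries of Table 6.1; the ratios approach approximately 4.67 until the six-decimal rounding of bk limits the accuracy at large k.

public class FeigenbaumDeltaApp {
  public static void main(String[] args) {
    // bifurcation values b_1 ... b_8 from Table 6.1
    double[] b = {0.750000, 0.862372, 0.886023, 0.891102,
                  0.892190, 0.892423, 0.892473, 0.892484};
    for (int k = 1; k < b.length - 1; k++) {
      // the ratio in (6.9), using 0-based array indices
      double delta = (b[k] - b[k-1])/(b[k+1] - b[k]);
      System.out.println("k = " + (k + 1) + "  delta = " + delta);
    }
  }
}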
We can associate another number with the series of “pitchfork” bifurcations. From Figures 6.3 and 6.5, we see that each pitchfork bifurcation gives birth to “twins” with the new generation more densely packed than the previous generation. One measure of this density is the maximum distance M_k between the values of x describing the bifurcation (see Figure 6.5). The disadvantage of using M_k is that the transient behavior of the trajectory is very long at the boundary between two different periodic behaviors. A more convenient measure of the distance is the quantity d_k = x_k* − 1/2, where x_k* is the value of the fixed point nearest to the fixed point x* = 1/2. The first two values of d_k are shown in Figure 6.6, with d_1 ≈ 0.3090 and d_2 ≈ −0.1164. The next value is d_3 ≈ 0.0460. Note that the fixed point nearest to x = 1/2 alternates from one side of x = 1/2 to the other. We define the quantity α by the ratio

α = lim_{k→∞} (−d_k/d_{k+1}).    (6.11)

Figure 6.6: The quantity d_k is the distance from x* = 1/2 to the nearest element of the attractor of period 2^k. It is convenient to use this quantity to determine the exponent α. (The axes are r and y, with the distances d_1 and d_2 marked.)

The ratios α = 0.3090/0.1164 = 2.65 for k = 1 and α = 0.1164/0.0460 = 2.53 for k = 2 are consistent with the asymptotic value α = 2.5029078750958928485... .

We now give qualitative arguments that suggest that the general behavior of the logistic map in the period doubling regime is independent of the detailed form of f(x). As we have seen, period doubling is characterized by self-similarities; for example, the period doublings look similar except for a change of scale. We can demonstrate these similarities by comparing f(x) for r = s_1 = 0.5 for the superstable trajectory with period 1 to the function f^(2)(x) for r = s_2 ≈ 0.809017 for the superstable trajectory of period 2 (see Figure 6.7). The function f(x, r = s_1) has unstable fixed points at x = 0 and x = 1 and a stable fixed point at x = 1/2. Similarly, the function f^(2)(x, r = s_2) has a stable fixed point at x = 1/2 and an unstable fixed point at x ≈ 0.69098. Note the similar shape but different scale of the curves in the square boxes in part (a) and part (b) of Figure 6.7. This similarity is an example of scaling. That is, if we scale f^(2) and change (renormalize) the value of r, we can compare f^(2) to f. (See Chapter 12 for a discussion of scaling and renormalization in another context.)

Figure 6.7: Comparison of f(x, r) for r = s_1 with the second iterate f^(2)(x) for r = s_2. (a) The function f(x, r = s_1) has unstable fixed points at x = 0 and x = 1 and a stable fixed point at x = 1/2. (b) The function f^(2)(x, r = s_2) has a stable fixed point at x = 1/2. The unstable fixed point of f^(2)(x) nearest to x = 1/2 occurs at x ≈ 0.69098, where the curve f^(2)(x) intersects the line y = x. The upper right-hand corner of the square box in (b) is located at this point, and the center of the box is at (1/2, 1/2). Note that if we reflect this square about the point (1/2, 1/2), the shape of the reflected graph in the square box is nearly the same as it is in part (a) but on a smaller scale.

This graphical comparison is meant only to be suggestive. A precise approach shows that if we continue the comparison of the higher-order iterates, for example, f^(4)(x) to f^(2)(x), etc., the superposition of functions converges to a universal function that is independent of the form of the original function f(x).
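The quantities d_k can be computed directly, because the superstable cycle at r = s_{k+1} contains x = 1/2. A minimal plain Java sketch (using the values of s_2 and s_3 quoted above) iterates from x = 1/2, finds the cycle element nearest to 1/2, and forms the ratio in (6.11):

public class AlphaSketch {
  public static void main(String[] args) {
    double[] s = {0.809017, 0.874640}; // s_2 and s_3 from the text
    double[] d = new double[2];
    for(int k = 0; k<2; k++) {
      double r = s[k];
      int period = 2<<k; // the superstable cycle at s_{k+1} has period 2^(k+1): 2, then 4
      double x = 0.5, nearest = 2; // 2 is farther from 1/2 than any iterate
      for(int i = 1; i<period; i++) { // the cycle returns to 1/2 at step 'period'
        x = 4*r*x*(1-x);
        if(Math.abs(x-0.5)<Math.abs(nearest-0.5)) {
          nearest = x;
        }
      }
      d[k] = nearest-0.5; // the signed distance d_k
      System.out.println("d_"+(k+1)+" = "+d[k]);
    }
    System.out.println("alpha estimate = "+(-d[0]/d[1])); // eq. (6.11) with k = 1
  }
}

The output, d_1 ≈ 0.309 and d_2 ≈ −0.116, reproduces the values quoted above and gives α ≈ 2.65.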
Problem 6.7. Further determinations of the exponents α and δ

(a) Determine the appropriate scaling factor and superimpose f and the rescaled form of f^(2) shown in Figure 6.7.

(b) Use arguments similar to those discussed in the text and in Figure 6.7 and compare the behavior of f^(4)(x, r = s_3) in the square about x = 1/2 with f^(2)(x, r = s_2) in its square about x = 1/2. The sizes of the squares are determined by the unstable fixed point nearest to x = 1/2. Find the appropriate scaling factor and superimpose f^(2) and the rescaled form of f^(4).

∗Problem 6.8. Other one-dimensional maps

It is easy to modify your programs to consider other one-dimensional maps (a minimal iteration sketch follows this problem). Determine the qualitative properties of the one-dimensional maps

f(x) = x e^{r(1−x)}    (6.12)
f(x) = r sin πx.    (6.13)

Do they also exhibit the period doubling route to chaos? The map in (6.12) has been used by ecologists (cf. May) to study a population that is limited at high densities by the effect of epidemics. Although it is more complicated than (6.5), its advantage is that the population remains positive no matter what (positive) value is taken for the initial population. There are no restrictions on the maximum value of r, but if r becomes sufficiently large, x eventually becomes effectively zero. What is the behavior of the time series of (6.12) for r = 1.5, 2, and 2.7? Describe the qualitative behavior of f(x). Does it have a maximum? The sine map (6.13) with 0 < r ≤ 1 and 0 ≤ x ≤ 1 has no special significance, except that it is nonlinear. If time permits, determine the approximate value of δ for both maps. What limits the accuracy of your determination of δ?
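A convenient way to organize Problem 6.8 is to pass the map itself as a parameter. The following minimal plain Java sketch does so with a lambda; the choices r = 2.7 and x_0 = 0.1 are arbitrary illustrative values:

import java.util.function.DoubleUnaryOperator;

public class OtherMapsSketch {
  public static void main(String[] args) {
    double r = 2.7;
    DoubleUnaryOperator ecologyMap = x -> x*Math.exp(r*(1-x)); // eq. (6.12)
    // DoubleUnaryOperator sineMap = x -> r*Math.sin(Math.PI*x); // eq. (6.13), requires 0 < r <= 1
    double x = 0.1;
    for(int n = 0; n<100; n++) {
      x = ecologyMap.applyAsDouble(x);
      System.out.println(n+"\t"+x);
    }
  }
}

Swapping in sineMap (with an appropriate value of r) changes the map without touching the iteration loop, which makes it easy to compare the two period doubling sequences.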
The above qualitative arguments and numerical results suggest that the quantities α and δ are universal; that is, independent of the detailed form of f(x). In contrast, the values of the accumulation point r_∞ and the constant C in (6.8) depend on the detailed form of f(x). Feigenbaum has shown that the period doubling route to chaos and the values of δ and α are universal properties of maps that have a quadratic maximum; that is, f′(x)|_{x=x_m} = 0 and f″(x)|_{x=x_m} < 0.

Why is the universality of period doubling and the numbers δ and α more than a curiosity? The reason is that because this behavior is independent of the details, there might exist realistic systems whose underlying dynamics yield the same behavior as the logistic map. Of course, most physical systems are described by differential rather than difference equations. Can these systems exhibit period doubling behavior? Several workers (cf. Testa et al.) have constructed nonlinear RLC circuits driven by an oscillatory source voltage. The output voltage shows bifurcations, and the measured values of the exponents δ and α are consistent with the predictions of the logistic map.

Of more general interest is the nature of turbulence in fluid systems. Consider a stream of water flowing past several obstacles. We know that at low flow speeds, the water flows past obstacles in a regular and time-independent fashion called laminar flow. As the flow speed is increased (as measured by a dimensionless parameter called the Reynolds number), some swirls develop, but the motion is still time independent. As the flow speed is increased still further, the swirls break away and start moving downstream. The flow pattern as viewed from the bank becomes time-dependent. For still larger flow speeds, the flow pattern becomes very complex and looks random. We say that the flow pattern has made a transition from laminar flow to turbulent flow.

This qualitative description of the transition to chaos in fluid systems is superficially similar to the description of the logistic map. Can fluid systems be analyzed in terms of the simple models of the type we have discussed here? In a few instances, such as turbulent convection in a heated saucepan, period doubling and other types of transitions to turbulence have been observed. The type of theory and analysis we have discussed has suggested new concepts and approaches, and the study of turbulent flow is a subject of much current interest.

6.5 Measuring Chaos

How do we know if a system is chaotic? The most important characteristic of chaos is sensitivity to initial conditions. In Problem 6.3, for example, we found that the trajectories starting from x_0 = 0.5 and x_0 = 0.5001 for r = 0.91 become very different after a small number of iterations. Because computers only store floating point numbers to a certain number of digits, the implication of this result is that our numerical predictions of the trajectories of chaotic systems are restricted to small time intervals. That is, sensitivity to initial conditions implies that even though the logistic map is deterministic, our ability to make numerical predictions of its trajectory is limited.

Figure 6.8: The evolution of the difference ∆x_n between the trajectories of the logistic map at r = 0.91 for x_0 = 0.5 and x_0 = 0.5001. The separation |∆x_n| (plotted logarithmically between 10^{−8} and 10^{−2}) increases with n, the number of iterations, if n is not too large. (Note that |∆x_1| ∼ 10^{−8} and that the trend is not monotonic.)

How can we quantify this lack of predictability? In general, if we start two identical dynamical systems from slightly different initial conditions, we expect that the difference between the trajectories will increase as a function of n. In Figure 6.8 we show a plot of the difference |∆x_n| versus n for the same conditions as in Problem 6.3a. We see that, roughly speaking, ln|∆x_n| is a linearly increasing function of n. This result indicates that the separation between the trajectories grows exponentially if the system is chaotic. This divergence of the trajectories can be described by the Lyapunov exponent λ, which is defined by the relation

|∆x_n| = |∆x_0| e^{λn},    (6.14)

where ∆x_n is the difference between the trajectories at time n. If the Lyapunov exponent λ is positive, then nearby trajectories diverge exponentially. Chaotic behavior is characterized by the exponential divergence of nearby trajectories.

A naive way of measuring the Lyapunov exponent λ is to run the same dynamical system twice with slightly different initial conditions and measure the difference of the trajectories as a function of n. We used this method to generate Figure 6.8. Because the rate of separation of the trajectories might depend on the choice of x_0, a better method would be to compute the rate of separation for many values of x_0. This method would be tedious because we would have to fit the separation to (6.14) for each value of x_0 and then determine an average value of λ. A more important limitation of the naive method is that because the trajectory is restricted to the unit interval, the separation |∆x_n| ceases to increase when n becomes sufficiently large.
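A minimal sketch of this naive measurement for the logistic map (plain Java; r = 0.91 and the two seeds match the conditions of Figure 6.8):

public class NaiveLyapunovSketch {
  public static void main(String[] args) {
    double r = 0.91;
    double x = 0.5, xt = 0.5001; // two nearby initial conditions
    double dx0 = xt-x;
    for(int n = 1; n<=50; n++) {
      x = 4*r*x*(1-x);
      xt = 4*r*xt*(1-xt);
      // ln|dx_n/dx_0| should grow roughly linearly with slope lambda
      System.out.println(n+"\t"+Math.log(Math.abs((xt-x)/dx0)));
    }
  }
}

A least squares fit of the early, roughly linear part of this output to (6.14) gives an estimate of λ; for large n the separation saturates because the trajectory is confined to the unit interval.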
Fortunately, there is a better way of determining λ. We take the natural logarithm of both sides of (6.14) and write λ as

λ = (1/n) ln|∆x_n/∆x_0|.    (6.15)

Because we want to use the data from the entire trajectory after the transient behavior has ended, we use the fact that

∆x_n/∆x_0 = (∆x_1/∆x_0)(∆x_2/∆x_1) ··· (∆x_n/∆x_{n−1}).    (6.16)

Hence, we can express λ as

λ = (1/n) Σ_{i=0}^{n−1} ln|∆x_{i+1}/∆x_i|.    (6.17)

The form (6.17) implies that we can interpret x_i for any i as the initial condition. We see from (6.17) that the problem of computing λ has been reduced to finding the ratio ∆x_{i+1}/∆x_i. Because we want to make the initial difference between the two trajectories as small as possible, we are interested in the limit ∆x_i → 0. The idea of the more sophisticated procedure is to compute dx_{i+1}/dx_i from the equation of motion at the same time that the equation of motion is being iterated. We use the logistic map as an example. From (6.5) we have

dx_{i+1}/dx_i = f′(x_i) = 4r(1 − 2x_i).    (6.18)

We can consider x_i for any i as the initial condition and the ratio dx_{i+1}/dx_i as a measure of the rate of change of x_i. Hence, we can iterate the logistic map as before and use the values of x_i and the relation (6.18) to compute f′(x_i) = dx_{i+1}/dx_i at each iteration. The Lyapunov exponent is given by

λ = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} ln|f′(x_i)|,    (6.19)

where we begin the sum in (6.19) after the transient behavior is finished. We have explicitly included the limit n → ∞ in (6.19) to remind ourselves to choose n sufficiently large. Note that this procedure weights the points on the attractor correctly; that is, if a particular region of the attractor is not visited often by the trajectory, it does not contribute much to the sum in (6.19).

Figure 6.9: The Lyapunov exponent calculated using the method in (6.19) as a function of the control parameter r. Compare the behavior of λ to the bifurcation diagram in Figure 6.2. Note that λ < 0 for r < 3/4 and approaches zero at a period doubling bifurcation. A negative spike corresponds to a superstable trajectory. The onset of chaos is visible near r = 0.892, where λ first becomes positive. For r > 0.892, λ generally increases except for dips below zero whenever a periodic window occurs, for example, the dip due to the period 3 window near r = 0.96. For each value of r, the first 1000 iterations were discarded, and 10^5 values of ln|f′(x_n)| were used to determine λ.
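A minimal sketch of this procedure for the logistic map (plain Java; the range of r, the transient of 1000 iterations, and the 10^5 retained terms follow the parameters quoted in the caption of Figure 6.9):

public class LyapunovSketch {
  public static void main(String[] args) {
    for(int j = 0; j<25; j++) {
      double r = 0.76+0.01*j; // r from 0.76 to 1.0
      double x = 0.6; // arbitrary seed
      for(int i = 0; i<1000; i++) {
        x = 4*r*x*(1-x); // discard the transient
      }
      double sum = 0;
      int n = 100000;
      for(int i = 0; i<n; i++) {
        sum += Math.log(Math.abs(4*r*(1-2*x))); // ln|f'(x_i)|, eq. (6.18)
        x = 4*r*x*(1-x);
      }
      System.out.println(r+"\t"+sum/n); // estimate of lambda, eq. (6.19)
    }
  }
}

Plotting this output reproduces the qualitative features of Figure 6.9: negative λ in the periodic regime, zeros at the bifurcations, and positive λ beyond the onset of chaos.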
Problem 6.9. Lyapunov exponent for the logistic map

(a) Modify IterateMapApp to compute the Lyapunov exponent λ for the logistic map using the naive approach. Choose r = 0.91, x_0 = 0.5, and ∆x_0 = 10^{−6}, and plot ln|∆x_n/∆x_0| versus n. What happens to ln|∆x_n/∆x_0| for large n? Determine λ for r = 0.91, r = 0.97, and r = 1.0. Does your result for λ for each value of r depend significantly on your choice of x_0 or ∆x_0?

(b) Modify BifurcateApp to compute λ using the algorithm discussed in the text for r = 0.76 to r = 1.0 in steps of ∆r = 0.01. What is the sign of λ if the system is not chaotic? Plot λ versus r and explain your results in terms of the behavior of the bifurcation diagram shown in Figure 6.2. Compare your results for λ with those shown in Figure 6.9. How does the sign of λ correlate with the behavior of the system as seen in the bifurcation diagram? For what value of r is λ a maximum?

(c) In Problem 6.3b we saw that roundoff errors in the chaotic regime make the computation of individual trajectories meaningless. That is, if the system’s behavior is chaotic, then small roundoff errors are amplified exponentially in time, and the actual numbers we compute for the trajectory starting from a given initial value are not “real.” Repeat your calculation of λ for r = 1 by changing the roundoff error as you did in Problem 6.3b. Does your computed value of λ change? How meaningful is your computation of the Lyapunov exponent? We will encounter a similar question in Chapter 8, where we compute the trajectories of chaotic systems of many particles. We will find that although the “true” trajectories cannot be computed for long times, averages over the trajectories yield meaningful results.

We have found that nearby trajectories diverge if λ > 0. For λ < 0, the two trajectories converge and the system is not chaotic. What happens for λ = 0? In this case we will see that the trajectories diverge algebraically; that is, as a power of n. In some cases a dynamical system is at the “edge of chaos” where the Lyapunov exponent vanishes. Such systems are said to exhibit weak chaos to distinguish their behavior from the strongly chaotic behavior (λ > 0) that we have been discussing. If we define z ≡ |∆x_n|/|∆x_0|, then z will satisfy the differential equation

dz/dn = λz.    (6.20)

For weak chaos we do not find an exponential divergence, but instead a divergence that is algebraic and is given by

dz/dn = λ_q z^q,    (6.21)

where q is a parameter that needs to be determined. The solution to (6.21) is

z = [1 + (1 − q)λ_q n]^{1/(1−q)},    (6.22)

which can be checked by substituting (6.22) into (6.21). In the limit q → 1, we recover the usual exponential dependence.

We can determine the type of chaos using the crude approach of choosing a large number of initial values of x_0 and x_0 + ∆x_0 and plotting the average of ln z versus n. If we do not obtain a straight line, then the system does not exhibit strong chaos. How can we check for the behavior shown in (6.22)? The easiest way is to plot the quantity

(z^{1−q} − 1)/(1 − q)    (6.23)

versus n, which will equal nλ_q if (6.22) is applicable. We explore these ideas in the following problem.

∗Problem 6.10. Measuring weak chaos

(a) Write a program that plots ln z if q = 1, or the quantity (6.23) if q ≠ 1, as a function of n. Your program should have q, |∆x_0|, the number of seeds, and the number of iterations as input parameters. To compare with work by Añaños and Tsallis, use a variation of the logistic map given by

x_{n+1} = 1 − a x_n²,    (6.24)

where |x_n| ≤ 1 and 0 ≤ a ≤ 2. The seeds x_0 should be equally spaced in the interval |x_0| < 1.

(b) Consider strong chaos at a = 2. Choose q = 1, 50 iterations, at least 1000 values of x_0, and |∆x_0| = 10^{−6} (a minimal console sketch follows this problem). Do you obtain a straight line for ln z versus n? Does z_n eventually stop increasing as a function of n? If so, why? Try |∆x_0| = 10^{−12}. How do your results differ and how are they the same? Also iterate ∆x directly:

∆x_{n+1} = x_{n+1} − x̃_{n+1} = −a(x_n² − x̃_n²) = −a(x_n − x̃_n)(x_n + x̃_n) = −a ∆x_n (x_n + x̃_n),    (6.25)

where x_n is the iterate starting at x_0, and x̃_n is the iterate starting at x_0 + ∆x_0. Show that straight lines are not obtained for your plot if q ≠ 1.

(c) The edge of chaos for this map is at a = 1.401155189. Repeat part (a) for this value of a and various values of q. Simulations with 10^5 values of x_0 show that linear behavior is obtained for q ≈ 0.36.
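A minimal sketch of the strong chaos test of part (b) (plain Java; console output in place of the plot, with the seeds equally spaced in |x_0| < 1 as specified in part (a)):

public class WeakChaosSketch {
  public static void main(String[] args) {
    double a = 2.0, dx0 = 1.0e-6; // strong chaos case of Problem 6.10(b)
    int nIter = 50, nSeeds = 1000;
    double[] avgLnZ = new double[nIter+1];
    for(int s = 0; s<nSeeds; s++) {
      double x = -1+2.0*(s+0.5)/nSeeds; // seeds equally spaced in |x_0| < 1
      double xt = x+dx0;                // the perturbed twin trajectory
      for(int n = 1; n<=nIter; n++) {
        x = 1-a*x*x;   // eq. (6.24)
        xt = 1-a*xt*xt;
        avgLnZ[n] += Math.log(Math.abs(xt-x)/dx0); // ln z for this seed
      }
    }
    for(int n = 1; n<=nIter; n++) {
      System.out.println(n+"\t"+avgLnZ[n]/nSeeds); // average of ln z
    }
  }
}

For q ≠ 1 the same loop applies with ln z replaced by the quantity (6.23).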
A system of fixed energy (and number of particles and volume) has an equal probability of being in any microstate specified by the positions and velocities of the particles (see Sec. 15.2). One way of measuring the ability of a system to be in any state is to measure its entropy, defined by

S = −Σ_i p_i ln p_i,    (6.26)

where the sum is over all states, and p_i is the probability or relative frequency of being in the ith state. For example, if the system is always in only one state, then S = 0, the smallest possible entropy. If the system explores all states equally, then S = ln Ω, where Ω is the number of possible states. (You can show this result by letting p_i = 1/Ω.)

∗Problem 6.11. Entropy of the logistic map

(a) Write a program to compute S for the logistic map (see the sketch following Problem 6.12). Divide the interval [0,1] into bins or subintervals of width ∆x = 0.01 and determine the relative number of times the trajectory falls into each bin. At each value of r in the range 0.7 ≤ r ≤ 1, the map should be iterated for a fixed number of steps, for example, n = 1000. What happens to the entropy when the trajectory is chaotic?

(b) Repeat part (a) with n = 10,000. For what values of r does the entropy change significantly? Decrease ∆x to 0.001 and repeat. Does this decrease make a difference?

(c) Plot p_i as a function of x for r = 1. For what value(s) of x is the plot a maximum?

We can also measure the (generalized) entropy as a function of time. As we will see in Problem 6.12, S(n) for strong chaos increases linearly with n until all the possible states are visited. However, for weak chaos this behavior is not found. In the latter case we can generalize the entropy to a q-dependent function defined by

S_q = (1 − Σ_i p_i^q)/(q − 1).    (6.27)

In the limit q → 1, S_q → S. The following problem discusses measuring the entropy for the same system as in Problem 6.10.

∗Problem 6.12. Entropy of weak and strong chaotic systems

(a) Write a program that iterates the map (6.24) and plots S if q = 1, or S_q if q ≠ 1, as a function of n. The input parameters should be q, the number of bins, the number of random seeds in a single bin, and n, the number of iterations. At each iteration compute the entropy. Then average S over the randomly chosen values of the seeds.

(b) Consider strong chaos at a = 2. Choose q = 1, n = 20, ∆x ≤ 0.001, and ten randomly chosen seeds per bin. Do you obtain a straight line for S versus n? Does the curve eventually stop growing? If you decrease ∆x, how do your results differ and how are they the same? Show that S is not a linear function of n if q ≠ 1.

(c) Repeat part (a) with a = 1.401155189 and various values of q. Simulations with 10^5 bins show that linear behavior is obtained for q ≈ 0.36, the same value as for the measurements in Problem 6.10.
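A minimal sketch of the binning computation in Problem 6.11(a) for a single value of r (plain Java; the seed, the transient of 100 iterations, and r = 1.0 are arbitrary illustrative choices):

public class EntropySketch {
  public static void main(String[] args) {
    double r = 1.0;  // a chaotic value of the control parameter
    int nBins = 100; // bins of width 0.01 on [0,1]
    int n = 1000;    // number of retained iterations
    int[] counts = new int[nBins];
    double x = 0.4;
    for(int i = 0; i<100; i++) {
      x = 4*r*x*(1-x); // discard the transient
    }
    for(int i = 0; i<n; i++) {
      x = 4*r*x*(1-x);
      int bin = Math.min((int) (x*nBins), nBins-1);
      counts[bin]++;
    }
    double S = 0;
    for(int c : counts) {
      if(c>0) {
        double p = (double) c/n; // relative frequency of the bin
        S -= p*Math.log(p);      // eq. (6.26)
      }
    }
    System.out.println("S = "+S+", ln Omega = "+Math.log(nBins));
  }
}

Repeating the computation inside a loop over r yields the entropy curve asked for in parts (a) and (b).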
6.6 *Controlling Chaos

The dream of classical physics was that if the initial conditions and all the forces acting on a system were known, then we could predict the future with as much precision as we desire. The existence of chaos has shattered that dream. However, if a system is chaotic, we can still control its behavior with small, but carefully chosen, perturbations of the system. We will illustrate the method for the logistic map. The application of the method to other one-dimensional systems is straightforward, but the extension to higher-dimensional systems is more complicated (cf. Ott, Lai).

Suppose that we want the trajectory to be periodic even though the parameter r is in the chaotic regime. How can we make the trajectory have periodic behavior without drastically changing r or imposing an external perturbation that is so large that the internal dynamics of the map become irrelevant? The key idea is that for any value of r in the chaotic regime, there is an infinite number of trajectories that have unstable periods. This property of the chaotic regime means that if we choose the value of the seed x_0 to be precisely equal to a point on an unstable trajectory with period p, the subsequent trajectory will have this period. However, if we choose a value of x_0 that differs ever so slightly from this special value, the trajectory will not be periodic. Our goal is to make slight perturbations to the system to keep it on the desired unstable periodic trajectory.

The first step is to find the values of x(i), i = 1 to p, that constitute the unstable periodic trajectory. It is an interesting numerical problem to find the values of x(i), and we consider this problem first. To find a fixed point of the map f^(p), we need to find the value of x* such that

g^(p)(x*) ≡ f^(p)(x*) − x* = 0.    (6.28)

The algorithms for finding the solution to (6.28) are called root-finding algorithms. You might have heard of Newton’s method, which we describe in Appendix 6B. Here we use the simplest root-finding algorithm, the bisection method. The algorithm works as follows:

(i) Choose two values x_left and x_right, with x_left < x_right, such that the product g^(p)(x_left) g^(p)(x_right) < 0. Because this product is negative, there must be a value of x such that g^(p)(x) = 0 in the interval [x_left, x_right].

(ii) Choose the midpoint, x_mid = x_left + (1/2)(x_right − x_left) = (1/2)(x_left + x_right), as the guess for x*.

(iii) If g^(p)(x_mid) has the same sign as g^(p)(x_left), then replace x_left by x_mid; otherwise, replace x_right by x_mid. The interval for the location of the root is now reduced.

(iv) Repeat steps (ii) and (iii) until the desired level of precision is achieved.

The following program implements this algorithm for the logistic map. An alternative implementation named FixedPointApp that does not use recursion is not listed, but is available in the ch06 package. One possible problem is that some of the roots of g^(p)(x) = 0 are also roots of g^(p′)(x) = 0 for p′ equal to a factor of p. (For example, if p = 6, then 2 and 3 are factors.) As p increases, it might become more difficult to find a root that is part of a period p trajectory and not part of a shorter period p′ trajectory.

Listing 6.4: The RecursiveFixedPointApp program finds stable and unstable periodic trajectories with the given period using the bisection root-finding algorithm.

package org.opensourcephysics.sip.ch06;
import org.opensourcephysics.controls.*;

public class RecursiveFixedPointApp extends AbstractCalculation {
  double r; // control parameter
  int period;
  double xleft, xright;
  double gleft, gright;

  public void reset() {
    control.setValue("r", 0.8);             // control parameter r
    control.setValue("period", 2);          // period
    control.setValue("epsilon", 0.0000001); // desired precision
    control.setValue("xleft", 0.01);        // guess for xleft
    control.setValue("xright", 0.99);       // guess for xright
  }

  public void calculate() {
    double epsilon = control.getDouble("epsilon"); // desired precision
    r = control.getDouble("r");
    period = control.getInt("period");
    xleft = control.getDouble("xleft");
    xright = control.getDouble("xright");
    gleft = map(xleft, r, period)-xleft;
    gright = map(xright, r, period)-xright;
    if(gleft*gright<0) {
      while(Math.abs(xleft-xright)>epsilon) {
        bisection();
      }
      double x = 0.5*(xleft+xright);
      control.println("explicit search for period "+period+" behavior");
      control.println(0+"\t"+x); // result
      for(int i = 1; i<=2*period+1; i++) {
        x = map(x, r, 1);
        control.println(i+"\t"+x);
      }
    } else {
      control.println("range does not enclose a root");
    }
  }

  public void bisection() {
    // midpoint between xleft and xright
    double xmid = 0.5*(xleft+xright);
    double gmid = map(xmid, r, period)-xmid;
    if(gmid*gleft>0) {
      xleft = xmid; // change xleft
      gleft = gmid;
    } else {
      xright = xmid; // change xright
      gright = gmid;
    }
  }

  double map(double x, double r, double period) {
    if(period>1) {
      double y = map(x, r, period-1);
      return 4*r*y*(1-y);
    } else {
      return 4*r*x*(1-x);
    }
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new RecursiveFixedPointApp());
  }
}

Problem 6.13. Unstable periodic trajectories for the logistic map

(a) Test RecursiveFixedPointApp for values of r for which the logistic map has a stable period with p = 1 and p = 2. Set the desired precision equal to 10^{−7}. Initially use x_left = 0.01 and x_right = 0.99. Calculate the stable attractor analytically and compare the results of your program with the analytic results.

(b) Set r = 0.95 and find the periodic trajectories for p = 1, 2, 5, 6, 7, 12, 13, and 19.

(c) Modify RecursiveFixedPointApp so that n_b, the number of bisections needed to obtain the unstable trajectory, is listed. Choose three of the cases considered in part (b) and compute n_b for the precision ε = 0.01, 0.001, 0.0001, and 0.00001. Determine the functional dependence of n_b on ε.

Now that we know how to find the values of the unstable periodic trajectories, we discuss an algorithm for stabilizing this period. Suppose that we wish to stabilize the unstable trajectory of period p for a choice of r = r_0. The idea is to make small adjustments of r = r_0 + ∆r at each iteration so that the difference between the actual trajectory and the target periodic trajectory is small. If the actual trajectory is x_n and we wish the trajectory to be at x(i), we make the next iterate x_{n+1} equal to x(i + 1) by expanding the difference x_{n+1} − x(i + 1) in a Taylor series and setting the difference to zero to first order. We have x_{n+1} − x(i + 1) = f(x_n, r) − f(x(i), r_0). If we expand f(x_n, r) about (x(i), r_0), we have to first order

x_{n+1} − x(i + 1) = [∂f(x,r)/∂x][x_n − x(i)] + [∂f(x,r)/∂r]∆r = 0.    (6.29)

The partial derivatives in (6.29) are evaluated at x = x(i) and r = r_0. The result is

4r_0[1 − 2x(i)][x_n − x(i)] + 4x(i)[1 − x(i)]∆r = 0,    (6.30)

and the solution of (6.30) for ∆r can be written as

∆r = −r_0 [1 − 2x(i)][x_n − x(i)] / (x(i)[1 − x(i)]).    (6.31)

The procedure is to iterate the logistic map at r = r_0 until x_n is sufficiently close to an x(i). The nature of chaotic systems is that the trajectory is guaranteed to eventually come close to the desired unstable trajectory. Then we use (6.31) to change the value of r so that the next iteration is closer to x(i + 1). We summarize the algorithm for controlling chaos as follows (a minimal sketch for the simplest case appears after the list):

1. Find the unstable periodic trajectory x(1), x(2), ..., x(p) for the desired value of r_0.

2. Iterate the map with r = r_0 until x_n is within ε of x(i). Then use (6.31) to determine r.

3. Turn off the control by setting r = r_0.
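The following minimal sketch implements this loop for the simplest case, p = 1 (plain Java). For the logistic map the period 1 fixed point x* = 1 − 1/(4r_0) can be written down directly, so no bisection search is needed; the values r_0 = 0.95 and ε = 0.02 are taken from Problem 6.14, and the seed is arbitrary.

public class ControlChaosSketch {
  public static void main(String[] args) {
    double r0 = 0.95;            // control parameter in the chaotic regime
    double xstar = 1-1.0/(4*r0); // unstable period 1 fixed point of 4rx(1-x)
    double eps = 0.02;
    double x = 0.3;              // arbitrary seed
    boolean controlOn = true;
    for(int n = 0; n<200; n++) {
      double r = r0; // default: no perturbation
      if(controlOn && Math.abs(x-xstar)<eps) {
        // eq. (6.31) with x(i) = x*
        r = r0 - r0*(1-2*xstar)*(x-xstar)/(xstar*(1-xstar));
      }
      x = 4*r*x*(1-x);
      System.out.println(n+"\t"+x);
    }
  }
}

Once the trajectory wanders within ε of x*, the small adjustments of r hold it there; setting controlOn to false lets the trajectory escape and become chaotic again.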
Problem 6.14. Controlling chaos

(a) Write a program that allows the user to turn the control on and off. The trajectory can be seen by plotting x_n versus n. The program should incorporate as input the desired unstable periodic trajectory x(i), the period p, the value of r_0, and the parameter ε.

(b) Test your program with r_0 = 0.95 and the periods p = 1, 5, and 13. Use ε = 0.02.

(c) Modify your program so that the values of r as well as the values of x_n are shown. How does r change if we vary ε? Try ε = 0.05, 0.01, and 0.005.

(d) Add a method to compute n_ε, the number of iterations necessary for the trajectory x_n to be within ε of x(1) when the control is on. Find ⟨n_ε⟩, the average value of n_ε, by starting with 100 random values of x_0. Compute ⟨n_ε⟩ as a function of ε for ε = 0.05, 0.005, 0.0005, and 0.00005. What is the functional dependence of ⟨n_ε⟩ on ε?

6.7 Higher-Dimensional Models

So far we have discussed the logistic map as a mathematical model that has some remarkable properties and produces some interesting computer graphics. In this section we discuss some two- and three-dimensional systems that also might seem to have little to do with realistic physical systems. However, as we will see in Sections 6.8 and 6.9, similar behavior is found in realistic physical systems under the appropriate conditions. We begin with a two-dimensional map and consider the sequence of points (x_n, y_n) generated by

x_{n+1} = y_n + 1 − a x_n²    (6.32a)
y_{n+1} = b x_n.    (6.32b)

The map (6.32) was proposed by Hénon, who was motivated by the relevance of this dynamical system to the behavior of asteroids and satellites.

Problem 6.15. The Hénon map

(a) Write a program to iterate (6.32) for a = 1.4 and b = 0.3 and plot 10^4 iterations starting from x_0 = 0, y_0 = 0. Make sure you compute the new value of y using the old value of x and not the new value of x (see the sketch after this problem). Do not plot the initial transient. Look at the trajectory in the region defined by |x| ≤ 1.5 and |y| ≤ 0.45. Make a similar plot beginning from the second initial condition, x_0 = 0.63135448, y_0 = 0.18940634. Compare the shape of the two plots. Is the shape of the two curves independent of the initial conditions?

(b) Increase the scale of your plot so that all points in the region 0.50 ≤ x ≤ 0.75 and 0.15 ≤ y ≤ 0.21 are shown. Begin from the second initial condition and increase the number of computed points to 10^5. Then make another plot showing all points in the region 0.62 ≤ x ≤ 0.64 and 0.185 ≤ y ≤ 0.191. If time permits, make an additional enlargement and plot all points within the box defined by 0.6305 ≤ x ≤ 0.6325 and 0.1889 ≤ y ≤ 0.1895. You will have to increase the number of computed points to order 10^6. What is the structure of the curves within each box? Does the attractor appear to have a similar structure on smaller and smaller length scales? The region of points from which the points cannot escape is the basin of the Hénon attractor. The attractor is the set of points to which all points in the basin are attracted. That is, two trajectories that begin from different conditions will eventually lie on the attractor.

(c) Determine if the system is chaotic; that is, sensitive to initial conditions. Start two points very close to each other and watch their trajectories for a fixed time. Choose different colors for the two trajectories.

(d)∗ It is straightforward in principle to extend the method for computing the Lyapunov exponent that we used for a one-dimensional map to higher-dimensional maps. The idea is to linearize the difference (or differential) equations and replace dx_n by the corresponding vector quantity dr_n. This generalization yields the Lyapunov exponent corresponding to the divergence along the fastest growing direction. If a system has f degrees of freedom, it has a set of f Lyapunov exponents. A method for computing all f exponents is discussed in Project 6.24.
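A minimal sketch of the iteration loop for Problem 6.15(a) (plain Java; console output stands in for the plot). Note the temporary variables that prevent the new value of x from contaminating the update of y:

public class HenonSketch {
  public static void main(String[] args) {
    double a = 1.4, b = 0.3;
    double x = 0, y = 0;
    for(int n = 0; n<10000; n++) {
      double xNew = y+1-a*x*x; // eq. (6.32a)
      double yNew = b*x;       // eq. (6.32b), uses the old value of x
      x = xNew;
      y = yNew;
      if(n>100) { // skip the initial transient
        System.out.println(x+"\t"+y);
      }
    }
  }
}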
One of the earliest indications of chaotic behavior was in an atmospheric model developed by Lorenz. His goal was to describe the motion of a fluid layer that is heated from below. The result is convective rolls, where the warm fluid at the bottom rises, cools off at the top, and then falls back down. Lorenz simplified the description by restricting the motion to two spatial dimensions. This situation has been realized experimentally and is known as a Rayleigh–Bénard cell. The equations that Lorenz obtained are

dx/dt = −σx + σy    (6.33a)
dy/dt = −xz + rx − y    (6.33b)
dz/dt = xy − bz,    (6.33c)

where x is a measure of the fluid flow velocity circulating around the cell, y is a measure of the temperature difference between the rising and falling fluid regions, and z is a measure of the difference in the temperature profile between the bottom and the top from the normal equilibrium temperature profile. The dimensionless parameters σ, r, and b are determined by various fluid properties, the size of the Rayleigh–Bénard cell, and the temperature difference in the cell. Note that the variables x, y, and z have nothing to do with the spatial coordinates, but are measures of the state of the system. Although it is not expected that you will understand the relation of the Lorenz equations to convection, we have included these equations here to reinforce the idea that simple sets of equations can exhibit chaotic behavior.
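Before turning to the Open Source Physics implementation, here is a self-contained sketch that integrates (6.33) with the classical fourth-order Runge–Kutta algorithm (plain Java; the parameters, initial condition, and ∆t = 0.0025 follow the caption of Figure 6.10 below):

public class LorenzSketch {
  static final double SIGMA = 10, B = 8.0/3.0, R = 28;

  // right-hand side of the Lorenz equations (6.33)
  static double[] rate(double[] s) {
    return new double[] {
      SIGMA*(s[1]-s[0]),      // dx/dt
      -s[0]*s[2]+R*s[0]-s[1], // dy/dt
      s[0]*s[1]-B*s[2]        // dz/dt
    };
  }

  // returns the state advanced by h along the slope k
  static double[] step(double[] s, double[] k, double h) {
    double[] out = new double[3];
    for(int i = 0; i<3; i++) {
      out[i] = s[i]+h*k[i];
    }
    return out;
  }

  public static void main(String[] args) {
    double[] s = {1, 1, 20}; // initial condition x0 = 1, y0 = 1, z0 = 20
    double dt = 0.0025;
    for(int n = 0; n<8000; n++) { // total time t = 20
      double[] k1 = rate(s);
      double[] k2 = rate(step(s, k1, dt/2));
      double[] k3 = rate(step(s, k2, dt/2));
      double[] k4 = rate(step(s, k3, dt));
      for(int i = 0; i<3; i++) {
        s[i] += dt*(k1[i]+2*k2[i]+2*k3[i]+k4[i])/6;
      }
      if(n%4 == 0) { // print at intervals of 0.01
        System.out.println(s[0]+"\t"+s[1]+"\t"+s[2]);
      }
    }
  }
}

Plotting any pair of the output columns against each other reproduces the projections shown in Figure 6.10.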
LorenzApp displays the solution to (6.33) using the Open Source Physics 3D drawing framework and is available in the ch06 package. To make three-dimensional plots, we use the Display3DFrame class; the only argument of its constructor is the title for the plot. The following code fragment sets up the plot.

Display3DFrame frame = new Display3DFrame("Lorenz attractor");
Lorenz lorenz = new Lorenz();
frame.setPreferredMinMax(-15.0, 15.0, -15.0, 15.0, 0.0, 50.0);
frame.setDecorationType(VisualizationHints.DECORATION_AXES);
frame.addElement(lorenz); // lorenz is a 3D element

Figure 6.10: A trajectory of the Lorenz model with σ = 10, b = 8/3, and r = 28 and the initial condition x_0 = 1, y_0 = 1, z_0 = 20. A time interval of t = 20 is shown with points plotted at intervals of 0.01. The fourth-order Runge–Kutta algorithm was used with ∆t = 0.0025. (The six panels show x, y, and z versus t and the projections onto the xy, xz, and yz planes.)

Housekeeping methods such as reset and initialize are similar to methods in other simulations and are not shown. The class Lorenz draws the attractor in the three-dimensional (x,y,z) space defined by (6.33). The state of the system is shown as a red ball in this 3D space, and the state’s trajectory is shown as a trail. An easy way to show the time evolution is to extend the 3D Group class and create the ball and the trail inside the group. When points are added to the group, the trail is extended and the position of the ball is set. The Lorenz class imports org.opensourcephysics.display3d.simple3d.*. The ball and trail are then instantiated and added to the group (in the constructor) as follows:

public class Lorenz extends Group implements ODE {
  ElementEllipsoid ball = new ElementEllipsoid();
  ElementTrail trail = new ElementTrail();

  public Lorenz() {
    addElement(trail); // adds trail to Lorenz group
    addElement(ball);  // adds ball to Lorenz group
    ...
  }
  ...
}

The properties of the ball and trail objects are set by

ball.setSizeXYZ(1, 1, 1); // sets size of ball in world coordinates
ball.getStyle().setFillColor(java.awt.Color.RED);

To plot each part of the trajectory through state space, we use the method trail.addPoint(x,y,z) to add to the trail and ball.setXYZ(x,y,z) to show the current state. The user can project onto two dimensions using the frame’s menu or rotate the three-dimensional plot using the mouse because these capabilities are built into the frame. The getRate and getState methods model (6.33) by implementing the ODE interface.

Problem 6.16. The Lorenz model

(a) Use a Runge–Kutta algorithm such as RK4 or RK45 (see Appendix 3A) to obtain a numerical solution of the Lorenz equations (6.33). Generate three-dimensional plots using Display3DFrame. Explore the basin of the attractor with σ = 10, b = 8/3, and r = 28.

(b) Determine qualitatively the sensitivity to initial conditions. Start two points very close to each other and watch their trajectories for approximately 10^4 time steps.

(c) Let z_m denote the value of z where z is a relative maximum for the mth time. You can determine the value of z_m by finding the average of the two values of z when the right-hand side of (6.33c) changes sign. Plot z_{m+1} versus z_m and describe what you find. This procedure is one way that a continuous system can be mapped onto a discrete map. What is the slope of the z_{m+1} versus z_m curve? Is its magnitude always greater than unity? If so, then this behavior is an indication of chaos. Why?

The application of the Lorenz equations to weather prediction has led to a popular metaphor known as the butterfly effect. This metaphor is made even more meaningful by inspection of Figure 6.10. The “butterfly effect” is often ascribed to Lorenz (see Hilborn). In a 1963 paper he remarked: “One meteorologist remarked that if the theory were correct, one flap of a seagull’s wings would be enough to alter the course of the weather forever.” By 1972, the seagull had evolved into the more poetic butterfly, and the title of his talk was “Predictability: Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?”
6.8 Forced Damped Pendulum

We now consider the dynamics of nonlinear systems described by classical mechanics. The general problem in classical mechanics is the determination of the positions and velocities of a system of particles subjected to certain forces. For example, we considered in Chapter 5 the celestial two-body problem and were able to predict the motion at any time. We will find that we cannot make long-time predictions for the trajectories of nonlinear classical systems when these systems exhibit chaos.

A familiar example of a nonlinear mechanical system is the simple pendulum (see Chapter 3). To make its dynamics more interesting, we assume that there is a linear damping term present and that the pivot is forced to move vertically up and down. Newton’s second law for this system is (cf. McLaughlin or Percival and Richards)

d²θ/dt² = −γ dθ/dt − [ω_0² + 2A cos ωt] sin θ,    (6.34)

where θ is the angle the pendulum makes with the vertical axis, γ is the damping coefficient, ω_0² = g/L is the square of the natural frequency of the pendulum, and ω and A are the frequency and amplitude of the external force. The effect of the vertical acceleration of the pivot is equivalent to a time-dependent gravitational field, because we can write the total vertical force due to gravity, −mg, plus the pivot motion f(t) as −mg(t), where g(t) ≡ g − f(t)/m.

How do we expect the driven, damped simple pendulum to behave? Because there is damping present, we expect that if there is no external force, the pendulum would come to rest. That is, (θ = 0, dθ/dt = 0) is a stable attractor. As A is increased from zero, this attractor remains stable for sufficiently small A. At a value of A equal to A_c, this attractor becomes unstable. How does the driven nonlinear oscillator behave as we increase A?

It is difficult to determine whether the pendulum has some kind of underlying periodic behavior by plotting only its position or even plotting its trajectory in phase space. We expect that if it does, the period will be related to the period of the external time-dependent force. Thus, we analyze the motion by plotting a point in phase space after every cycle of the external force. Such a phase space plot is called a Poincaré map. Hence, we will plot dθ/dt versus θ for values of t equal to nT for n = 1, 2, 3, ... . If the system has a period T, then the Poincaré map consists of a single point. If the period of the system is nT, there will be n points.

PoincareApp uses the fourth-order Runge–Kutta algorithm to compute θ(t) and the angular velocity dθ(t)/dt for the pendulum described by (6.34). This equation is modeled in the DampedDrivenPendulum class, but is not shown here because it is similar to other ODE implementations. A phase diagram for dθ(t)/dt versus θ(t) is shown in one frame. In the other frame, the Poincaré map is represented by drawing a small box at the point (θ, dθ/dt) at time t = nT. If the system has period 1, that is, if the same values of (θ, dθ/dt) are drawn at t = nT, we would see only one box; otherwise, we would see several boxes. Because the first few values of (θ, dθ/dt) show the transient behavior, it is desirable to clear the display and draw a new Poincaré map without changing A, θ, or dθ/dt.

Listing 6.5: PoincareApp plots a phase diagram and a Poincaré map for the damped driven pendulum.

package org.opensourcephysics.sip.ch06;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.PlotFrame;
import org.opensourcephysics.numerics.RK4;

public class PoincareApp extends AbstractSimulation {
  final static double PI = Math.PI; // defined for brevity
  PlotFrame phaseSpace = new PlotFrame("theta", "angular velocity", "Phase space plot");
  PlotFrame poincare = new PlotFrame("theta", "angular velocity", "Poincare plot");
  int nstep = 100; // # iterations between Poincare plot
  DampedDrivenPendulum pendulum = new DampedDrivenPendulum();
  RK4 odeMethod = new RK4(pendulum);

  public PoincareApp() {
    // angular frequency of external force equals two and hence
    // period of external force equals pi
    odeMethod.setStepSize(PI/nstep); // dt = PI/nstep
    phaseSpace.setMarkerShape(0, 6); // second argument indicates a pixel
    // smaller size gives better resolution
    poincare.setMarkerSize(0, 2);
    poincare.setMarkerColor(0, java.awt.Color.RED);
    phaseSpace.setMessage("t = "+0);
  }

  public void reset() {
    control.setValue("theta", 0.2);
    control.setValue("angular velocity", 0.6);
    control.setValue("gamma", 0.2); // damping constant
    control.setValue("A", 0.85);    // amplitude
  }

  public void doStep() {
    double state[] = pendulum.getState();
    for(int istep = 0; istep<nstep; istep++) {
      odeMethod.step();
      if(state[0]>PI) {
        state[0] = state[0]-2.0*PI;
      } else if(state[0]<-PI) {
        state[0] = state[0]+2.0*PI;
      }
      phaseSpace.append(0, state[0], state[1]);
    }
    poincare.append(0, state[0], state[1]);
    phaseSpace.setMessage("t = "+decimalFormat.format(state[2]));
    poincare.setMessage("t = "+decimalFormat.format(state[2]));
    if(phaseSpace.isShowing()) {
      phaseSpace.render();
    }
    if(poincare.isShowing()) {
      poincare.render();
    }
  }

  public void initialize() {
    double theta = control.getDouble("theta"); // initial angle
    // initial angular velocity
    double omega = control.getDouble("angular velocity");
    pendulum.gamma = control.getDouble("gamma"); // damping constant
    // amplitude of external force
    pendulum.A = control.getDouble("A");
    pendulum.initializeState(new double[] {theta, omega, 0});
    clear();
  }

  public void clear() {
    phaseSpace.clearData();
    poincare.clearData();
    phaseSpace.render();
    poincare.render();
  }

  public static void main(String[] args) {
    SimulationControl control = SimulationControl.createApp(new PoincareApp());
    control.addButton("clear", "Clear");
  }
}

Problem 6.17. Dynamics of a driven, damped simple pendulum

(a) Use PoincareApp to simulate the driven, damped simple pendulum. In the program, ω = 2 so that the period T of the external force equals π. The program also assumes that ω_0 = 1. Use γ = 0.2 and A = 0.85 and compute the phase space trajectory. After the transient, how many points do you see in the Poincaré plot? What is the period of the pendulum? Vary the initial values of θ and dθ/dt. Is the attractor independent of the initial conditions? Remember to ignore the transient behavior.

(b) Modify PoincareApp so that it plots θ and dθ/dt as a function of t. Describe the qualitative relation between the Poincaré plot, the phase space plot, and the t dependence of θ and dθ/dt.

(c) The amplitude A plays the role of the control parameter for the dynamics of the system. Use the behavior of the Poincaré plot to find the value A = A_c at which the (0,0) attractor becomes unstable. Start with A = 0.1 and continue increasing A until the (0,0) attractor becomes unstable.

(d) Find the period for A = 0.1, 0.25, 0.5, 0.7, 0.75, 0.85, 0.95, 1.00, 1.02, 1.031, 1.033, 1.036, and 1.05. Note that for small A, the period of the oscillator is twice that of the external force. The steady state period is 2π for A_c < A < 0.71, π for 0.72 < A < 0.79, and then 2π again.
(e) The first period doubling occurs for A ≈ 0.79. Find the approximate values of A for further period doubling and use these values of A to compute the exponent δ defined by (6.10). Compare your result for δ with the result found for the one-dimensional logistic map. Are your results consistent with those that you found for the logistic map? An analysis of this system can be found in the article by McLaughlin.

(f) Sometimes a trajectory does not approach a steady state even after a very long time, but a slight perturbation causes the trajectory to move quickly onto a steady state attractor. Consider A = 0.62 and the initial condition (θ = 0.3, dθ/dt = 0.3). Describe the behavior of the trajectory in phase space. During the simulation, change θ by 0.1. Does the trajectory move onto a steady state trajectory? Do similar simulations for other values of A and other initial conditions.

(g) Repeat the calculations of parts (b)–(d) for γ = 0.05. What can you conclude about the effect of damping?

(h) Replace the fourth-order Runge–Kutta algorithm by the lower-order Euler–Richardson algorithm. Which algorithm gives the better trade-off between accuracy and speed?

Problem 6.18. The basin of an attractor

(a) For γ = 0.2 and A > 0.79, the pendulum rotates clockwise or counterclockwise in the steady state. Each of these two rotations is an attractor. The set of initial conditions that lead to a particular attractor is called the basin of the attractor. Modify PoincareApp so that the program draws the basin of the attractor with dθ/dt > 0. For example, your program might simulate the motion for about 20 periods and then determine the sign of dθ/dt. If dθ/dt > 0 in the steady state, then the program plots a point in phase space at the coordinates of the initial condition. The program repeats this process for many initial conditions. Describe the basin of attraction for A = 0.85 and increments of the initial values of θ and dθ/dt equal to π/10.

(b) Repeat part (a) using increments of the initial values of θ and dθ/dt equal to π/20 or as small as possible given your computer resources. Does the boundary of the basin of attraction appear smooth or rough? Is the basin of the attractor a single region or is it disconnected into more than one piece?

(c) Repeat parts (a) and (b) for other values of A, including values near the onset of chaos and in the chaotic regime. Is there a qualitative difference between the basins of periodic and chaotic attractors? For example, can you always distinguish the boundaries of the basin?

6.9 *Hamiltonian Chaos

Hamiltonian systems are a very important class of dynamical systems. The most familiar are mechanical systems without friction, and the most important of these is the solar system. The linear harmonic oscillator and the simple pendulum that we considered in Chapter 3 are two simple examples. Many other systems can be included in the Hamiltonian framework, for example, the motion of charged particles in electric and magnetic fields and ray optics. The Hamiltonian dynamics of charged particles is particularly relevant to confinement issues in particle accelerators, storage rings, and plasmas. In each case a function of all the coordinates and momenta called the Hamiltonian is formed. For many systems this function can be identified with the total energy. The Hamiltonian for a particle in a potential V(x,y,z) is

H = (1/2m)(p_x² + p_y² + p_z²) + V(x,y,z).    (6.35)
Typically we write (6.35) using the notation

H = Σ_i p_i²/2m + V({q_i}),    (6.36)

where p_1 ≡ p_x, q_1 ≡ x, etc. This notation emphasizes that the p_i and the q_i are generalized coordinates. For example, in some systems p can represent the angular momentum and q can represent an angle. For a system of N particles in three dimensions, the sum in (6.36) runs from 1 to 3N, where 3N is the number of degrees of freedom.

The methods for constructing the generalized momenta and the Hamiltonian are described in standard classical mechanics texts. The time dependence of the generalized momenta and coordinates is given by

ṗ_i ≡ dp_i/dt = −∂H/∂q_i    (6.37a)
q̇_i ≡ dq_i/dt = ∂H/∂p_i.    (6.37b)

Check that (6.37) leads to the usual form of Newton’s second law by considering the simple example of a single particle in a potential U(x), where q = x and p = mẋ.

As we found in Chapter 4, an important property of conservative systems is preservation of areas in phase space. Consider a set of initial conditions of a dynamical system that form a closed surface in phase space. For example, if phase space is two-dimensional, this surface would be a one-dimensional loop. As time evolves, this surface in phase space will typically change its shape. For Hamiltonian systems, the volume (area for a two-dimensional phase space) enclosed by this surface remains constant in time. For dissipative systems, this volume will decrease, and hence dissipative systems are not described by a Hamiltonian. One consequence of the constant phase space volume is that Hamiltonian systems do not have phase space attractors.

In general, the motion of Hamiltonian systems is very complex. In some systems the motion is regular, and there is a constant of the motion (a quantity that does not change with time) for each degree of freedom. Such a system is said to be integrable. For time-independent systems, an obvious constant of the motion is the total energy. The total momentum and angular momentum are other examples. There may be others as well. If there are more degrees of freedom than constants of the motion, then the system can be chaotic. When the number of degrees of freedom becomes large, the possibility of chaotic behavior becomes more likely. An important example that we will consider in Chapter 8 is a system of interacting particles. Their chaotic motion is essential for the system to be described by the methods of statistical mechanics.

For regular motion the change in shape of a closed surface in phase space is uninteresting. For chaotic motion, nearby trajectories must exponentially diverge from each other, but are confined to a finite region of phase space. Hence, there will be local stretching of the surface accompanied by repeated folding to ensure confinement. There is another class of systems whose behavior is in between; that is, the system behaves regularly for some initial conditions and chaotically for others. We will study these mixed systems in this section.

Consider the Hamiltonian for a system of N particles. If the system is integrable, there are 3N constants of the motion. It is natural to identify the generalized momenta with these constants. The coordinates that are associated with each of these constants will vary linearly with time. If the system is confined in phase space, then the coordinates must be periodic. If we have just one coordinate, we can think of the motion as a point moving on a circle in phase space.
In two dimensions the motion is a point moving in two circles at once; that is, a point moving on the surface of a torus. In three dimensions we can imagine a generalized torus with three circles, and so on. If the period of motion along each circle is a rational fraction of the period of all the other circles, then the torus is called a resonant torus, and the motion in phase space is periodic. If the periods are not rational fractions of each other, then the torus is called nonresonant.

If we take an integrable Hamiltonian and change it slightly, what happens to these tori? A partial answer is given by a theorem due to Kolmogorov, Arnold, and Moser (KAM), which states that, under certain circumstances, the tori will remain. When the perturbation of the Hamiltonian becomes sufficiently large, these KAM tori are destroyed.

Figure 6.11: Model of a kicked rotor consisting of a rigid rod with moment of inertia I, subjected to a periodic impulse. Gravity and friction at the pivot are ignored. The motion of the rotor is given by the standard map in (6.41).

To understand the basic ideas associated with mixed systems, we consider a simple model of a rotor known as the standard map (see Figure 6.11). The rod has a moment of inertia I and length L and is fastened at one end to a frictionless pivot. The other end is subjected to a vertical periodic impulsive force of strength k/L applied at times t = 0, τ, 2τ, ... . Gravity is ignored. The motion of the rotor can be described by the angle θ and the corresponding angular momentum p_θ. The Hamiltonian for this system can be written as

H(θ, p_θ, t) = p_θ²/2I + k cos θ Σ_n δ(t − nτ).    (6.38)

The term δ(t − nτ) is zero everywhere except at t = nτ; its integral over time is unity if t = nτ is within the limits of integration. If we use (6.37) and (6.38), it is easy to show that the corresponding equations of motion are given by

dp_θ/dt = k sin θ Σ_n δ(t − nτ)    (6.39a)
dθ/dt = p_θ/I.    (6.39b)

From (6.39) we see that p_θ is constant between kicks (remember that gravity is assumed to be absent), but changes discontinuously at each kick. The angle θ varies linearly with t between kicks and is continuous at each kick. It is convenient to know the values of θ and p_θ at times just after the kick. We let θ_n and p_n be the values of θ(t) and p_θ(t) at times t = nτ + 0⁺, where 0⁺ is an infinitesimally small positive number. If we integrate (6.39a) from t = (n + 1)τ − 0⁺ to t = (n + 1)τ + 0⁺, we obtain

p_{n+1} − p_n = k sin θ_{n+1}.    (6.40a)

(Remember that p is constant between kicks and the delta function contributes to the integral only when t = (n + 1)τ.) From (6.39b) we have

θ_{n+1} − θ_n = (τ/I) p_n.    (6.40b)

If we choose units such that τ/I = 1, we obtain the standard map

θ_{n+1} = (θ_n + p_n) modulo 2π    (6.41a)
p_{n+1} = p_n + k sin θ_{n+1}.  (standard map)    (6.41b)

We have added the requirement in (6.41a) that the value of the angle θ is restricted to be between zero and 2π.

Before we iterate (6.41), let us check that (6.41) represents a Hamiltonian system; that is, the area in q-p space is constant as n increases. (Here q corresponds to θ.) Suppose we start with a rectangle of points of length dq_n and dp_n. After one iteration, this rectangle will be deformed into a parallelogram of sides dq_{n+1} and dp_{n+1}. From (6.41) we have

dq_{n+1} = dq_n + dp_n    (6.42a)
dp_{n+1} = dp_n + k cos q_{n+1} dq_{n+1}.    (6.42b)

If we substitute (6.42a) in (6.42b), we obtain

dp_{n+1} = (1 + k cos q_{n+1}) dp_n + k cos q_{n+1} dq_n.    (6.43)
To find the area of the parallelogram, we take the magnitude of the cross product of the two vectors that form its sides, which from (6.42) and (6.43) are (dq_n, k cos q_{n+1} dq_n) and (dp_n, (1 + k cos q_{n+1}) dp_n). The result is dq_n dp_n, and hence the area in phase space has not changed. The standard map is an example of an area-preserving map.

The qualitative properties of the standard map are explored in Problem 6.19. You will find that for k = 0, the rod rotates with a fixed angular velocity determined by the momentum p_n = p_0 = constant. If p_0 is a rational number times 2π, then the trajectory in phase space consists of a sequence of isolated points lying on a horizontal line (resonant tori). Can you see why? If p_0 is not a rational number times 2π or if your computer does not have sufficient precision, then after a long time, the trajectory will consist of a horizontal line in phase space. As we increase k, these horizontal lines are deformed into curves that run from q = 0 to q = 2π, and the isolated points of the resonant tori are converted into closed loops. For some initial conditions, the trajectories will become chaotic after the transient behavior has ended.

Problem 6.19. The standard map

(a) Write a program to iterate the standard map and plot its trajectory in phase space (a minimal iteration loop is sketched after this problem). Use different colors so that several trajectories can be shown at the same time for the same value of the parameter k. Choose a set of initial conditions that form a rectangle (see Problem 4.10). Does the shape of this area change with time? What happens to the total area?

(b) Begin with k = 0 and choose an initial value of p that is a rational number times 2π. What types of trajectories do you obtain? If you obtain trajectories consisting of isolated points, do these points appear to shift due to numerical roundoff errors? How can you tell? What happens if p_0 is an irrational number times 2π? Remember that a computer can only approximate an irrational number.

(c) Consider k = 0.2 and explore the nature of the phase space trajectories. What structures appear that do not appear for k = 0? Discuss the motion of the rod corresponding to some of the typical trajectories that you find.

(d) Increase k until you first find several chaotic trajectories. How can you tell that they are chaotic? Do these chaotic trajectories fill all of phase space? If there is one trajectory that is chaotic at a particular value of k, are all trajectories chaotic? What is the approximate value for k_c above which chaotic trajectories appear?
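A minimal sketch of the iteration loop for Problem 6.19(a) (plain Java; console output in place of the phase space plot, with k = 0.8 and a few equally spaced initial momenta as arbitrary illustrative choices):

public class StandardMapSketch {
  public static void main(String[] args) {
    double k = 0.8;
    double twoPi = 2*Math.PI;
    for(int traj = 0; traj<8; traj++) { // several trajectories
      double theta = 1.0;
      double p = twoPi*traj/8.0; // equally spaced initial momenta
      for(int n = 0; n<500; n++) {
        theta = (theta+p)%twoPi; // eq. (6.41a)
        if(theta<0) {
          theta += twoPi; // keep theta in [0, 2 pi)
        }
        p = p+k*Math.sin(theta); // eq. (6.41b)
        System.out.println(traj+"\t"+theta+"\t"+p);
      }
    }
  }
}

Plotting p versus theta, with one color per value of traj, shows the mixture of smooth curves, loops, and chaotic regions described above.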
We now discuss a discrete map that models the rings of Saturn (see Fröyland). The model assumes that the rings of Saturn are due to perturbations produced by Mimas. There are two important forces acting on objects near Saturn. The force due to Saturn can be incorporated as follows. We know that each time Mimas completes an orbit, it traverses a total angle of 2π. Hence, the angle θ of any other moon of Saturn relative to Mimas can be expressed as

θ_{n+1} = θ_n + 2π (σ/r_n)^{3/2}   (6.44)

where r_n is the radius of the orbit after n revolutions, and σ = 185.7 × 10³ km is the mean distance of Mimas from Saturn. The other important force is due to Mimas and causes the radial distance r_n to change. A discrete approximation to the radial acceleration dv_r/dt is (see (3.16))

Δv_r/Δt ≈ [r(t + Δt) − 2r(t) + r(t − Δt)]/(Δt)².   (6.45)

The acceleration equals the radial force due to Mimas. If we average over a complete period, then a reasonable approximation for the change in r_n due to Mimas is

r_{n+1} − 2r_n + r_{n−1} = f(r_n, θ_n)   (6.46)

where f(r_n, θ_n) is proportional to the radial force. (We have absorbed the factor of (Δt)² and the mass into f.) In general, the form of f(r_n, θ_n) is very complicated. We make a major simplifying assumption and take f to be proportional to −(r_n − σ)^{−2} and to be periodic in θ_n. This form for the force incorporates the fact that for large r_n, the force has the usual form for the gravitational force. For simplicity, we express this periodicity in the simplest possible way, that is, as cos θ_n. We also want the map to be area conserving. These considerations lead to the following two-dimensional map:

θ_{n+1} = θ_n + 2π (σ/r_n)^{3/2}   (6.47a)
r_{n+1} = 2r_n − r_{n−1} − a cos θ_n/(r_n − σ)².   (6.47b)

The constant a for Saturn's rings is approximately 2 × 10¹² km³. We can show, using a similar technique as before, that the area in (r, θ) space is preserved, and hence (6.47) is a Hamiltonian map. The purpose of the above discussion was only to motivate, and not to derive, the form of the map (6.47). In Problem 6.20 we investigate how the map (6.47) yields the qualitative structure of Saturn's rings. In particular, what happens to the values of r_n if the period of a moon is related to the period of Mimas by the ratio of two integers?

Problem 6.20. A simple model of the rings of Saturn

(a) Write a program to implement the map (6.47). Be sure to save the last two values of r so that the values of r_n are updated correctly (see the sketch after this problem). The radius of Saturn is 60.4 × 10³ km. Express all lengths in units of 10³ km. In these units a = 2000. Plot the points (r_n cos θ_n, r_n sin θ_n). Choose initial values for r between the radius of Saturn and σ, the distance of Mimas from Saturn, and find the bands of r_n values where stable trajectories are found.

(b) What is the effect of changing the value of a? Try a = 200 and a = 20,000 and compare your results with part (a).

(c) Vary the force function. Replace cos θ by other trigonometric functions. How do your results change? If the changes are small, does that give you some confidence that the model has something to do with Saturn's rings?
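A minimal sketch of the update loop for Problem 6.20(a) follows. The key points are that r_{n−1} must be saved (here in rLast) and that (6.47b) uses θ_n before θ is advanced by (6.47a). The initial orbit radius and the number of iterations are illustrative assumptions; some initial values lead to unstable trajectories, which is the point of the problem.

import org.opensourcephysics.frames.PlotFrame;

public class SaturnMapApp {
  public static void main(String[] args) {
    PlotFrame frame = new PlotFrame("x", "y", "Model of Saturn's rings");
    double sigma = 185.7, a = 2000;          // lengths in units of 10^3 km
    double r = 120, rLast = 120, theta = 0;  // sample initial orbit (an assumption)
    for(int n = 0; n<5000; n++) {
      // (6.47b) uses the old theta and the last two values of r
      double rNew = 2*r-rLast-a*Math.cos(theta)/((r-sigma)*(r-sigma));
      theta += 2*Math.PI*Math.pow(sigma/r, 1.5);   // (6.47a) uses the old r
      rLast = r;
      r = rNew;
      frame.append(0, r*Math.cos(theta), r*Math.sin(theta));
    }
    frame.setVisible(true);
    frame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}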
A more realistic dynamical system is the double pendulum, a system that can be demonstrated in the laboratory. This system consists of two equal point masses m, with one suspended from a fixed support by a rigid weightless rod of length L and the other suspended from the first by a similar rod (see Figure 6.12). Because there is no friction, this system is an example of a Hamiltonian system.

Figure 6.12: The double pendulum.

The four rectangular coordinates x₁, y₁, x₂, and y₂ of the two masses can be expressed in terms of the two generalized coordinates θ₁, θ₂:

x₁ = L sin θ₁   (6.48a)
y₁ = 2L − L cos θ₁   (6.48b)
x₂ = L sin θ₁ + L sin θ₂   (6.48c)
y₂ = 2L − L cos θ₁ − L cos θ₂.   (6.48d)

The kinetic energy is given by

K = ½m(ẋ₁² + ẋ₂² + ẏ₁² + ẏ₂²) = ½mL²[2θ̇₁² + θ̇₂² + 2θ̇₁θ̇₂ cos(θ₁ − θ₂)],   (6.49)

and the potential energy is given by

U = mgL(3 − 2 cos θ₁ − cos θ₂).   (6.50)

For convenience, U has been defined so that its minimum value is zero. To use Hamilton's equations of motion (6.37), we need to express the sum of the kinetic energy and potential energy in terms of the generalized momenta and coordinates. In rectangular coordinates the momenta are equal to p_i = ∂K/∂q̇_i, where, for example, q_i = x₁ and p_i is the x-component of mv₁. This relation works for generalized momenta as well, and the generalized momentum corresponding to θ₁ is given by p₁ = ∂K/∂θ̇₁. If we calculate the appropriate derivatives, we can show that the generalized momenta can be written as

p₁ = mL²[2θ̇₁ + θ̇₂ cos(θ₁ − θ₂)]   (6.51a)
p₂ = mL²[θ̇₂ + θ̇₁ cos(θ₁ − θ₂)].   (6.51b)

The Hamiltonian or total energy becomes

H = [p₁² + 2p₂² − 2p₁p₂ cos(q₁ − q₂)] / {2mL²[1 + sin²(q₁ − q₂)]} + mgL(3 − 2 cos q₁ − cos q₂)   (6.52)

where q₁ = θ₁ and q₂ = θ₂. The equations of motion can be found by using (6.52) and (6.37).

Figure 6.13: Poincaré plot for the double pendulum with p₁ plotted versus q₁ for q₂ = 0 and p₂ > 0. Two sets of initial conditions, (q₁, q₂, p₁) = (0, 0, 0) and (1.1, 0, 0), respectively, were used to create the plot. The initial value of p₂ is found from (6.52) by requiring that E = 15.

Figure 6.13 shows a Poincaré map for the double pendulum. The coordinate p₁ is plotted versus q₁ for the same total energy E = 15, but for two different initial conditions. The map includes the points in the trajectory for which q₂ = 0 and p₂ > 0. Note the resemblance between Figure 6.13 and plots for the standard map above the critical value of k; that is, there is a regular trajectory and a chaotic trajectory for the same parameters, but different initial conditions.

Problem 6.21. Double pendulum

(a) Use either the fourth-order Runge–Kutta algorithm (with Δt = 0.003) or the second-order Euler–Richardson algorithm (with Δt = 0.001) to simulate the double pendulum (a sketch of one implementation follows this problem). Choose m = 1, L = 1, and g = 9.8. The input parameter is the total energy E. The initial values of q₁ and q₂ can be chosen either randomly within the interval |q_i| < π or by the user. Then set the initial p₁ = 0 and solve for p₂ using (6.52) with H = E. First explore the pendulum's behavior by plotting the generalized coordinates and momenta as a function of time in four windows. Consider the energies E = 1, 5, 10, 15, and 40. Try a few initial conditions for each value of E. Visually determine whether the steady state behavior is regular or appears to be chaotic. Are there some values of E for which all the trajectories appear regular? Are there values of E for which all trajectories appear chaotic? Are there values of E for which both types of trajectories occur?

(b) Repeat part (a), but plot the phase space diagrams p₁ versus q₁ and p₂ versus q₂. Are these plots more useful for determining the nature of the trajectories than those drawn in part (a)?

(c) Draw the Poincaré plot with p₁ plotted versus q₁ only when q₂ = 0 and p₂ > 0. Overlay trajectories from different initial conditions but with the same total energy on the same plot. Duplicate the plot shown in Figure 6.13. Then produce Poincaré plots for the values of E given in part (a) with at least five different initial conditions for each energy. Describe the different types of behavior.

(d) Is there a critical value of the total energy at which some chaotic trajectories first occur?

(e) Animate the double pendulum, showing the two masses moving back and forth. Describe how the motion of the pendulum is related to the behavior of the Poincaré plot.
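Below is a sketch of one way to generate points for a plot like Figure 6.13, assuming the Open Source Physics ODE interface and RK4 solver introduced in Chapter 3. The rate equations are Hamilton's equations obtained by differentiating (6.52) with m = L = 1; the initial value of p₂ was worked out by hand so that E ≈ 15 for (q₁, q₂, p₁) = (1.1, 0, 0). The crossing test is a simple sign change in q₂ and does not interpolate onto the section.

import org.opensourcephysics.numerics.ODE;
import org.opensourcephysics.numerics.RK4;

public class DoublePendulumPoincareApp implements ODE {
  double g = 9.8;                           // m = L = 1
  double[] state = {1.1, 0, 0, 2.7746, 0};  // {q1, q2, p1, p2, t}; p2 gives E ~ 15

  public double[] getState() {
    return state;
  }

  public void getRate(double[] s, double[] rate) {
    double q1 = s[0], q2 = s[1], p1 = s[2], p2 = s[3];
    double cos = Math.cos(q1-q2), sin = Math.sin(q1-q2);
    double den = 1+sin*sin;
    double A = p1*p2*sin/den;
    double B = (p1*p1+2*p2*p2-2*p1*p2*cos)*sin*cos/(den*den);
    rate[0] = (p1-p2*cos)/den;        // dq1/dt = dH/dp1
    rate[1] = (2*p2-p1*cos)/den;      // dq2/dt = dH/dp2
    rate[2] = -2*g*Math.sin(q1)-A+B;  // dp1/dt = -dH/dq1
    rate[3] = -g*Math.sin(q2)+A-B;    // dp2/dt = -dH/dq2
    rate[4] = 1;                      // time
  }

  public static void main(String[] args) {
    DoublePendulumPoincareApp ode = new DoublePendulumPoincareApp();
    RK4 solver = new RK4(ode);
    solver.setStepSize(0.003);
    double q2Old = ode.state[1];
    for(int i = 0; i<1000000; i++) {
      solver.step();
      // record a Poincare point when q2 crosses zero with p2 > 0
      if(q2Old<0&&ode.state[1]>=0&&ode.state[3]>0) {
        System.out.println(ode.state[0]+" "+ode.state[2]);  // (q1, p1)
      }
      q2Old = ode.state[1];
    }
  }
}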
Hamiltonian chaos has important applications to physical systems such as the solar system, the motion of galaxies, and plasmas. It also has helped us understand the foundation of statistical mechanics. One of the most fascinating applications has been to quantum mechanics, which has its roots in the Hamiltonian formulation of classical mechanics. A current area of interest is the quantum analogue of classical Hamiltonian chaos. The meaning of this analogue is not obvious, because well-defined trajectories do not exist in quantum mechanics. Moreover, Schrödinger's equation is linear and can be shown to have only periodic and quasiperiodic solutions.

6.10 Perspective

As the many books and review articles on chaos can attest, it is impossible to discuss all aspects of chaos in a single chapter. We will revisit chaotic systems in Chapter 13, where we introduce the concept of fractals. We will find that one of the characteristics of chaotic dynamics is that the resulting attractors often have an intricate geometrical structure. The most general ideas that we have discussed in this chapter are that simple systems can exhibit complex behavior and that chaotic systems exhibit extreme sensitivity to initial conditions. We have also learned that computers allow us to explore the behavior of dynamical systems and visualize the numerical output. However, the simulation of a system does not automatically lead to understanding. If you are interested in learning more about the phenomena of chaos and the associated theory, the suggested readings at the end of the chapter are a good place to start. We also invite you to explore chaotic phenomena in more detail in the following projects.

6.11 Projects

The first several projects are on various aspects of the logistic map. These projects do not exhaust the possible investigations of the properties of the logistic map.

Project 6.22. A more accurate determination of δ and α

We have seen that it is difficult to determine δ accurately by finding the sequence of values of b_k at which the trajectory bifurcates for the kth time. A better way to determine δ is to compute it from the sequence s_m of superstable trajectories of period 2^{m−1}. We already have found that s₁ = 1/2, s₂ ≈ 0.80902, and s₃ ≈ 0.87464. The parameters s₁, s₂, ... can be computed directly from the equation

f^{(2^{m−1})}(x = 1/2) = 1/2.   (6.53)

For example, s₂ satisfies the relation f^{(2)}(x = 1/2) = 1/2. This relation, together with the analytic form for f^{(2)}(x) given in (6.7), yields

8r²(1 − r) − 1 = 0.   (6.54)

If we wish to solve (6.54) numerically for r = s₂, we need to be careful not to find the irrelevant solutions corresponding to a lower period. In this case we can factor out the solution r = 1/2 and solve the resulting quadratic equation analytically to find s₂ = (1 + √5)/4. Clearly r = s₁ = 1/2 solves (6.54) with period 1, because from (6.53), f^{(1)}(x = 1/2) = 4r(1/2)(1 − 1/2) = r, which equals 1/2 only for r = 1/2.

(a) It is straightforward to adapt the bisection method discussed in Section 6.6. Adapt the class RecursiveFixedPointApp to find the numerical solutions of (6.53); a minimal version of the idea is sketched after this project. Good starting values for the left-most and right-most values of r that bracket s_{m+1} are easy to obtain. The right-most value is r = r_∞ ≈ 0.8925. If we already know the sequence s₁, s₂, ..., s_m, then we can determine δ by

δ_m = (s_{m−1} − s_{m−2})/(s_m − s_{m−1}).   (6.55)

We use this determination of δ_m to find the left-most value of r:

r^{(m+1)}_left = s_m + (s_m − s_{m−1})/δ_m.   (6.56)

We choose the desired precision to be 10⁻¹⁶. A summary of our results is given in Table 6.2. Verify these results and determine δ.

(b) Use your values of s_m to obtain a more accurate determination of α and δ.

m   Period   s_m
1   1        0.500 000 000
2   2        0.809 016 994
3   4        0.874 640 425
4   8        0.888 660 216
5   16       0.891 666 845
6   32       0.892 310 883
7   64       0.892 448 823
8   128      0.892 478 091

Table 6.2: Values of the control parameter s_m for the superstable trajectories of period 2^{m−1}. Nine decimal places are shown.
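A minimal sketch of the idea behind part (a) follows: bisect on g(r) = f^{(2^{m−1})}(1/2) − 1/2. The bracket below was chosen by hand for m = 2 and would be replaced by the values from (6.55) and (6.56) for larger m; the class name is illustrative.

public class SuperstableApp {
  static double iterate(double r, int n) {  // returns f^(n)(1/2) for the logistic map
    double x = 0.5;
    for(int i = 0; i<n; i++) {
      x = 4*r*x*(1-x);
    }
    return x;
  }

  public static void main(String[] args) {
    int m = 2;                         // find s_2, the superstable value of period 2
    int n = 1<<(m-1);                  // number of iterations of f
    double left = 0.75, right = 0.85;  // bracket chosen by hand (an assumption)
    for(int i = 0; i<60; i++) {        // bisect on g(r) = f^(n)(1/2) - 1/2
      double mid = 0.5*(left+right);
      double gLeft = iterate(left, n)-0.5;
      double gMid = iterate(mid, n)-0.5;
      if(gLeft*gMid<=0) right = mid; else left = mid;
    }
    System.out.println("s_"+m+" ~ "+0.5*(left+right));  // (1+sqrt(5))/4 = 0.809017...
  }
}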
Project 6.23. From chaos to order

The bifurcation diagram of the logistic map (see Figure 6.2) has many interesting features that we have not explored. For example, you might have noticed that there are several smooth dark bands in the chaotic region for r > r_∞. Use BifurcateApp to generate the bifurcation diagram for r_∞ ≤ r ≤ 1. If we start at r = 1.0 and decrease r, we see that there is a band that narrows and eventually splits into two parts at r ≈ 0.9196. If you look closely, you will see that the band splits into four parts at r ≈ 0.899. If you look even more closely, you will see many more bands. What type of change occurs near the splitting (merging) of these bands? Use IterateMap to look at the time series of (6.5) for r = 0.9175. You will notice that although the trajectory looks random, it oscillates back and forth between two bands. This behavior can be seen more clearly if you look at the time series of x_{n+1} = f^{(2)}(x_n). A detailed discussion of the splitting of the bands can be found in Peitgen et al.

Project 6.24. Calculation of the Lyapunov spectrum

In Section 6.5 we discussed the calculation of the Lyapunov exponent for the logistic map. If a dynamical system has a multidimensional phase space, for example, the Hénon map and the Lorenz model, there is a set of Lyapunov exponents, called the Lyapunov spectrum, that characterize the divergence of the trajectory. As an example, consider a set of initial conditions that forms a filled sphere in phase space for the (three-dimensional) Lorenz model. If we iterate the Lorenz equations, then the set of phase space points will deform into another shape. If the system has a fixed point, this shape contracts to a single point. If the system is chaotic, then the sphere will typically diverge in one direction but become smaller in the other two directions. In this case we can define three Lyapunov exponents to measure the deformation in three mutually perpendicular directions. These three directions generally will not correspond to the axes of the original variables. Instead, we must use a Gram–Schmidt orthogonalization procedure. The algorithm for finding the Lyapunov spectrum is as follows (a sketch of the orthogonalization step follows this project):

(i) Linearize the dynamical equations. If r is the f-component vector containing the dynamical variables, then define Δr as the linearized difference vector. For example, the linearized Lorenz equations are

dΔx/dt = −σΔx + σΔy   (6.57a)
dΔy/dt = −xΔz − zΔx + rΔx − Δy   (6.57b)
dΔz/dt = xΔy + yΔx − bΔz.   (6.57c)

(ii) Define f orthonormal initial values for Δr, for example, Δr₁(0) = (1, 0, 0), Δr₂(0) = (0, 1, 0), and Δr₃(0) = (0, 0, 1). Because these vectors appear in a linearized equation, they do not have to be small in magnitude.

(iii) Iterate the original and linearized equations of motion. One iteration yields a new vector from the original equation of motion and f new vectors Δr_α from the linearized equations.

(iv) Find the orthonormal vectors Δr′_α from the Δr_α using the Gram–Schmidt procedure. That is,

Δr′₁ = Δr₁/|Δr₁|   (6.58a)
Δr′₂ = [Δr₂ − (Δr′₁ · Δr₂)Δr′₁] / |Δr₂ − (Δr′₁ · Δr₂)Δr′₁|   (6.58b)
Δr′₃ = [Δr₃ − (Δr′₁ · Δr₃)Δr′₁ − (Δr′₂ · Δr₃)Δr′₂] / |Δr₃ − (Δr′₁ · Δr₃)Δr′₁ − (Δr′₂ · Δr₃)Δr′₂|.   (6.58c)

It is straightforward to generalize the method to higher-dimensional models.

(v) Set the Δr_α(t) equal to the orthonormal vectors Δr′_α(t).

(vi) Accumulate the running sum S_α as S_α → S_α + log|Δr_α(t)|, where |Δr_α(t)| is the norm of the orthogonalized vector before it is normalized in step (v), that is, the denominator in (6.58).

(vii) Repeat steps (iii)–(vi) and periodically output the approximate Lyapunov exponents λ_α = (1/n)S_α, where n is the number of iterations. To obtain a result for the Lyapunov spectrum that represents the steady state attractor, include only data after the transient behavior has ended.

(a) Compute the Lyapunov spectrum for the Lorenz model for σ = 16, b = 4, and r = 45.92. Try other values of the parameters and compare your results.

(b) Linearize the equations for the Hénon map and find the Lyapunov spectrum for a = 1.4 and b = 0.3 in (6.32).
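Here is a sketch of steps (iv)–(vi) for f = 3, assuming the tangent vectors and the running sums are stored in plain arrays. The projections use the already normalized earlier vectors, as in (6.58), and the log of each norm is accumulated before the vector is rescaled to unit length.

public class GramSchmidtStep {
  // dr is a 3x3 array: dr[a] is tangent vector alpha; sum[a] accumulates S_alpha
  public static void orthogonalizeAndAccumulate(double[][] dr, double[] sum) {
    for(int a = 0; a<3; a++) {
      for(int b = 0; b<a; b++) {  // subtract projections on earlier (unit) vectors
        double dot = 0;
        for(int i = 0; i<3; i++) dot += dr[b][i]*dr[a][i];
        for(int i = 0; i<3; i++) dr[a][i] -= dot*dr[b][i];
      }
      double norm = 0;
      for(int i = 0; i<3; i++) norm += dr[a][i]*dr[a][i];
      norm = Math.sqrt(norm);
      sum[a] += Math.log(norm);   // step (vi): use the pre-normalization norm
      for(int i = 0; i<3; i++) dr[a][i] /= norm;  // step (v): reset to unit vectors
    }
  }

  public static void main(String[] args) {
    double[][] dr = {{1, 0, 0}, {0, 2, 0}, {1, 1, 1}};  // sample tangent vectors
    double[] sum = new double[3];
    orthogonalizeAndAccumulate(dr, sum);
    System.out.println(sum[0]+" "+sum[1]+" "+sum[2]);   // log 1, log 2, log 1
  }
}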
Project 6.25. A spinning magnet

Consider a compass needle that is free to rotate in a periodically reversing magnetic field that is perpendicular to the axis of the needle. The equation of motion of the needle is given by

d²φ/dt² = −(μB₀/I) cos ωt sin φ   (6.59)

where φ is the angle of the needle with respect to a fixed axis along the field, μ is the magnetic moment of the needle, I its moment of inertia, and B₀ and ω are the amplitude and the angular frequency of the magnetic field, respectively. Choose an appropriate numerical method for solving (6.59), and plot the Poincaré map at times t = 2πn/ω. Verify that if the parameter λ = 2B₀μ/(Iω²) > 1, then the motion of the needle is chaotic. Briggs (see references) discusses how to construct the corresponding laboratory system and other nonlinear physical systems.

Project 6.26. Billiard models

Consider a two-dimensional planar geometry in which a particle moves with constant velocity along straight line orbits until it elastically reflects off the boundary. This straight line motion occurs in various "billiard" systems. A simple example of such a system is a particle moving with fixed speed within a circle. For this geometry the angle between the particle's momentum and the tangent to the boundary at a reflection is the same for all points.

Suppose that we divide the circle into two equal parts and connect them by straight lines of length L as shown in Figure 6.14a. This geometry is called a stadium billiard. How does the motion of a particle in the stadium compare to the motion in the circle? In both cases we can find the trajectory of the particle by geometrical considerations. The stadium billiard model and a similar geometry known as the Sinai billiard model (see Figure 6.14b) have been used as model systems for exploring the foundations of statistical mechanics. There is also much interest in relating the behavior of a classical particle in various billiard models to the solution of Schrödinger's equation for the same geometries.

Figure 6.14: (a) Geometry of the stadium billiard model. (b) Geometry of the Sinai billiard model.

(a) Write a program to simulate the stadium billiard model. Use the radius r of the semicircles as the unit of length. The algorithm for determining the path of the particle is as follows (a sketch of the reflection step follows this list):

(i) Begin with an initial position (x₀, y₀) and momentum (p_{x0}, p_{y0}) of the particle such that |p₀| = 1.

(ii) Determine which of the four sides the particle will hit. The possibilities are the top and bottom line segments and the right and left semicircles.

(iii) Calculate the next position of the particle from the intersection of the straight line defined by the current position and momentum, and the equation for the segment where the next reflection occurs.

(iv) Determine the new momentum (p′_x, p′_y) of the particle after reflection such that the angle of incidence equals the angle of reflection. For reflection off the line segments we have (p′_x, p′_y) = (p_x, −p_y). For reflection off a circle we have

p′_x = [y² − (x − x_c)²] p_x − 2(x − x_c)y p_y   (6.60a)
p′_y = −2(x − x_c)y p_x + [(x − x_c)² − y²] p_y   (6.60b)

where (x_c, 0) is the center of the circle. (Note that the unprimed momentum p_x, rather than p′_x, appears on the right-hand side of (6.60b). Remember that all lengths are scaled by the radius of the circle.)

(v) Repeat steps (ii)–(iv).
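A sketch of the circular reflection in step (iv) follows; the method implements (6.60), which is the mirror reflection p′ = p − 2(p · n̂)n̂ written out for the unit normal n̂ = (x − x_c, y) on a circle of unit radius. The class and method names are illustrative.

public class CircleReflection {
  // returns the reflected momentum for a collision at (x, y) on a unit circle
  // centered at (xc, 0); implements (6.60)
  public static double[] reflect(double x, double y, double xc, double px, double py) {
    double c = x-xc;  // normal components (c, y); unit length because r = 1
    return new double[] {
      (y*y-c*c)*px-2*c*y*py,   // (6.60a)
      -2*c*y*px+(c*c-y*y)*py   // (6.60b)
    };
  }

  public static void main(String[] args) {
    double[] p = reflect(1.0, 0.0, 0.0, 1.0, 0.3);  // hit at (1, 0), normal along x
    System.out.println(p[0]+" "+p[1]);              // expect (-1.0, 0.3)
  }
}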
(b) Determine if the particle dynamics is chaotic by estimating the largest Lyapunov exponent. One way to do so is to start two particles with almost identical positions and/or momenta (varying by say 10⁻⁵). Compute the difference Δs of the two phase space trajectories as a function of the number of reflections n, where Δs is defined by

Δs = [|r₁ − r₂|² + |p₁ − p₂|²]^{1/2}.   (6.61)

Choose L = 1 and r = 1. The Lyapunov exponent can be found from a semilog plot of Δs versus n. Repeat your calculation for different initial conditions, and average your values of Δs before plotting. Repeat the calculation for L = 0.5 and 2.0 and determine if your results depend on L.

(c) Another test for the existence of chaos is the reversibility of the motion. Reverse the momentum after the particle has made n reflections, and let the drawing color equal the background color so that the path can be erased. What limitation does roundoff error place on your results? Repeat this simulation for L = 1 and L = 0.

(d) Place a small hole of diameter d in one of the circular sections of the stadium so that the particle can escape. Choose L = 1 and set d = 0.02. Give the particle a random position and momentum, and record the time when the particle escapes through the hole. Repeat for at least 10⁴ particles and compute the fraction of particles S(n) remaining after a given number of reflections n. The function S(n) will decay with n. Determine the functional dependence of S on n, and calculate the characteristic decay time if S(n) decays exponentially. Repeat for L = 0.1, 0.5, and 2.0. Is the decay time a function of L? Does S(n) decay exponentially for the circular billiard model (L = 0) (see Bauer and Bertsch)?

(e) Choose an arbitrary initial position for the particle in a stadium with L = 1 and a small hole as in part (d). Choose at least 5000 values of the initial value p_{x0} uniformly distributed between 0 and 1. Choose p_{y0} so that |p| = 1. Plot the escape time versus p_{x0} and describe the visual pattern of the trajectories. Then choose 5000 values of p_{x0} in a smaller interval centered about the value of p_{x0} for which the escape time was greatest. Plot these values of the escape time versus p_{x0}. Do you see any evidence of self-similarity?

(f) Repeat steps (a)–(e) for the Sinai billiard geometry.
Project 6.27. The circle map and mode locking

The driven, damped pendulum can be approximated by a one-dimensional difference equation for a range of amplitudes and frequencies of the driving force. This difference equation is known as the circle map and is given by

θ_{n+1} = θ_n + Ω − (K/2π) sin 2πθ_n.  (modulo 1)   (6.62)

The variable θ represents an angle, and Ω represents a frequency ratio, the ratio of the natural frequency of the pendulum to the frequency of the periodic driving force. The parameter K is a measure of the strength of the nonlinear coupling of the pendulum to the external force. An important quantity is the winding number, which is defined as

W = lim_{m→∞} (1/m) Σ_{n=0}^{m−1} Δθ_n   (6.63)

where Δθ_n = Ω − (K/2π) sin 2πθ_n.

(a) Consider the linear case K = 0. Choose Ω = 0.4 and θ₀ = 0.2 and determine W (a sketch of the computation follows this project). Verify that if Ω is a ratio of two integers, then W = Ω and the trajectory is periodic. What is the value of W if Ω = √2/2, an irrational number? Verify that W = Ω and that the trajectory comes arbitrarily close to any particular value of θ. Does θ_n ever return exactly to its initial value? This type of behavior of the trajectory is termed quasiperiodic.

(b) For K > 0, we will find that W ≠ Ω and that W "locks" into rational frequency ratios for a range of values of K and Ω. This type of behavior is called mode locking. For K < 1, the trajectory is either periodic or quasiperiodic. Determine the value of W for K = 1/2 and values of Ω in the range 0 < Ω ≤ 1. The widths in Ω of the various mode-locked regions where W is fixed increase with K. Consider other values of K, and draw a diagram in the K-Ω plane (0 ≤ K, Ω ≤ 1) so that those areas corresponding to frequency locking are shaded. These shaded regions are called Arnold tongues.

(c) For K = 1, all trajectories are frequency-locked periodic trajectories. Fix K at K = 1 and determine the dependence of W on Ω. The plot of W versus Ω for K = 1 is called the Devil's staircase.
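A minimal sketch for part (a) follows, assuming that m is large enough for the limit in (6.63) to be approximated by a finite sum. Note that the winding number is accumulated from the unwrapped increments Δθ_n, while θ itself is wrapped modulo 1 as in (6.62).

public class CircleMapApp {
  public static void main(String[] args) {
    double K = 0, Omega = 0.4, theta = 0.2;  // set K > 0 to explore mode locking
    int m = 100000;
    double sum = 0;
    for(int n = 0; n<m; n++) {
      double dTheta = Omega-K/(2*Math.PI)*Math.sin(2*Math.PI*theta);
      sum += dTheta;                 // accumulate the unwrapped increment
      theta = (theta+dTheta)%1.0;    // (6.62), modulo 1
      if(theta<0) theta += 1.0;
    }
    System.out.println("W ~ "+sum/m);
  }
}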
Project 6.28. Chaotic scattering

In Chapter 5 we discussed the classical scattering of particles off a fixed target and found that the differential cross section for a variety of interactions is a smoothly varying function of the scattering angle. That is, a small change in the impact parameter b leads to a small change in the scattering angle θ. Here we consider examples where small changes in b lead to large changes in θ. Such a phenomenon is called chaotic scattering because of the sensitivity to initial conditions that is characteristic of chaos. The study of chaotic scattering is relevant to the design of electronic nanostructures, because many experimental structures exhibit this type of scattering.

A typical scattering model consists of a target composed of a group of fixed hard disks and a scatterer consisting of a point particle. The goal is to compute the path of the scatterer as it bounces off the disks and to measure θ and the time of flight as a function of the impact parameter b. If a particle bounces inside the target region before leaving, the time of flight can be very long. There are even some trajectories for which the particle never leaves the target region. Because it is difficult to monitor a trajectory that bounces back and forth between the hard disks, we consider instead a two-dimensional map that contains the key features of chaotic scattering (see Yalcinkaya and Lai for further discussion). The map is given by

x_{n+1} = a[x_n − ¼(x_n + y_n)²]   (6.64a)
y_{n+1} = (1/a)[y_n + ¼(x_n + y_n)²]   (6.64b)

where a is a parameter. The target region is centered at the origin. In an actual scattering experiment, the relation between (x_{n+1}, y_{n+1}) and (x_n, y_n) would be much more complicated, but the map (6.64) captures most of the important features of realistic chaotic scattering experiments. The iteration number n is analogous to the number of collisions of the scattered particle off the disks. When x_n or y_n is significantly different from zero, the scatterer has left the target region.

(a) Write a program to iterate the map (6.64); a minimal version is sketched after this project. Let a = 8.0 and y₀ = −0.3. Choose 10⁴ initial values of x₀ uniformly distributed in the interval 0 < x₀ < 0.1. Determine the time T(x₀), the number of iterations until x_n ≤ −5.0. After this time, x_n rapidly moves to −∞. Plot T(x₀) versus x₀. Then choose 10⁴ initial values in a smaller interval centered about a value of x₀ for which T(x₀) > 7. Plot these values of T(x₀) versus x₀. Do you see any evidence of self-similarity?

(b) A trajectory is said to be uncertain if a small change ε in x₀ leads to a change in T(x₀). We expect that the number of uncertain trajectories N(ε) will depend on a power of ε; that is, N ∼ ε^α. Determine N(ε) for ε = 10^{−p} with p = 2 to 7 using the values of x₀ in part (a). Then determine the uncertainty dimension 1 − α from a log-log plot of N versus ε. Repeat these measurements for other values of a. Does α depend on a?

(c) Choose 4 × 10⁴ initial conditions in the same interval as in part (a) and determine the number of trajectories S(n) that have not yet reached x_n = −5 as a function of the number of iterations n. Plot ln S(n) versus n and determine if the decay is exponential. It is possible to obtain algebraic decay for values of a less than approximately 6.5.

(d) Let a = 4.1 and choose 100 initial conditions uniformly distributed in the region 1.0 < x₀ < 1.05 and 0.60 < y₀ < 0.65. Are there any trajectories that are periodic and hence have infinite escape times? Due to the accumulation of roundoff error, it is possible to find only finite, but very long, escape times. These periodic trajectories form closed curves, and the regions enclosed by them are called KAM surfaces.
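The following is a minimal sketch of the escape-time computation in part (a). The cap on the number of iterations guards against the (nearly) trapped trajectories of part (d) and is an arbitrary choice.

public class ScatterApp {
  public static void main(String[] args) {
    double a = 8.0, y0 = -0.3;
    int trials = 10000;
    for(int trial = 0; trial<trials; trial++) {
      double x0 = 0.1*(trial+1)/trials;  // x0 uniform in (0, 0.1]
      double x = x0, y = y0;
      int T = 0;
      while(x>-5.0&&T<1000) {   // iterate (6.64) until the scatterer escapes
        double s = 0.25*(x+y)*(x+y);
        double xNew = a*(x-s);  // (6.64a)
        y = (y+s)/a;            // (6.64b), using the old x and y through s
        x = xNew;
        T++;
      }
      System.out.println(x0+" "+T);  // plot T(x0) versus x0
    }
  }
}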
Project 6.29. Chemical reactions

In Project 4.17 we discussed how chemical oscillations can occur when the reactants are continuously replenished. In this project we introduce a set of chemical reactions that exhibits the period doubling route to chaos. Consider the following reactions (see Peng et al.):

P → A   (6.65a)
P + C → A + C   (6.65b)
A → B   (6.65c)
A + 2B → 3B   (6.65d)
B → C   (6.65e)
C → D.   (6.65f)

Each of the above reactions has an associated rate constant. The time dependence of the concentrations of A, B, and C is given by

dA/dt = k₁P + k₂PC − k₃A − k₄AB²   (6.66a)
dB/dt = k₃A + k₄AB² − k₅B   (6.66b)
dC/dt = k₅B − k₆C.   (6.66c)

We assume that P is held constant by replenishment from an external source. We also assume the chemicals are well mixed so that there is no spatial dependence. In Section 7.8 we discuss the effects of spatial inhomogeneities due to molecular diffusion. Equations (6.66) can be written in the dimensionless form

dX/dτ = c₁ + c₂Z − X − XY²   (6.67a)
c₃ dY/dτ = X + XY² − Y   (6.67b)
c₄ dZ/dτ = Y − Z   (6.67c)

where the c_i are constants, τ = k₃t, and X, Y, and Z are proportional to A, B, and C, respectively.

(a) Write a program to solve the coupled differential equations in (6.67). Use a fourth-order Runge–Kutta algorithm with an adaptive step size. Plot ln Y versus the time τ.

(b) Set c₁ = 10, c₃ = 0.005, and c₄ = 0.02. The constant c₂ is the control parameter. Consider c₂ = 0.10 to 0.16 in steps of 0.005. What is the period of ln Y for each value of c₂?

(c) Determine the values of c₂ at which the period doublings occur for as many period doublings as you can determine. Compute the constant δ [see (6.9)] and compare its value to the value of δ for the logistic map.

(d) Make a bifurcation diagram by taking the values of ln Y from the Poincaré plot at X = Z and plotting them versus the control parameter c₂. Do you see a sequence of period doublings?

(e) Use three-dimensional graphics to plot the trajectory of (6.67) with ln X, ln Y, and ln Z as the three axes. Describe the attractors for some of the cases considered in part (a).

Appendix 6A: Stability of the Fixed Points of the Logistic Map

In the following, we derive analytic expressions for the fixed points of the logistic map. The fixed-point condition is given by

x* = f(x*).   (6.68)

From (6.5) this condition yields the two fixed points

x* = 0  and  x* = 1 − 1/(4r).   (6.69)

Because x is restricted to be positive, the only fixed point for r < 1/4 is x = 0. To determine the stability of x*, we let

x_n = x* + ε_n   (6.70a)

and

x_{n+1} = x* + ε_{n+1}.   (6.70b)

Because |ε_n| ≪ 1, we have

x_{n+1} = f(x* + ε_n) ≈ f(x*) + ε_n f′(x*) = x* + ε_n f′(x*).   (6.71)

If we compare (6.70b) and (6.71), we obtain

ε_{n+1}/ε_n = f′(x*).   (6.72)

If |f′(x*)| > 1, the trajectory will diverge from x* because |ε_{n+1}| > |ε_n|. The opposite is true for |f′(x*)| < 1. Hence, the local stability criteria for a fixed point x* are

1. |f′(x*)| < 1, x* is stable;
2. |f′(x*)| = 1, x* is marginally stable;
3. |f′(x*)| > 1, x* is unstable.

If x* is marginally stable, the second derivative f″(x) must be considered, and the trajectory approaches x* with deviations from x* inversely proportional to the square root of the number of iterations. For the logistic map, the derivatives at the fixed points are, respectively,

f′(x = 0) = d/dx [4rx(1 − x)] |_{x=0} = 4r   (6.73)

and

f′(x = x*) = d/dx [4rx(1 − x)] |_{x=1−1/4r} = 2 − 4r.   (6.74)

It is straightforward to use (6.73) and (6.74) to find the range of r for which x* = 0 and x* = 1 − 1/4r are stable.

If a trajectory has period two, then f^{(2)}(x) = f(f(x)) has two fixed points. If you are interested, you can solve for these fixed points analytically. As we found in Problem 6.2, these two fixed points become unstable at the same value of r. We can derive this property of the fixed points using the chain rule of differentiation:

d/dx f^{(2)}(x) |_{x=x₀} = d/dx f(f(x)) |_{x=x₀} = f′(f(x₀)) f′(x₀).   (6.75)

If we substitute x₁ = f(x₀), we can write

d/dx f(f(x)) |_{x=x₀} = f′(x₁) f′(x₀).   (6.76)

In the same way, we can show that

d/dx f^{(2)}(x) |_{x=x₁} = f′(x₀) f′(x₁).   (6.77)

We see that if x₀ becomes unstable, then |f^{(2)}′(x₀)| > 1, as does |f^{(2)}′(x₁)|. Hence, x₁ is also unstable at the same value of r, and we conclude that both fixed points of f^{(2)}(x) bifurcate at the same value of r, leading to a trajectory of period 4.

From (6.74) we see that f′(x = x*) = 0 when r = 1/2 and x* = 1/2. Such a fixed point is said to be superstable because, as we found in Problem 6.4, convergence to the fixed point is relatively rapid. Superstable trajectories occur whenever one of the fixed points is at x* = 1/2.
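The stability criterion is easy to verify numerically. The following sketch iterates the logistic map near x* for r = 0.4, where (6.74) gives f′(x*) = 0.4, and prints the ratio ε_{n+1}/ε_n of (6.72); the class name and the size of the initial deviation are illustrative choices.

public class FixedPointCheck {
  public static void main(String[] args) {
    double r = 0.4;
    double xStar = 1-1/(4*r);  // the fixed point (6.69); here x* = 0.375
    double x = xStar+1e-4;     // start close to the fixed point
    double epsOld = x-xStar;
    for(int n = 0; n<5; n++) {
      x = 4*r*x*(1-x);
      double eps = x-xStar;
      // each ratio should be close to f'(x*) = 2 - 4r = 0.4
      System.out.println("eps(n+1)/eps(n) = "+eps/epsOld+", f'(x*) = "+(2-4*r));
      epsOld = eps;
    }
  }
}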
Appendix 6B: Finding the Roots of a Function

The roots of a function f(x) are the values of the variable x for which the function f(x) is zero. Even an apparently simple equation such as

f(x) = tan x − x − c = 0   (6.78)

where c is a constant cannot be solved analytically for x.

Regardless of the function and the approach to root finding, the first step should be to learn as much as possible about the function. For example, plotting the function will help us to determine the approximate locations of the roots.

Newton's (or the Newton–Raphson) method is based on replacing the function by the first two terms of the Taylor expansion of f(x) about the root x. If our initial guess for the root is x₀, we can write f(x) ≈ f(x₀) + (x − x₀)f′(x₀). If we set f(x) equal to zero and solve for x, we find x = x₀ − f(x₀)/f′(x₀). If we have made a good choice for x₀, the resultant value of x should be closer to the root than x₀. The general procedure is to calculate successive approximations as follows:

x_{n+1} = x_n − f(x_n)/f′(x_n).   (6.79)

If this series converges, it converges very quickly. However, if the initial guess is poor or if the function has closely spaced multiple roots, the series may not converge. The successive iterations of Newton's method are another example of a map. Newton's method also works with complex functions, as we will see in the following problem.

Problem 6.30. Cube roots

Consider the function f(z) = z³ − 1, where z = x + iy, and f′(z) = 3z². Map the range of convergence of (6.79) in the region [−2 < x < 2, −2 < y < 2] of the complex plane. Color the starting value of z red, green, or blue depending on the root to which the initial guess converges. If the trajectory does not converge, color the starting point black. For more insight, add a mouse handler to your program so that if you click on your plot, the sequence of iterations starting from the point where you clicked will be shown.

The following problem discusses a situation that typically arises in courses on quantum mechanics.

Problem 6.31. Energy levels in a finite square well

The quantum mechanical energy levels in the one-dimensional finite square well can be found by solving the relation

tan ε = √(ρ² − ε²)   (6.80)

where ε = √(mEa²/2ℏ²) and ρ = √(mV₀a²/2ℏ²) are defined in terms of the particle mass m, the particle energy E, the width of the well a, and the depth of the well V₀. The function tan ε has zeros at ε = 0, π, 2π, ... and asymptotes at ε = π/2, 3π/2, 5π/2, ... The function √(ρ² − ε²) is a quarter circle of radius ρ. Write a program to plot these two functions with ρ = 3, and then use Newton's method to determine the roots of (6.80); a sketch of the iteration is given after Listing 6.6. Find the value of ρ, and thus V₀, such that below this value there is only one energy level and above this value there is more than one. At what value of ρ do three energy levels first appear?

In Section 6.6 we introduced the bisection root-finding algorithm. This algorithm is implemented in the Root class in the numerics package. It can be used with any function.

Listing 6.6: The bisection method defined in the Root class in the numerics package.

public static double bisection(final Function f, double x1, double x2, final double tolerance) {
   int count = 0;
   int maxCount = (int) (Math.log(Math.abs(x2-x1)/tolerance)/Math.log(2));
   maxCount = Math.max(MAX_ITERATIONS, maxCount)+2;
   double y1 = f.evaluate(x1), y2 = f.evaluate(x2);
   if(y1*y2>0) {         // y1 and y2 must have opposite sign
      return Double.NaN; // interval does not contain a root
   }
   while(count<maxCount) {
      double x = (x1+x2)/2;
      double y = f.evaluate(x);
      if(Math.abs(y)<tolerance) {
         return x;
      }
      if(y*y1>0) { // replace the endpoint that has the same sign
         x1 = x;
         y1 = y;
      } else {
         x2 = x;
         y2 = y;
      }
      count++;
   }
   return Double.NaN; // did not converge in max iterations
}
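As an illustration of (6.79), the following sketch applies Newton's method to g(ε) = tan ε − √(ρ² − ε²) from Problem 6.31 with ρ = 3. The starting guess of 1.2, chosen by inspecting the plot, lies on the first branch of the tangent function; the class name is illustrative.

public class SquareWellApp {
  public static void main(String[] args) {
    double rho = 3.0, eps = 1.2;  // initial guess from the plot of the two functions
    for(int i = 0; i<8; i++) {
      double root = Math.sqrt(rho*rho-eps*eps);
      double g = Math.tan(eps)-root;
      double gPrime = 1/Math.pow(Math.cos(eps), 2)+eps/root;  // g'(eps)
      eps = eps-g/gPrime;  // Newton's method (6.79)
    }
    System.out.println("lowest root: eps = "+eps);
  }
}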
The bisection algorithm is guaranteed to converge if you can find an interval where the function changes sign. However, it is slow. Newton's algorithm is very fast but may not converge. We develop an algorithm in the following problem that combines these two approaches.

Problem 6.32. Finding roots

Modify Newton's algorithm to keep track of the interval between the minimum and the maximum of x while iterating (6.79). If the iterate x_{n+1} jumps outside this interval, interrupt Newton's method and use the bisection algorithm for one iteration. Test the root at the end of the iterative process to check that the algorithm actually found a root. Test your algorithm on the function in (6.78).

References and Suggestions for Further Reading

Books

Ralph H. Abraham and Christopher D. Shaw, Dynamics – The Geometry of Behavior, 2nd ed. (Addison–Wesley, 1992). The authors use an abundance of visual representations.

Hao Bai-Lin, Chaos II (World Scientific, 1990). A collection of reprints on chaotic phenomena. The following papers were cited in the text: James P. Crutchfield, J. Doyne Farmer, Norman H. Packard, and Robert S. Shaw, "Chaos," Sci. Am. 255 (6), 46–57 (1986); Mitchell J. Feigenbaum, "Quantitative universality for a class of nonlinear transformations," J. Stat. Phys. 19, 25–52 (1978); M. Hénon, "A two-dimensional mapping with a strange attractor," Commun. Math. Phys. 50, 69–77 (1976); Robert M. May, "Simple mathematical models with very complicated dynamics," Nature 261, 459–467 (1976); Robert Van Buskirk and Carson Jeffries, "Observation of chaotic dynamics of coupled nonlinear oscillators," Phys. Rev. A 31, 3332–3357 (1985).

G. L. Baker and J. P. Gollub, Chaotic Dynamics: An Introduction, 2nd ed. (Cambridge University Press, 1995). A good introduction to chaos with special emphasis on the forced damped nonlinear harmonic oscillator. Several programs are given.

Predrag Cvitanović, Universality in Chaos, 2nd ed. (Adam Hilger, 1989). A collection of reprints on chaotic phenomena, including the articles by Hénon and May also reprinted in the Bai-Lin collection, and the chaos classic, Mitchell J. Feigenbaum, "Universal behavior in nonlinear systems," Los Alamos Sci. 1, 4–27 (1980).

Robert Devaney, A First Course in Chaotic Dynamical Systems (Addison–Wesley, 1992). This text is a good introduction to the more mathematical ideas behind chaos and related topics.

Jan Fröyland, Introduction to Chaos and Coherence (Institute of Physics Publishing, 1992). See Chapter 7 for a simple model of Saturn's rings.

Martin C. Gutzwiller, Chaos in Classical and Quantum Mechanics (Springer–Verlag, 1990). A good introduction to problems in quantum chaos for the more advanced student.

Robert C. Hilborn, Chaos and Nonlinear Dynamics (Oxford University Press, 1994). An excellent pedagogically oriented text.

Douglas R. Hofstadter, Metamagical Themas (Basic Books, 1985). A shorter version is given in his article, "Metamagical themas," Sci. Am. 245 (11), 22–43 (1981).

E. Atlee Jackson, Perspectives of Nonlinear Dynamics, Vols. 1 and 2 (Cambridge University Press, 1989, 1991). An advanced text that is a joy to read.

R. V. Jensen, "Chaotic scattering, unstable periodic orbits, and fluctuations in quantum transport," Chaos 1, 101–109 (1991).
This paper discusses the quantum version of systems similar to those discussed in Projects 6.28 and 6.26.

Francis C. Moon, Chaotic and Fractal Dynamics: An Introduction for Applied Scientists and Engineers (Wiley–VCH, 1992). An engineering oriented text with a section on how to build devices that demonstrate chaotic dynamics.

Edward Ott, Chaos in Dynamical Systems (Cambridge University Press, 1993). An excellent textbook on chaos at the upper undergraduate to graduate level. See also E. Ott, "Strange attractors and chaotic motions of dynamical systems," Rev. Mod. Phys. 53, 655–671 (1981).

Edward Ott, Tim Sauer, and James A. Yorke, editors, Coping with Chaos (John Wiley & Sons, 1994). A reprint volume emphasizing the analysis of experimental time series from chaotic systems.

Heinz-Otto Peitgen, Hartmut Jürgens, and Dietmar Saupe, Fractals for the Classroom, Part II (Springer–Verlag, 1992). A delightful book with many beautiful illustrations. Chapter 11 discusses the nature of the bifurcation diagram of the logistic map.

Ian Percival and Derek Richards, Introduction to Dynamics (Cambridge University Press, 1982). An advanced undergraduate text that introduces phase trajectories and the theory of stability. A derivation of the Hamiltonian for the driven damped pendulum considered in Section 6.4 is given in Chapter 5, Example 5.7.

Ivars Peterson, Newton's Clock: Chaos in the Solar System (W. H. Freeman, 1993). A historical survey of our understanding of the motion of bodies within the solar system with a focus on chaotic motion.

Stuart L. Pimm, The Balance of Nature (The University of Chicago Press, 1991). An introductory treatment of ecology with a chapter on applications of chaos to real biological systems. The author contends that much of the difficulty in assessing the importance of chaos is that ecological studies are too short.

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). Chapter 9 discusses various root-finding methods.

S. Neil Rasband, Chaotic Dynamics of Nonlinear Systems (Wiley–Interscience, 1990). A clear presentation of the most important topics in classical chaos theory.

M. Lakshmanan and S. Rajaseekar, Nonlinear Dynamics (Springer–Verlag, 2003). Although this text is for advanced students, many parts are accessible.

Robert Shaw, The Dripping Faucet as a Model Chaotic System (Aerial Press, 1984).

Steven Strogatz, Nonlinear Dynamics and Chaos with Applications to Physics, Biology, Chemistry and Engineering (Addison–Wesley, 1994). Another outstanding text.

Anastasios A. Tsonis, Chaos: From Theory to Applications (Plenum Press, 1992). Of particular interest is the discussion of applications to nonlinear time series forecasting.

Nicholas B. Tufillaro, Tyler Abbott, and Jeremiah Reilly, Nonlinear Dynamics and Chaos (Addison–Wesley, 1992). See also N. B. Tufillaro and A. M. Albano, "Chaotic dynamics of a bouncing ball," Am. J. Phys. 54, 939–944 (1986). The authors describe an undergraduate level experiment on a bouncing ball subject to repeated impacts with a vibrating table. See also the article by Warr et al.

Articles

Garin F. J. Añaños and Constantino Tsallis, "Ensemble averages and nonextensivity of one-dimensional maps," Phys. Rev. Lett. 93, 020601 (2004).

Gregory L. Baker, "Control of the chaotic driven pendulum," Am. J. Phys. 63 (9), 832–838 (1995).

W. Bauer and G. F. Bertsch, "Decay of ordered and chaotic systems," Phys. Rev. Lett. 65, 2213 (1990).
See also the comment by Olivier Legrand and Didier Sornette, "First return, transient chaos, and decay in chaotic systems," Phys. Rev. Lett. 66, 2172 (1991), and the reply by Bauer and Bertsch on the following page. The dependence of the decay laws on chaotic behavior is very general and has been considered in various contexts, including room acoustics and the chaotic scattering of microwaves in an "elbow" cavity. Chaotic behavior is a sufficient but not necessary condition for exponential decay.

Keith Briggs, "Simple experiments in chaotic dynamics," Am. J. Phys. 55, 1083–1089 (1987).

S. N. Coppersmith, "A simpler derivation of Feigenbaum's renormalization group equation for the period-doubling bifurcation sequence," Am. J. Phys. 67 (1), 52–54 (1999).

J. P. Crutchfield, J. D. Farmer, and B. A. Huberman, "Fluctuations and simple chaotic dynamics," Phys. Repts. 92, 45–82 (1982).

Robert DeSerio, "Chaotic pendulum: The complete attractor," Am. J. Phys. 71 (3), 250–257 (2003).

William L. Ditto and Louis M. Pecora, "Mastering chaos," Sci. Am. 262 (8), 78–82 (1993).

J. C. Earnshaw and D. Haughey, "Lyapunov exponents for pedestrians," Am. J. Phys. 61, 401 (1993).

Daniel J. Gauthier, "Resource letter: CC-1: Controlling chaos," Am. J. Phys. 71 (8), 750–759 (2003). The article includes a bibliography of materials on controlling chaos.

Wayne Hayes, "Computer simulations, exact trajectories, and the gravitational N-body problem," Am. J. Phys. 72 (9), 1251–1257 (2004). The article discusses the concept of shadowing, which is used in the simulation of chaotic systems.

Robert C. Hilborn, "Sea gulls, butterflies, and grasshoppers: A brief history of the butterfly effect in nonlinear dynamics," Am. J. Phys. 72 (4), 425–427 (2004).

Robert C. Hilborn and Nicholas B. Tufillaro, "Resource letter: ND-1: Nonlinear dynamics," Am. J. Phys. 65 (9), 822–834 (1997).

Ying-Cheng Lai, "Controlling chaos," Computers in Physics 8, 62 (1994). Section 6.6 is based on this article.

R. B. Levien and S. M. Tan, "Double pendulum: An experiment in chaos," Am. J. Phys. 61 (11), 1038–1044 (1993).

V. Lopac and V. Danani, "Energy conservation and chaos in the gravitationally driven Fermi oscillator," Am. J. Phys. 66 (10), 892–902 (1998).

J. B. McLaughlin, "Period-doubling bifurcations and chaotic motion for a parametrically forced pendulum," J. Stat. Phys. 24, 375–388 (1981).

Sergio De Souza-Machado, R. W. Rollins, D. T. Jacobs, and J. L. Hartman, "Studying chaotic systems using microcomputer simulations and Lyapunov exponents," Am. J. Phys. 58 (4), 321–329 (1990).

Bo Peng, Stephen K. Scott, and Kenneth Showalter, "Period doubling and chaos in a three-variable autocatalator," J. Phys. Chem. 94, 5243–5246 (1990).

Bo Peng, Valery Petrov, and Kenneth Showalter, "Controlling chemical chaos," J. Phys. Chem. 95, 4957–4959 (1991).

Troy Shinbrot, Celso Grebogi, Jack Wisdom, and James A. Yorke, "Chaos in a double pendulum," Am. J. Phys. 60 (6), 491–499 (1992).

Niraj Srivastava, Charles Kaufman, and Gerhard Müller, "Hamiltonian chaos," Computers in Physics 4, 549–553 (1990); ibid. 5, 239–243 (1991); ibid. 6, 84–88 (1992).

Todd Timberlake, "A computational approach to teaching conservative chaos," Am. J. Phys. 72 (8), 1002–1007 (2004).

Jan Tobochnik and Harvey Gould, "Quantifying chaos," Computers in Physics 3 (6), 86 (1989).
There is a typographical error in this paper in the equations for step (3) of the algorithm for computing the Lyapunov spectrum. The correct equations are given in Project 6.24.

S. Warr, W. Cooke, R. C. Ball, and J. M. Huntley, "Probability distribution functions for a single particle vibrating in one dimension: Experimental study and theoretical analysis," Physica A 231, 551–574 (1996). This paper and the book by Tufillaro, Abbott, and Reilly consider the motion of a ball bouncing on a periodically vibrating table. This nonlinear dynamical system exhibits fixed points, periodic and strange attractors, and period-doubling bifurcations to chaos, similar to the logistic map. Simulations of this system are very interesting, but not straightforward.

Tolga Yalcinkaya and Ying-Cheng Lai, "Chaotic scattering," Computers in Physics 9, 511–518 (1995). Project 6.28 is based on a draft of this article. The map (6.64) is discussed in more detail in Yun-Tung Lau, John M. Finn, and Edward Ott, "Fractal dimension in nonhyperbolic chaotic scattering," Phys. Rev. Lett. 66, 978 (1991).

Chapter 7

Random Processes

Random processes are introduced in the context of several simple physical systems, including random walks on a lattice, polymers, and diffusion-controlled chemical reactions. The generation of random number sequences is also discussed.

7.1 Order to Disorder

In Chapter 6 we saw several examples of how, under certain conditions, the behavior of a nonlinear deterministic system can appear to be random. In this chapter we will see some examples of how chance can generate statistically predictable outcomes. For example, we know that if we bet often on the outcome of a game for which the probability of winning is less than 50%, we will lose money eventually.

We first discuss an example that illustrates the tendency of systems of many particles to evolve to a well-defined state. Imagine a closed box that is divided into two parts of equal volume (see Figure 7.1). The left half contains a gas of N identical particles, and the right half is initially empty. We then make a small hole in the partition between the two halves. What happens? We know that after some time, the average number of particles in each half of the box will become N/2, and we say that the system has reached equilibrium.

How can we simulate this process? One way is to give each particle an initial velocity and position and adopt a deterministic model of the motion of the particles. For example, we could assume that each particle moves in a straight line until it hits a wall of the box or another particle and undergoes an elastic collision. We will consider similar deterministic models in Chapter 8. Instead, we first simulate a probabilistic model based on a random process. The basic assumptions of this model are that the motion of the particles is random and that the particles do not interact with one another. Hence, the probability per unit time that a particle goes through the hole in the partition is the same for all N particles, regardless of the number of particles in either half. We also assume that the size of the hole is such that only one particle can pass through at a time.

We first model the motion of a particle passing through the hole by choosing one of the N particles at random and moving it to the other side. For visualization purposes, we will use arrays to specify the position of each particle. We then randomly generate an integer i between 0 and N − 1 and change the arrays appropriately.
A more efficient Monte Carlo algorithm is discussed in Problem 7.2b. The tool we need to simulate this random process is a random number generator.

Figure 7.1: A box is divided into two equal halves by a partition. After a small hole is opened in the partition, one particle can pass through the hole per unit time.

It is counterintuitive that we can use a deterministic computer to generate sequences of random numbers. In Section 7.9 we discuss some of the methods for computing a set of numbers that appear statistically random but are in fact generated by a deterministic algorithm. These algorithms are sometimes called pseudorandom number generators to distinguish their output from intrinsically random physical processes, such as the time between clicks in a Geiger counter near a radioactive sample.

For the present we will be content to use the random number generator supplied with Java, although the random number generators included with various programming languages vary in quality. The method Math.random() produces a random number r that is uniformly distributed in the interval 0 ≤ r < 1. To generate a random integer i between 0 and N − 1, we write:

int i = (int) (N*Math.random());

The effect of the (int) cast is to eliminate the decimal digits from a floating point number. For example, (int)(5.7) = 5.

The algorithm for simulating the evolution of the model can be summarized by the following steps:

1. Use a random number generator to choose a particle at random.
2. Move this particle to the other side of the box.
3. Give the particle a random position on the new side of the box. This step is for visualization purposes only.
4. Increase the "time" by unity. Note that this definition of time is arbitrary.

Class Box implements this algorithm, and class BoxApp plots the evolution of the number of particles on the left half of the box.

Listing 7.1: Class Box for the simulation of the approach to equilibrium.

package org.opensourcephysics.sip.ch07;
import java.awt.*;
import org.opensourcephysics.display.*;

public class Box implements Drawable {
  public double x[], y[];
  public int N, nleft, time;

  public void initialize() {
    // location of particles (for visualization purposes only)
    x = new double[N];
    y = new double[N];
    nleft = N; // start with all particles on the left
    time = 0;
    for(int i = 0; i<N; i++) {
      ...
    }
  }
  ...
}

The class WalkerApp accumulates the averages ⟨x⟩ and ⟨x²⟩ for a one-dimensional random walk and histograms the displacement x:
setValue ( "Probability p of step to right" , 0 . 5 ) ; control . setValue ( "Number of steps N" , 100); } public s t a t i c void main ( String [ ] args ) { SimulationControl . createApp (new WalkerApp ( ) ) ; } } CHAPTER 7. RANDOM PROCESSES 206 Problem 7.5. Random walks in one dimension (a) In class Walker the steps are of unit length so that a = 1. Use Walker and WalkerApp to estimate the number of trials needed to obtain ∆x2 for N = 20 and p = 1/2 with an accuracy of approximately 5%. Compare your result for ∆x2 to the exact answer in (7.10). Approximately how many trials do you need to obtain the same relative accuracy for N = 100? (b) Is x exactly zero in your simulations? Explain the difference between the analytic result and the results of your simulations. Note that we have used the same notation ... to denote the exact average calculated analytically and the approximate average computed by averaging over many trials. The distinction between the two averages should be clear from the context. (c) How do your results for x and ∆x2 change for p q? Choose p = 0.7 and determine the N dependence of x and ∆x2. (d)∗ Determine ∆x2 for N = 1 to N = 5 by enumerating all the possible walks. For simplicity, choose p = 1/2 so that x = 0. For N = 1 there are two possible walks: one step to the right and one step to the left. In both cases x2 = 1, and hence x1 2 = 1. For N = 2 there are four possible walks with the same probability: (i) two steps to the right, (ii) two steps to the left, (iii) first step to the right and second step to the left, and (iv) first step to the left and second step to the right. The value of x2 2 for these walks is 4, 4, 0, and 0, respectively, and hence x2 2 = (4 + 4 + 0 + 0)/4 = 2. Write a program that enumerates all the possible walks of a given number of steps and compute the various averages of interest exactly. The class WalkerApp displays the distribution of values of the displacement x after N steps. One way of determining the number of times that the variable x has a certain value would be to define a one-dimensional array, probability, and let probability[x] += 1; In this case because x takes only integer values, the array index of probability is the same as x itself. However, the above statement does not work in Java because x can be negative as well as positive. What we need is a way of mapping the value x to a bin or index number. The HistogramFrame class, which is part of the Open Source Physics display package, does this mapping automatically using the Java Hashtable class. In simple data structures, data is accessed by an index that indicates the location of the data in the data structure. Hashtable data is accessed by a key, which in our case is the value of x. A hashing function converts the key to an index. The append method of the HistogramFrame class takes a value, finds the index using a hashing function, and then increments the data associated with that key. The HistogramFrame class also draws itself. The HistogramFrame class is very useful for taking a quick look at the distribution of values in a data set. You do not need to know how to group the data into bins or the range of values of the data. The default bin width is unity, but the bin width can be set using the setBinWidth method. See WalkerApp for an example of the use of the HistogramFrame class. Frequently, we wish to use the histogram data to compute other quantities. You can collect the data using the Data Table menu item in HistogramFrame and copy the data to a file. 
The class WalkerApp displays the distribution of values of the displacement x after N steps. One way of determining the number of times that the variable x has a certain value would be to define a one-dimensional array, probability, and let

probability[x] += 1;

In this case, because x takes only integer values, the array index of probability is the same as x itself. However, the above statement does not work in Java because x can be negative as well as positive. What we need is a way of mapping the value of x to a bin or index number. The HistogramFrame class, which is part of the Open Source Physics display package, does this mapping automatically using the Java Hashtable class. In simple data structures, data is accessed by an index that indicates the location of the data in the data structure. Hashtable data is accessed by a key, which in our case is the value of x. A hashing function converts the key to an index. The append method of the HistogramFrame class takes a value, finds the index using a hashing function, and then increments the data associated with that key. The HistogramFrame class also draws itself.

The HistogramFrame class is very useful for taking a quick look at the distribution of values in a data set. You do not need to know how to group the data into bins or the range of values of the data. The default bin width is unity, but the bin width can be set using the setBinWidth method. See WalkerApp for an example of the use of the HistogramFrame class.

Frequently, we wish to use the histogram data to compute other quantities. You can collect the data using the Data Table menu item in HistogramFrame and copy the data to a file. Another option is to include additional code in your program to analyze the data. The following statements assume that a HistogramFrame object called histogram has been created and data entered into it.

// creates array entries of data from histogram
java.util.Map.Entry[] entries = histogram.entries();
for(int i = 0, length = entries.length; i<length; i++) {
   Integer binNumber = (Integer) entries[i].getKey();   // gets bin number
   Double occurrences = (Double) entries[i].getValue(); // gets number of occurrences for bin number i
   // gets value of left edge of bin
   double value = histogram.getLeftMostBinPosition(binNumber.intValue());
   value += 0.5*histogram.getBinWidth(); // sets value to middle of bin
   double number = occurrences.doubleValue(); // convert from Double class to double data type
   // use value and number in your analysis
}

Problem 7.6. Nature of the probability distribution

(a) Compute P_N(x), the probability that the displacement of the walker from the origin is x after N steps. What is the difference between the histogram, that is, the number of occurrences, and the probability? Consider N = 10 and N = 40 and at least 1000 trials. Does the qualitative form of P_N(x) change as the number of trials increases? What is the approximate width of P_N(x) and the value of P_N(x) at its maximum for each value of N?

(b) What is the approximate shape of the envelope of P_N(x)? Does the shape change as N is increased?

(c) Fit the envelope of P_N(x) for sufficiently large N to the continuous function

C (1/√(2π⟨Δx²⟩)) e^{−(x − ⟨x⟩)²/2⟨Δx²⟩}.   (7.11)

The form of (7.11) is the standard form of the Gaussian distribution with C = 1. The easiest way to do this fit is to plot your results for P_N(x) and the form (7.11) on the same graph, using your results for ⟨x⟩ and ⟨Δx²⟩ as input parameters. Visually choose the constant C to obtain a reasonable fit. What are the possible values of x for a given value of N? What is the minimum difference between these values? How does this difference compare to your value for C?

Problem 7.7. More random walks in one dimension

(a) Suppose that the probability of a step to the right is p = 0.7. Compute ⟨x⟩ and ⟨Δx²⟩ for N = 4, 8, 16, and 32. What is the interpretation of ⟨x⟩ in this case? What is the qualitative dependence of ⟨Δx²⟩ on N?

(b) An interesting property of random walks is the mean number D_N of distinct lattice sites visited during the course of an N step walk. Do a Monte Carlo simulation of D_N and determine its N dependence.

We can consider either a large number of successive walks, as in Problem 7.7, or a large number of noninteracting walkers moving at the same time, as in Problem 7.8.

Figure 7.2: An example of a 6 × 6 square lattice. Note that each site or node has four nearest neighbors.

Problem 7.8. A random walk in two dimensions

(a) Consider a collection of walkers initially at the origin of a square lattice (see Figure 7.2). At each unit of time, each of the walkers moves at random with equal probability in one of the four possible directions. Create a drawable class, Walker2D, which contains the positions of M walkers moving in two dimensions and draws their locations, and modify WalkerApp. Unlike WalkerApp, this new class need not specify the maximum number of steps. Instead, the number of walkers should be specified.
Figure 7.2: An example of a 6 × 6 square lattice. Note that each site or node has four nearest neighbors.

Problem 7.8. A random walk in two dimensions

(a) Consider a collection of walkers initially at the origin of a square lattice (see Figure 7.2). At each unit of time, each of the walkers moves at random with equal probability in one of the four possible directions. Create a drawable class, Walker2D, which contains the positions of M walkers moving in two dimensions and draws their locations, and modify WalkerApp accordingly. Unlike WalkerApp, this new class need not specify the maximum number of steps. Instead, the number of walkers should be specified.

(b) Run your application with the number of walkers M ≥ 1000 and allow the walkers to take at least 500 steps. If each walker represents a bee, what is the qualitative nature of the shape of the swarm of bees? Describe the qualitative nature of the surface of the swarm as a function of the number of steps N. Is the surface jagged or smooth?

(c) Compute the quantities $\langle x \rangle$, $\langle y \rangle$, $\langle \Delta x^2 \rangle$, and $\langle \Delta y^2 \rangle$ as a function of N. The average is over the M walkers. Also compute the mean square displacement $\langle R^2 \rangle$ given by

\[
\langle R^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 + \langle y^2 \rangle - \langle y \rangle^2 = \langle \Delta x^2 \rangle + \langle \Delta y^2 \rangle. \tag{7.12}
\]

What is the dependence of each quantity on N? (As before, we will frequently write $\langle R^2 \rangle$ instead of $\langle R_N^2 \rangle$.)

(d) Estimate $\langle R^2 \rangle$ for N = 8, 16, 32, and 64 by averaging over a large number of walkers for each value of N. Assume that $R = \sqrt{\langle R^2 \rangle}$ has the asymptotic N dependence

\[
R \sim N^{\nu} \qquad (N \gg 1), \tag{7.13}
\]

and estimate the exponent ν from a log-log plot of $\langle R^2 \rangle$ versus N. We will see in Chapter 13 that the exponent 1/ν is related to how a random walk fills space. If ν ≈ 1/2, estimate the magnitude of the self-diffusion coefficient D from the relation $\langle R^2 \rangle = 4DN$.

(e) Do a Monte Carlo simulation of $\langle R^2 \rangle$ on a triangular lattice (see Figure 8.5) and estimate ν. Can you conclude that the exponent ν is independent of the symmetry of the lattice? Does D depend on the symmetry of the lattice? If so, give a qualitative explanation for this dependence.

(f)∗ Enumerate all the random walks on a square lattice for N = 4 and obtain exact results for $\langle x \rangle$, $\langle y \rangle$, and $\langle R^2 \rangle$. Assume that all four directions are equally probable. Verify your program by comparing the Monte Carlo and exact enumeration results.

Figure 7.3: Examples of the random path of a raindrop to the ground. The step probabilities are given in Problem 7.9. The walker starts at x = 0, y = h.

Problem 7.9. The fall of a raindrop

Consider a random walk that starts at a site a distance y = h above a horizontal line (see Figure 7.3). If the probability of a step down is greater than the probability of a step up, we expect that the walker will eventually reach a site on the horizontal line. This walk is a simple model of the fall of a raindrop in the presence of a random swirling breeze. Do a Monte Carlo simulation to determine the mean time τ for the walker to reach any site on the line y = 0, and find the functional dependence of τ on h. Is it possible to define a velocity in the vertical direction? Because the walker does not always move vertically, it suffers a net displacement x in the horizontal direction. How does $\langle \Delta x^2 \rangle$ depend on h and τ? Reasonable values for the step probabilities are 0.1, 0.6, 0.15, and 0.15, corresponding to up, down, right, and left, respectively. A sketch of one possible implementation follows.
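A minimal sketch of the raindrop walk, assuming the step probabilities suggested above (the class name is ours). Because the step probabilities are left–right symmetric, $\langle x \rangle = 0$ and $\langle \Delta x^2 \rangle$ reduces to $\langle x^2 \rangle$:

    // Mean first passage time to the line y = 0 for the raindrop walk
    // of Problem 7.9, using the step probabilities suggested in the text.
    public class RaindropWalk {
      public static void main(String[] args) {
        int h = 20, trials = 10000;
        double sumTime = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          int x = 0, y = h;
          long t = 0;
          while(y>0) {
            double r = Math.random();
            if(r<0.1) y++;            // up with probability 0.1
            else if(r<0.7) y--;       // down with probability 0.6
            else if(r<0.85) x++;      // right with probability 0.15
            else x--;                 // left with probability 0.15
            t++;
          }
          sumTime += t;
          sumX2 += (double) x*x;
        }
        System.out.println("tau = "+sumTime/trials+"  <x^2> = "+sumX2/trials);
      }
    }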
7.3 Modified Random Walks

So far we have considered random walks on one- and two-dimensional lattices where the walker has no “memory” of the previous step. What happens if the walker remembers the nature of the previous steps? What happens if there are multiple random walkers, with the condition that no double occupancy is allowed? We explore these and other variations of the simple random walk in this section. All these variations have applications to physical systems, but the applications are more difficult to understand than the models themselves.

The fall of a raindrop considered in Problem 7.9 is an example of a restricted random walk, that is, a walk in the presence of a boundary. In the following problem, we discuss in a more general context the effects of various types of restrictions or boundaries on random walks. Other examples of a restricted random walk are given in Problems 7.17 and 7.23.

Problem 7.10. Restricted random walks

(a) Consider a one-dimensional lattice with trap sites at x = 0 and x = L (L > 0). A walker begins at site $x_0$ (0 < $x_0$ < L) and takes unit steps to the left and right with equal probability. When the walker arrives at a trap site, it can no longer move. Do a Monte Carlo simulation and verify that the mean number of steps τ for the particle to be trapped (the mean first passage time) is given by

\[
\tau = (2D)^{-1} x_0 (L - x_0), \tag{7.14}
\]

where D is the self-diffusion coefficient in the absence of traps [see (7.29)]. A sketch of such a simulation is given after this problem.

(b) Random walk models in the presence of traps have had an important role in condensed matter physics. For example, consider the following idealized model of energy transport in solids. The solid is represented as a lattice with two types of sites: hosts and traps. An incident photon is absorbed at a host site and excites the host molecule or atom. The excitation energy or exciton is transferred at random to one of the host's nearest neighbors, and the original excited molecule returns to its ground state. In this way the exciton wanders through the lattice until it reaches a trap site, at which point a chemical reaction occurs. A simple version of this energy transport model is given by a one-dimensional lattice with traps placed on a periodic sublattice. Because the traps are placed at regular intervals, we can replace the random walk on an infinite lattice by a random walk on a circular ring. Consider a lattice of N host or nontrapping sites and one trap site. If a walker has an equal probability of starting from any host site and an equal probability of a step to each nearest neighbor site, what is the N dependence of the mean survival time τ (the mean number of steps taken before a trap site is reached)? Use the results of part (a) rather than doing a simulation.

(c) Consider a one-dimensional lattice with reflecting sites at x = −L and x = L. For example, if a walker reaches the reflecting site at x = L, it is reflected at the next step to x = L − 1. At t = 0 the walker starts at x = 0 and steps with equal probability to the left and right. Write a Monte Carlo program to determine $P_N(x)$, the probability that the walker is at site x after N steps. Compare the form of $P_N(x)$ with and without the presence of the reflecting sites. Can you distinguish the two probability distributions if N is of the order of L? At what value of N can you first distinguish the two distributions?
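The following minimal sketch measures τ for part (a) of Problem 7.10 (the class name is ours). For a walk with unit steps taken once per unit time, (7.29) gives D = 1/2, so (7.14) predicts $\tau = x_0(L - x_0)$:

    // Mean first passage time for a walker started at x0 on a line with
    // traps at x = 0 and x = L (Problem 7.10a sketch).
    public class TrappedWalker {
      public static void main(String[] args) {
        int L = 20, x0 = 10, trials = 10000;
        double sum = 0;
        for(int trial = 0; trial<trials; trial++) {
          int x = x0;
          long steps = 0;
          while(x>0 && x<L) {
            x += (Math.random()<0.5) ? 1 : -1; // unbiased unit step
            steps++;
          }
          sum += steps;
        }
        // compare with tau = x0*(L - x0)/(2D), with D = 1/2
        System.out.println("measured tau = "+sum/trials
            +", predicted "+x0*(L - x0));
      }
    }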
Problem 7.11. A persistent random walk

(a) In a persistent random walk, the transition or jump probability depends on the previous step. Consider a walk on a one-dimensional lattice, and suppose that step N − 1 has been made. Then step N is made in the same direction with probability α; a step in the opposite direction occurs with probability 1 − α. Write a program to do a Monte Carlo simulation of the persistent random walk in one dimension. Estimate $\langle x \rangle$, $\langle \Delta x^2 \rangle$, and $P_N(x)$. Note that it is necessary to specify both the initial position and an initial direction of the walker. What is the α = 1/2 limit of the persistent random walk?

(b) Consider α = 0.25 and α = 0.75 and determine $\langle \Delta x^2 \rangle$ for N = 8, 64, 256, and 512. Assume that $\langle \Delta x^2 \rangle \sim N^{2\nu}$ for large N, and estimate the value of ν from a log-log plot of $\langle \Delta x^2 \rangle$ versus N for large N. Does ν depend on α? If ν ≈ 1/2, determine the self-diffusion coefficient D for α = 0.25 and 0.75. In general, D is given by

\[
D = \frac{1}{2d} \lim_{N \to \infty} \frac{\langle \Delta x^2 \rangle}{N}, \tag{7.15}
\]

where d is the dimension of space. That is, D is given by the asymptotic behavior of the mean square displacement. (For the simple random walk considered in Section 7.2, $\langle \Delta x^2 \rangle \propto N$ for all N.) Give a physical argument for why $D(\alpha > 0.5)$ is greater, and $D(\alpha < 0.5)$ smaller, than $D(\alpha = 0.5)$.

(c) You might have expected that the persistent random walk yields a nonzero value for $\langle x \rangle$. Verify that $\langle x \rangle = 0$, and explain why this result is exact. How does the persistent random walk differ from the biased random walk for which $p \neq q$?

(d) A persistent random walk can be considered as an example of a multistate walk in which the state of the walk is defined by the last transition. The walker is in one of two states; at each step the probabilities of remaining in the same state or switching states are α and 1 − α, respectively. One of the earliest applications of a two-state random walk was to the study of diffusion in a chromatographic column. Suppose that a molecule in a chromatographic column can be either in a mobile phase (constant velocity v) or in a trapped phase (zero velocity). Instead of each step changing the position by ±1, the position at each step changes by +v or 0. A quantity of experimental interest is the probability $P_N(x)$ that a molecule has moved a distance x in N steps. Choose v = 1 and α = 0.75 and determine the qualitative behavior of $P_N(x)$.

Problem 7.12. Synchronized random walks

(a) Randomly place two walkers on a one-dimensional lattice of L sites, so that the walkers are not at the same site. At each time step randomly choose whether the walkers move to the left or to the right; both walkers move in the same direction. If a walker cannot move in the chosen direction because it is at a boundary, then this walker remains at the same site for this time step. A trial ends when both walkers are at the same site. Write a program to determine the mean time and the mean square fluctuations of the time for two walkers to reach the same site. This model is relevant to a method of doing cryptography using neural networks (see Rutter et al.).

(b) Change your program so that you use biased random walkers for which $p \neq q$. How does this change affect your results?

Problem 7.13. Random walk on a continuum

One of the first continuum models of a random walk was proposed by Rayleigh in 1919. In this model the length a of each step is a random variable and the direction of each step is uniformly random. In this case the variable of interest is R, the distance of the walker from the origin after N steps. The model is known as the freely jointed chain in polymer physics (see Section 7.7), in which case R is the end-to-end distance of the polymer. For simplicity, we first consider a walker in two dimensions with steps of equal (unit) length at a random angle.

(a) Write a Monte Carlo program to compute $\langle R \rangle$ and determine its dependence on N (see the sketch following this problem).

(b) Because R is a continuous variable, we need to compute $p_N(R)\,\Delta R$, the probability that R is between R and R + ∆R after N steps. The quantity $p_N(R)$ is the probability density. Because the area of the ring between R and R + ∆R is $\pi(R + \Delta R)^2 - \pi R^2 = 2\pi R\,\Delta R + \pi(\Delta R)^2 \approx 2\pi R\,\Delta R$, we see that $p_N(R)\,\Delta R$ is proportional to $R\,\Delta R$. Verify that for sufficiently large N, $p_N(R)\,\Delta R$ has the form

\[
p_N(R)\,\Delta R \propto 2\pi R\, \Delta R\, e^{-(R - \langle R \rangle)^2 / 2 \langle \Delta R^2 \rangle}, \tag{7.16}
\]

where $\langle \Delta R^2 \rangle = \langle R^2 \rangle - \langle R \rangle^2$.
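A minimal sketch for Problem 7.13a, assuming unit step lengths and a uniformly distributed step angle (the class name is ours):

    // Freely jointed chain in two dimensions: N unit steps at random
    // angles. Reports <R> for a few values of N.
    public class ContinuumWalk {
      public static void main(String[] args) {
        int trials = 10000;
        for(int N = 8; N<=64; N *= 2) {
          double sumR = 0;
          for(int trial = 0; trial<trials; trial++) {
            double x = 0, y = 0;
            for(int step = 0; step<N; step++) {
              double theta = 2*Math.PI*Math.random(); // uniform direction
              x += Math.cos(theta);
              y += Math.sin(theta);
            }
            sumR += Math.sqrt(x*x + y*y);
          }
          System.out.println("N = "+N+"  <R> = "+sumR/trials);
        }
      }
    }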
Problem 7.14. Random walks with steps of variable length

(a) Consider a random walk in one dimension with jumps of all lengths. The probability that the length of a single step is between a and a + ∆a is $f(a)\,\Delta a$, where $f(a)$ is the probability density. If the form of $f(a)$ is given by $f(a) = Ce^{-a}$ for a > 0, with the normalization condition $\int_0^\infty f(a)\,da = 1$, the code needed to generate step lengths according to this probability density is given by (see Section 11.5)

    stepLength = -Math.log(1 - Math.random());

Modify Walker and WalkerApp to simulate walks of variable length with this probability density. Consider N ≥ 100 and visualize the motion of the walker (a sketch of the walk itself follows this problem). Generate many walks of N steps and determine $p(x)\,\Delta x$, the probability that the displacement is between x and x + ∆x after N steps. Plot p(x) versus x and confirm that the form of p(x) is consistent with a Gaussian distribution. Note that the bin width ∆a is one of the input parameters.

(b) Assume that the probability density $f(a)$ is given by $f(a) = C/a^2$ for a ≥ 1. Determine the normalization constant C using the condition $C \int_1^\infty a^{-2}\,da = 1$. In this case we will learn in Section 11.5 that the statement

    stepLength = 1.0/(1.0 - Math.random());

generates values of a according to this form of $f(a)$. Do a Monte Carlo simulation as in part (a) and determine $p(x)\,\Delta x$. Is the form of p(x) a Gaussian? This type of random walk, for which $f(a)$ decreases as a power law $a^{-1-\alpha}$, is known as a Lévy flight for α ≤ 2.
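A sketch of the variable step length walk of part (a), using the exponential step lengths generated above; the direction of each step is chosen with equal probability (the class name is ours):

    // One-dimensional walk with exponentially distributed step lengths
    // (Problem 7.14a sketch).
    public class VariableStepWalk {
      public static void main(String[] args) {
        int N = 100, trials = 10000;
        double sumX = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          double x = 0;
          for(int step = 0; step<N; step++) {
            double stepLength = -Math.log(1 - Math.random()); // f(a) = e^{-a}
            x += (Math.random()<0.5) ? stepLength : -stepLength;
          }
          sumX += x;
          sumX2 += x*x;
        }
        double mean = sumX/trials;
        System.out.println("<x> = "+mean
            +"  <dx^2> = "+(sumX2/trials - mean*mean));
      }
    }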
Problem 7.15. Exploring the central limit theorem

Consider a continuous random variable x with probability density f(x). That is, $f(x)\,\Delta x$ is the probability that x has a value between x and x + ∆x. The mth moment of f(x) is defined as

\[
\langle x^m \rangle = \int x^m f(x)\,dx. \tag{7.17}
\]

The mean value $\langle x \rangle$ is given by (7.17) with m = 1. The variance $\sigma_x^2$ of f(x) is defined as

\[
\sigma_x^2 = \langle x^2 \rangle - \langle x \rangle^2. \tag{7.18}
\]

Consider the sum $y_n$ corresponding to the average of n values of x:

\[
y = y_n = \frac{1}{n}(x_1 + x_2 + \cdots + x_n). \tag{7.19}
\]

Suppose that we make many measurements of y. We know that the values of y will not be identical but will be distributed according to a probability density p(y), where $p(y)\,\Delta y$ is the probability that the measured value of y is in the range y to y + ∆y. The main quantities of interest are $\langle y \rangle$, p(y), and an estimate of the probable variability of y in a series of measurements.

(a) Suppose that f(x) is uniform in the interval [−1, 1]. Calculate $\langle x \rangle$, $\langle x^2 \rangle$, and $\sigma_x$ analytically.

(b) Write a program to make a sufficient number of measurements of y and determine $\langle y \rangle$ and $p(y)\,\Delta y$. Use the HistogramFrame class to determine and plot $p(y)\,\Delta y$. Choose at least $10^4$ measurements of y for n = 4, 16, 32, and 64. What is the qualitative form of p(y)? Does the qualitative form of p(y) change as the number of measurements of y is increased for a given value of n? Does the qualitative form of p(y) change as n is increased?

(c) Each value of y can be considered to be a measurement. How much does the value of y vary (on the average) from one measurement to another? Make a rough estimate of this variability by comparing several measurements of y for a given value of n. Increase n by a factor of four and estimate the variability of y again. Does the variability from one measurement to another decrease (on the average) as n is increased?

(d) The sample variance $\tilde{\sigma}^2$ is given by

\[
\tilde{\sigma}^2 = \frac{\sum_{i=1}^{n} [y_i - \langle y \rangle]^2}{n - 1}. \tag{7.20}
\]

The reason for the factor of n − 1 rather than n in (7.20) is that to compute $\tilde{\sigma}^2$, we need to use the n values of x to compute the mean y, and thus, loosely speaking, we have only n − 1 independent values of x remaining to calculate $\tilde{\sigma}^2$. Show that if $n \gg 1$, then $\tilde{\sigma}^2 \approx \sigma_y^2$, where $\sigma_y^2$ is given by

\[
\sigma_y^2 = \langle y^2 \rangle - \langle y \rangle^2. \tag{7.21}
\]

(e) The quantity $\tilde{\sigma}$ is known as the standard deviation of the mean. That is, $\tilde{\sigma}$ gives a measure of how much variation we expect to find if we make repeated measurements of y. How does the value of $\tilde{\sigma}$ compare with your estimate of the variability in part (b)?

(f) What is the qualitative shape of the probability density p(y) that you obtained in part (b)? What is the order of magnitude of the width of the probability?

(g) Verify from your results that $\tilde{\sigma} \approx \sigma_y \approx \sigma_x/\sqrt{n-1} \approx \sigma_x/\sqrt{n}$.

(h) To test the generality of your results, consider the exponential probability density

\[
f(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & x < 0. \end{cases} \tag{7.22}
\]

Calculate $\langle x \rangle$ and $\sigma_x$ analytically. Modify your Monte Carlo program and estimate $\langle y \rangle$, $\tilde{\sigma}$, $\sigma_y$, and p(y). How are $\tilde{\sigma}$, $\sigma_y$, and $\sigma_x$ related for a given value of n? Plot p(y) and discuss its qualitative form and its dependence on n and on the number of measurements of y.

Problem 7.15 illustrates the central limit theorem, which states that the probability distribution of a sum of random variables, the random variable y, is a Gaussian centered at $\langle y \rangle$ with a standard deviation approximately given by $1/\sqrt{n}$ times the standard deviation of f(x). The requirements are that f(x) has finite first and second moments, that the measurements of y are statistically independent, and that n is large. What is the relation of the central limit theorem to the calculations of the probability distribution in the random walk models that we have considered?

Problem 7.16. Generation of the Gaussian distribution

Consider the sum

\[
y = \sum_{i=1}^{12} r_i, \tag{7.23}
\]

where $r_i$ is a uniform random number in the unit interval. Make many measurements of y and show that the probability distribution of y approximates the Gaussian distribution with mean value 6 and variance 1 (a sketch follows). What is the relation of this result to the central limit theorem? Discuss how to use this result to generate a Gaussian distribution with arbitrary mean and variance. This way of generating a Gaussian distribution is particularly useful when a “quick and dirty” approximation is appropriate. A better method for generating a sequence of random numbers distributed according to the Gaussian distribution is discussed in Section 11.5.
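A minimal sketch of Problem 7.16 (the class name is ours). Each uniform random number has mean 1/2 and variance 1/12, so the sum of 12 of them has mean 6 and variance 1:

    // Approximate Gaussian variates as the sum of 12 uniform random
    // numbers (Problem 7.16 sketch).
    public class TwelveUniforms {
      public static void main(String[] args) {
        int measurements = 100000;
        double sum = 0, sum2 = 0;
        for(int i = 0; i<measurements; i++) {
          double y = 0;
          for(int j = 0; j<12; j++) {
            y += Math.random();
          }
          sum += y;
          sum2 += y*y;
        }
        double mean = sum/measurements;
        double variance = sum2/measurements - mean*mean;
        System.out.println("mean = "+mean+"  variance = "+variance);
      }
    }

A Gaussian with arbitrary mean μ and standard deviation σ can then be obtained from the shifted and scaled variable μ + σ(y − 6).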
Many of the problems we have considered have revealed the slow convergence of Monte Carlo simulations and the difficulty of obtaining quantitative results for asymptotic quantities. We conclude this section with a cautionary note and consider a “simple” problem for which straightforward Monte Carlo methods give misleading asymptotic results.

∗Problem 7.17. Random walk on lattices containing random traps

(a) In Problem 7.10 we considered the mean survival time of a one-dimensional random walker in the presence of a periodic distribution of traps. Now suppose that the trap sites are distributed at random on a one-dimensional lattice with density ρ = N/L. For example, if ρ = 0.01, the probability that a site is a trap site is 1%. (A site is a trap site if r ≤ ρ, where, as usual, r is uniformly distributed in the interval 0 ≤ r < 1.) If a walker is placed at random at any nontrapping site, determine its mean survival time τ, that is, the mean number of steps before a trap site is reached. Assume that the walker has an equal probability of moving to a nearest neighbor site at each step, and use periodic boundary conditions; that is, the lattice sites are located on a ring. The major complication is that it is necessary to perform three averages: over the distribution of traps, over the origin of the walker, and over the different walks for a given trap distribution and origin. Choose reasonable values for the number of trials associated with each average and do a Monte Carlo simulation to estimate the mean survival time τ. If τ exhibits a power law dependence on ρ, for example $\tau \approx \tau_0\, \rho^{-z}$, estimate the exponent z.

(b) A seemingly straightforward extension of part (a) is to estimate the survival probability $S_N$ after N steps. Choose ρ = 0.5 and do a Monte Carlo simulation of $S_N$ for N as large as possible. (Published results are for $N = 3 \times 10^4$, on lattices large enough that a walker doesn't reach the boundary, with about 54,000 trials.) Assume that the asymptotic form of $S_N$ for large N is given by

\[
S_N \sim e^{-bN^{\alpha}}, \tag{7.24}
\]

where the exponent α is the quantity of interest, and b is a constant that depends on ρ. Are your results consistent with this form? Is it possible to make a meaningful estimate of the exponent α?

(c) It has been proven that the asymptotic N dependence of $S_N$ has the form (7.24) with α = 1/3. Are your Monte Carlo results consistent with this value of α? The object of part (b) is to convince you that it is not possible to use simple Monte Carlo methods directly to obtain the correct asymptotic behavior of $S_N$. The difficulty is that we are trying to estimate $S_N$ in the asymptotic region where $S_N$ is very small, and the small number of trials in this region prevents us from obtaining meaningful results.

Figure 7.4: Example of the exact enumeration of walks on a given configuration of traps. The filled and empty squares denote regular and trap sites, respectively. At step N = 0, a walker is placed at each regular site. The numbers at each site i represent the number of walkers $w_i$. Periodic boundary conditions are used. The initial number of walkers in this example is $w_0 = 10$. The mean survival probability at step N = 1 and N = 2 is found to be 0.6 and 0.475, respectively.

(d) One way to reduce the number of required averages is to determine exactly the probability that the walker is at site i after N steps for a given distribution of trap sites. The method is illustrated in Figure 7.4. The first line represents a given configuration of traps distributed randomly on a one-dimensional lattice. One walker is placed at each nontrap site; trap sites are assigned the value 0. Because each walker moves with probability 1/2 to each neighbor, the number of walkers $w_i(N+1)$ on site i at step N + 1 is given by

\[
w_i(N+1) = \tfrac{1}{2}\left[w_{i+1}(N) + w_{i-1}(N)\right]. \tag{7.25}
\]

(Compare the relation (7.25) to the relation that you found in Problem 7.5d.) The survival probability $S_N$ after N steps for a given configuration of traps is given exactly by

\[
S_N = \frac{1}{w_0} \sum_i w_i(N), \tag{7.26}
\]

where $w_0$ is the initial number of walkers and the sum is over all sites in the lattice. Explain the relation (7.26) and write a program that computes $S_N$ using (7.25) and (7.26); one possible implementation is sketched below. Then obtain $S_N$ by averaging over several configurations of traps. Choose ρ = 0.5 and determine $S_N$ for N = 32, 64, 128, 512, and 1024. Choose periodic boundary conditions and as large a lattice as possible. How well can you estimate the exponent α? For comparison, Havlin et al. consider a lattice of L = 50,000 and values of N up to $10^7$.
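A sketch of the exact enumeration for a single trap configuration (the class name and output schedule are our choices; the average over trap configurations is omitted for brevity):

    // Exact survival probability S_N for one random configuration of
    // traps, using the recursion (7.25) and the sum (7.26).
    public class TrapEnumeration {
      public static void main(String[] args) {
        int L = 1000, tmax = 1024;
        double rho = 0.5;
        boolean[] trap = new boolean[L];
        double[] w = new double[L];
        double w0 = 0;
        for(int i = 0; i<L; i++) {
          trap[i] = Math.random()<rho;  // site i is a trap with probability rho
          w[i] = trap[i] ? 0 : 1;       // one walker on every nontrap site
          w0 += w[i];
        }
        for(int N = 1; N<=tmax; N++) {
          double[] wNew = new double[L];
          double sum = 0;
          for(int i = 0; i<L; i++) {
            if(!trap[i]) {              // walkers arriving at traps are removed
              wNew[i] = 0.5*(w[(i+1)%L] + w[(i-1+L)%L]); // periodic boundaries
              sum += wNew[i];
            }
          }
          w = wNew;
          if((N & (N-1))==0) {          // print when N is a power of two
            System.out.println("N = "+N+"  S_N = "+sum/w0);
          }
        }
      }
    }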
One reason that random walks are very useful in simulating many physical processes is that they are closely related to solutions of the diffusion equation. The one-dimensional diffusion equation can be written as

\[
\frac{\partial P(x,t)}{\partial t} = D \frac{\partial^2 P(x,t)}{\partial x^2}, \tag{7.27}
\]

where D is the self-diffusion coefficient and $P(x,t)\,\Delta x$ is the probability of a particle being in the interval between x and x + ∆x at time t. In a typical application P(x,t) might represent the concentration of ink molecules diffusing in a fluid. In three dimensions the second derivative $\partial^2/\partial x^2$ is replaced by the Laplacian $\nabla^2$. In Appendix 7B we show that the solution to the diffusion equation with the boundary condition P(x = ±∞, t) = 0 yields

\[
\langle x(t) \rangle = 0 \tag{7.28}
\]

and

\[
\langle x^2(t) \rangle = 2Dt \quad \text{(one dimension)}. \tag{7.29}
\]

If we compare the form of (7.29) with (7.10), we see that the random walk on a one-dimensional lattice and the diffusion equation give the same time dependence if we identify t with N∆t and D with $a^2/2\Delta t$.

The relation of discrete random walks to the diffusion equation is an example of how we can approach many problems in several ways. The traditional way to treat diffusion is to formulate the problem as a partial differential equation as in (7.27). The usual method for solving (7.27) numerically is known as the Crank–Nicolson method (see Press et al.). One difficulty with this approach is the treatment of complicated boundary conditions. An alternative is to formulate the problem as a random walk on a lattice, for which it is straightforward to incorporate various boundary conditions. We will consider random walks in many contexts (see, for example, Section 10.5 and Chapter 16).

7.4 The Poisson Distribution and Nuclear Decay

As we have seen, we can often change variable names and consider a seemingly different physical problem. Our goal in this section is to discuss the decay of unstable nuclei, but we first discuss a conceptually easier problem related to throwing darts at random. Related physical problems are the distribution of stars in the sky and the distribution of photons on a photographic plate.

Suppose we randomly throw N = 100 darts at a board that has been divided into M = 1000 equal size regions. The probability that a dart hits a given region or cell in any one throw is p = 1/M. If we count the number of darts in the different regions, we would find that most cells are empty, some cells have one dart, and other cells have more than one dart. What is the probability P(n) that a given cell has n darts?

Problem 7.18. Throwing darts

Write a program that simulates the throwing of N darts at random into M cells in a dart board (one trial is sketched below). Throwing a dart at random at the board is equivalent to choosing an integer at random between 1 and M. Determine H(n), the number of cells with n darts. Average H(n) over many trials and then compute the probability distribution

\[
P(n) = \frac{H(n)}{M}. \tag{7.30}
\]

As an example, choose N = 50 and M = 500. Choose the number of trials to be sufficiently large so that you can determine the qualitative form of P(n). What is $\langle n \rangle$?
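A single trial of the dart-throwing experiment might look like the following sketch (the class name is ours); P(n) requires averaging H(n) over many such trials:

    // One trial of throwing N darts into M cells (Problem 7.18 sketch).
    // H[n] counts the cells containing exactly n darts.
    public class Darts {
      public static void main(String[] args) {
        int N = 50, M = 500;
        int[] cell = new int[M];
        for(int dart = 0; dart<N; dart++) {
          cell[(int) (M*Math.random())]++;  // random cell index 0 to M-1
        }
        int[] H = new int[N+1];
        for(int i = 0; i<M; i++) {
          H[cell[i]]++;
        }
        for(int n = 0; n<=5; n++) {
          System.out.println("H("+n+") = "+H[n]
              +"  P("+n+") = "+(double) H[n]/M);
        }
      }
    }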
In this case the probability p that a dart lands in a given cell is much less than unity. The conditions $N \gg 1$ and $p \ll 1$ with $\langle n \rangle = Np$ fixed and the independence of the events (the presence of a dart in a particular cell) satisfy the requirements for a Poisson distribution. The form of the Poisson distribution is

\[
P(n) = \frac{\langle n \rangle^n}{n!}\, e^{-\langle n \rangle}, \tag{7.31}
\]

where n is the number of darts in a given cell and $\langle n \rangle$ is the mean number, $\langle n \rangle = \sum_{n=0}^{N} n P(n)$. Because $N \gg 1$, we can take the upper limit of this sum to be ∞ when it is convenient.

Problem 7.19. Darts and the Poisson distribution

(a) Write a program to compute $\sum_{n=0}^{\infty} P(n)$, $\sum_{n=0}^{\infty} n P(n)$, and $\sum_{n=0}^{\infty} n^2 P(n)$ using the form (7.31) for P(n) and reasonable values of p and N. Verify that P(n) in (7.31) is normalized. What is the value of $\sigma_n^2 = \langle n^2 \rangle - \langle n \rangle^2$ for the Poisson distribution?

(b) Modify the program that you developed in Problem 7.18 to compute $\langle n \rangle$ as well as P(n). Choose N = 50 and M = 1000. How do your computed values of P(n) compare to the Poisson distribution in (7.31) using your measured value of $\langle n \rangle$ as input? If time permits, use larger values of N and M.

(c) Choose N = 50 and M = 100 and redo part (b). Are your results consistent with a Poisson distribution? What happens if M = N = 50?

Now that we are more familiar with the Poisson distribution, we consider the decay of radioactive nuclei. We know that a collection of radioactive nuclei will decay; however, we cannot know a priori which nucleus will decay next. If all nuclei of a particular type are identical, why do they not all decay at the same time? The answer is based on the uncertainty inherent in the quantum description of matter at the microscopic level. In the following, we will see that a simple model of the decay process leads to exponential decay. This approach complements the continuum approach discussed in Section 3.9.

Because each nucleus is identical, we assume that during any time interval ∆t, each nucleus has the same probability per unit time p of decaying. The basic algorithm is simple – choose an unstable nucleus and generate a random number r uniformly distributed in the unit interval 0 ≤ r < 1. If r ≤ p, the unstable nucleus decays; otherwise, it does not. Each unstable nucleus is tested once during each time interval. Note that for a system of unstable nuclei, there are many events that can happen during each time interval; for example, 0, 1, 2, ..., n nuclei can decay. Once a nucleus decays, it is no longer in the group of unstable nuclei that is tested at each time interval. Class Nuclei in Listing 7.5 implements the nuclear decay algorithm.

Listing 7.5: The Nuclei class.

    package org.opensourcephysics.sip.ch07;

    public class Nuclei {
      int n[];    // accumulated data on number of unstable nuclei, index is time
      int tmax;   // maximum time to record data
      int n0;     // initial number of unstable nuclei
      double p;   // decay probability

      public void initialize() {
        n = new int[tmax+1];
      }

      public void step() {
        n[0] += n0;
        int nUnstable = n0;
        for(int t = 0; t<tmax; t++) {
          // the loop body was incomplete in our copy of the text; the
          // following is reconstructed from the algorithm described above
          int nDecay = 0;               // nuclei that decay during this interval
          for(int i = 0; i<nUnstable; i++) {
            if(Math.random()<p) {       // test each unstable nucleus once
              nDecay++;
            }
          }
          nUnstable -= nDecay;
          n[t+1] += nUnstable;          // accumulate data at time t+1
        }
      }
    }

Figure 7.5: Plot of $\ln \langle \Delta x^2 \rangle$ versus $\ln N$ for the data listed in Table 7.2. The straight line y = 1.02x + 0.83 through the points is found by minimizing the sum (7.34).

where

\[
\Delta^2 = \frac{1}{n-2} \sum_{i=1}^{n} d_i^2, \tag{7.43}
\]

and $d_i$ is given by (7.33). Because there are n data points, we might have guessed that n rather than n − 2 would be present in the denominator of (7.43). The reason for the factor of n − 2 is related to the fact that to determine ∆, we first need to calculate the two quantities m and b, leaving only n − 2 independent degrees of freedom. To see that the n − 2 factor is reasonable, consider the special case of n = 2. In this case we can find a line that passes exactly through the two data points, but we cannot deduce anything about the reliability of the set of measurements because the fit is exact.
If we use (7.43), we see that both the numerator and the denominator would be zero, and hence ∆ would be undetermined. If a factor of n rather than n − 2 appeared in (7.43), we would conclude that $\Delta^2 = 0/2 = 0$, an absurd conclusion. Usually $n \gg 1$, and the difference between n and n − 2 is negligible.

For our example, ∆ = 0.03, $\sigma_b = 0.07$, and $\sigma_m = 0.02$. The uncertainties δm and δν are related by 2δν = δm. Because δm = $\sigma_m$, we conclude that our best estimate for ν is ν = 0.51 ± 0.01.

If the values of $y_i$ have different uncertainties $\sigma_i$, then the data points are weighted by the quantity $w_i = 1/\sigma_i^2$. In this case it is reasonable to minimize the quantity

\[
\chi^2 = \sum_{i=1}^{n} w_i (y_i - m x_i - b)^2. \tag{7.44}
\]

The resulting expressions in (7.39) for m and b are unchanged if we generalize the definition of the averages to be

\[
\langle f \rangle = \frac{1}{n \langle w \rangle} \sum_{i=1}^{n} w_i f_i, \tag{7.45}
\]

where

\[
\langle w \rangle = \frac{1}{n} \sum_{i=1}^{n} w_i. \tag{7.46}
\]

Problem 7.27. Example of least squares fit

(a) Write a program to find the least squares fit for a set of data (see the sketch following this problem). As a check on your program, compute the most probable values of m and b for the data shown in Table 7.2.

(b) Modify the random walk program so that steps of length 1 and 2 are taken with equal probability. Use at least 10,000 trials and do a least squares fit to $\langle \Delta x^2 \rangle$ as done in the text. Is your most probable estimate for ν closer to ν = 1/2?

For simple random walk problems the relation $\langle \Delta x^2 \rangle = a N^{2\nu}$ holds for all N. However, in many random walk problems a power law relation between $\langle \Delta x^2 \rangle$ and N holds only asymptotically for large N, and hence we should use only the larger values of N to estimate the slope. Also, because we are finding the best fit for the logarithm of the independent variable N, we need to give equal weight to all intervals of ln N. In the above example, we used N = 8, 16, 32, and 64, so that the values of ln N are equally spaced.
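A minimal sketch for part (a), using the standard unweighted least squares formulas $m = (\langle xy \rangle - \langle x \rangle \langle y \rangle)/(\langle x^2 \rangle - \langle x \rangle^2)$ and $b = \langle y \rangle - m \langle x \rangle$; the data arrays below are placeholders to be replaced by the Table 7.2 values:

    // Least squares fit of y = m*x + b (Problem 7.27a sketch).
    public class LeastSquares {
      public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};          // placeholder: use the Table 7.2 data
        double[] y = {1.1, 1.9, 3.1, 3.9};  // placeholder data for illustration
        int n = x.length;
        double xBar = 0, yBar = 0, xyBar = 0, x2Bar = 0;
        for(int i = 0; i<n; i++) {
          xBar += x[i]/n;                   // <x>
          yBar += y[i]/n;                   // <y>
          xyBar += x[i]*y[i]/n;             // <xy>
          x2Bar += x[i]*x[i]/n;             // <x^2>
        }
        double m = (xyBar - xBar*yBar)/(x2Bar - xBar*xBar);
        double b = yBar - m*xBar;
        System.out.println("m = "+m+"  b = "+b);
      }
    }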
7.7 Applications to Polymers

Random walk models play an important role in polymer physics (cf. de Gennes). A polymer consists of N repeat units (monomers) with $N \gg 1$ ($N \sim 10^3$–$10^5$). For example, polyethylene can be represented as ···–CH₂–CH₂–CH₂–···. The detailed structure of the polymer is important for many practical applications. For example, if we wish to improve the fabrication of rubber, a good understanding of the local motions of the monomers in the rubber chain is essential. However, if we are interested in the global properties of the polymer, the details of the chain structure can be ignored.

Let us consider a familiar example of a polymer chain in a good solvent: a noodle in warm water. A short time after we place a noodle in warm water, the noodle becomes flexible, and it neither collapses into a little ball nor becomes fully stretched. Instead, it adopts a random structure as shown schematically in Figure 7.6. If we do not add too many noodles, we can say that the noodles behave as a dilute solution of polymer chains in a good solvent. The dilute nature of the solution implies that we can ignore entanglement effects of the noodles and consider each noodle individually. The presence of a good solvent implies that the polymers can move freely and adopt many different configurations.

Figure 7.6: (a) Schematic illustration of a linear polymer in a good solvent. (b) Example of the corresponding self-avoiding walk on a square lattice.

A fundamental geometrical property that characterizes a polymer in a good solvent is the mean square end-to-end distance $\langle R_N^2 \rangle$, where N is the number of monomers. (For simplicity, we will frequently write $\langle R^2 \rangle$ in the following.) For a dilute solution of polymer chains in a good solvent, it is known that the asymptotic dependence of $\langle R^2 \rangle$ is given by (7.13) with ν ≈ 0.5874 in three dimensions. If we were to ignore the interactions of the monomers, the simple random walk model would yield ν = 1/2, independent of the dimension and symmetry of the lattice. Because this result for ν does not agree with experiment, we know that we are overlooking an important physical feature of polymers.

We now discuss a random walk that incorporates the global features of dilute linear polymers in solution. We have already introduced a model of a polymer chain consisting of straight line segments of the same size joined together at random angles (see Problem 7.13). A further idealization is to place the polymer chain on a lattice (see Figure 7.6). A more realistic model of linear polymers accounts for their most important physical feature; that is, two monomers cannot occupy the same spatial position. This constraint is known as the excluded volume condition, which is ignored in a simple random walk. A well-known lattice model for a linear polymer chain that incorporates this constraint is the self-avoiding walk (SAW). This model consists of the set of all N-step walks starting from the origin subject to the global constraint that no lattice site can be visited more than once in each walk; this constraint accounts for the excluded volume condition.

Self-avoiding walks have many applications, such as the physics of magnetic materials and the study of phase transitions, and they are of interest as purely mathematical objects. Many of the obvious questions have resisted rigorous analysis, and exact enumeration and Monte Carlo simulation have played an important role in our current understanding. The result for ν in two dimensions for the self-avoiding walk is known to be exactly ν = 3/4. The proportionality constant in (7.13) depends on the structure of the monomers and on the solvent. In contrast, the exponent ν is independent of these details and depends only on the spatial dimension.

Figure 7.7: Examples of self-avoiding walks on a square lattice. The origin is denoted by a filled circle. (a) An N = 3 walk. The fourth step shown is forbidden. (b) An N = 7 walk that leads to a self-intersection at the next step; the weight of the N = 8 walk is zero. (c) Two examples of the weights of walks in the enrichment method.

We consider Monte Carlo simulations of the self-avoiding walk in two dimensions in Problems 7.28–7.30. Another algorithm for the self-avoiding walk is considered in Project 7.41.

Problem 7.28. The two-dimensional self-avoiding walk

Consider the self-avoiding walk on the square lattice. Choose an arbitrary site as the origin and assume that the first step is “up.” The walks generated by the three other possible initial directions differ only by a rotation of the whole lattice and do not have to be considered explicitly. The second step can be in three rather than four possible directions because of the constraint that the walk cannot return to the origin. To obtain unbiased results, we generate a random number to choose one of the three directions. Successive steps are generated in the same way. Unfortunately, the walk will very likely not continue indefinitely.
To obtain unbiased results, we must choose at random one of the three steps, even though one or more of these steps might lead to a self-intersection. If the next step does lead to a self-intersection, the walk must be terminated to keep the statistics unbiased. An example of a three-step walk is shown in Figure 7.7a. The next step leads to a self-intersection and violates the constraint. In this case we must start a new walk at the origin.

(a) Write a program that implements this algorithm and record the fraction f(N) of successful attempts at constructing polymer chains with N total monomers. Represent the lattice as an array so that you can record the sites that have already been visited. What is the qualitative dependence of f(N) on N? What is the maximum value of N that you can reasonably consider?

(b) Determine the mean square end-to-end distance $\langle R_N^2 \rangle$ for values of N that you can reasonably consider with this sampling method.

The disadvantage of the straightforward sampling method in Problem 7.28 is that it becomes very inefficient for long chains; that is, the fraction of successful attempts decreases exponentially. To overcome this attrition, several “enrichment” techniques have been developed. We first discuss a relatively simple algorithm proposed by Rosenbluth and Rosenbluth in which each walk of N steps is associated with a weighting function w(N). Because the first step to the north is always possible, we have w(1) = 1. In order that all allowed configurations of a given N are counted equally, the weights w(N) for N > 1 are determined according to the following possibilities:

1. All three possible steps violate the self-intersection constraint (see Figure 7.7b). The walk is terminated with a weight w(N) = 0, and a new walk is generated at the origin.

2. All three steps are possible, and w(N) = w(N − 1).

3. Only m steps are possible with 1 ≤ m < 3 (see Figure 7.7c). In this case w(N) = (m/3)w(N − 1), and one of the m possible steps is chosen at random.

The desired unbiased value of $\langle R^2 \rangle$ is obtained by weighting $R_i^2$, the value of $R^2$ obtained in the ith trial, by the value of $w_i(N)$, the weight found for this trial. Hence, we write

\[
\langle R^2 \rangle = \frac{\sum_i w_i(N) R_i^2}{\sum_i w_i(N)}, \tag{7.47}
\]

where the sum is over all trials.

Problem 7.29. Rosenbluth and Rosenbluth enrichment method

Incorporate the Rosenbluth method into your Monte Carlo program and compute $\langle R^2 \rangle$ for N = 4, 8, 16, and 32 (a sketch of one implementation follows). Estimate the exponent ν from a log-log plot of $\langle R^2 \rangle$ versus N. Can you distinguish your estimate for ν from its random walk value ν = 1/2?
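One possible implementation of the Rosenbluth method is sketched below (the class name and lattice bookkeeping are our choices). The first step is always taken “up” with w(1) = 1, and each later step multiplies the weight by m/3 as described in the text:

    // One Rosenbluth-weighted self-avoiding walk on a square lattice.
    // Averaging w_i*R_i^2 and w_i over trials gives (7.47).
    import java.util.Random;

    public class RosenbluthWalk {
      static final int[] DX = {1, -1, 0, 0};
      static final int[] DY = {0, 0, 1, -1};
      static Random rnd = new Random();

      // returns {w(N), R^2} for one attempted walk of N steps
      static double[] oneWalk(int N) {
        int size = 2*N + 3;                 // lattice large enough for any N-step walk
        boolean[][] visited = new boolean[size][size];
        int x = N + 1, y = N + 1;           // start at the center
        visited[x][y] = true;
        y++;                                // the first step is always "up", w(1) = 1
        visited[x][y] = true;
        double w = 1;
        for(int step = 1; step<N; step++) {
          int m = 0;                        // number of allowed directions (at most 3)
          int[] allowed = new int[3];
          for(int dir = 0; dir<4; dir++) {
            if(!visited[x + DX[dir]][y + DY[dir]]) {
              allowed[m++] = dir;
            }
          }
          if(m==0) {
            return new double[] {0, 0};     // trapped: weight zero
          }
          w *= m/3.0;                       // the (m/3) factor of the text
          int dir = allowed[rnd.nextInt(m)];
          x += DX[dir];
          y += DY[dir];
          visited[x][y] = true;
        }
        double rx = x - (N + 1), ry = y - (N + 1);
        return new double[] {w, rx*rx + ry*ry};
      }

      public static void main(String[] args) {
        int N = 16, trials = 10000;
        double sumW = 0, sumWR2 = 0;
        for(int t = 0; t<trials; t++) {
          double[] result = oneWalk(N);
          sumW += result[0];
          sumWR2 += result[0]*result[1];
        }
        System.out.println("<R^2> = "+sumWR2/sumW+" for N = "+N);
      }
    }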
The Rosenbluth and Rosenbluth procedure is not particularly efficient because many walks still terminate, and thus we do not obtain many walkers for large N. Grassberger improved this algorithm by increasing the population of walkers with high weights and reducing the population of walkers with low weights. The idea is that if w(N) for a given trial is above a certain threshold, we add a new walker and give the new and old walkers half of the original weight. If w(N) is below a certain threshold, then we eliminate half of the walkers with weights below this threshold (for example, every second walker) and double the weights of the remaining half. It is a good idea to adjust the thresholds as the simulation runs in order to maintain a relatively constant number of walkers.

More recently, Prellberg and Krawczyk further improved the Rosenbluth and Rosenbluth enrichment method so that there is no need to provide a threshold value. After each step the average weight $\langle w(N) \rangle$ of the walkers is computed for a given trial, and the ratio $r = w(N)/\langle w(N) \rangle$ is used to determine whether to add walkers (enrichment) or eliminate walkers (pruning). If r > 1, then c = min(r, m) copies of the walker are made, each with weight w(N)/c. If r < 1, then this walker is removed with probability 1 − r. This algorithm leads to an approximately constant number of walkers and is related to the Wang–Landau method, which we will discuss in Problem 15.30.

Another enrichment algorithm is the “reptation” method (see Wall and Mandel). For simplicity, consider a model polymer chain in which all bond angles are ±90°. As an example of this model, the five independent N = 5 polymer chains are shown in Figure 7.8. (Other chains differ only by a rotation or a reflection.) The reptation method can be stated as follows:

1. Choose a chain at random and remove the tail link.

2. Attempt to add a link to the head of the chain. There is a maximum of two directions in which the new head link can be added.

3. If the attempt violates the self-intersection constraint, return to the original chain and interchange the head and tail. Include the chain in the statistical sample.

Figure 7.8: The five independent possible walks of N = 5 steps on a square lattice with ±90° bond angles. The tail and head of each walk are denoted by a circle and arrow, respectively.

The above steps are repeated many times to obtain an estimate of $\langle R^2 \rangle$. As an example of the reptation method, consider chain a of Figure 7.8. A new link can be added in two directions (see Figure 7.9a), so that on the average we find $a \to \frac{1}{2}c + \frac{1}{2}d$. In contrast, a link can be added to chain b in only one direction, and we obtain $b \to \frac{1}{2}e + \frac{1}{2}b$, where the tail and head of chain b have been interchanged (see Figure 7.9b). Confirm that $c \to \frac{1}{2}e + \frac{1}{2}a$, $d \to \frac{1}{2}c + \frac{1}{2}d$, and $e \to \frac{1}{2}a + \frac{1}{2}b$, and that all five chains are equally probable. That is, the transformations in the reptation method preserve the proper statistical weights of the chains without attrition. There is just one problem: unless we begin with a double-ended “cul-de-sac” configuration, such as shown in Figure 7.10, we will never obtain such a configuration using the above transformations. Hence, the reptation method introduces a small statistical bias, and the calculated mean end-to-end distance will be slightly larger than if all configurations were considered. However, the probability of such trapped configurations is very small, and the bias can be neglected for most purposes.

∗Problem 7.30. The reptation method

(a) Adopt the ±90° bond angle restriction and calculate by hand the exact value of $\langle R^2 \rangle$ for N = 5. Then write a Monte Carlo program that implements the reptation method. Generate one walk of N = 5 and use the reptation method to generate a statistical sample of chains. As a check on your program, compute $\langle R^2 \rangle$ for N = 5 and compare your result with the exact result. Then extend your Monte Carlo computations of $\langle R^2 \rangle$ to larger N.

(b) Modify the reptation model so that the bond angle can also be 180°. This modification leads to a maximum of three directions for a new bond. Compare your results with those from part (a).
Figure 7.9: The possible transformations of chains a and b. One of the two possible transformations of chain b violates the self-intersection restriction, and the head and tail are interchanged.

Figure 7.10: Example of a double-cul-de-sac configuration for the self-avoiding walk that cannot be obtained by the reptation method.

In principle, the dynamics of a polymer chain undergoing collisions with solvent molecules can be simulated by using a molecular dynamics method. However, in practice, only relatively small chains can be simulated in this way. An alternative approach is to use a Monte Carlo model that simplifies the effect of the random collisions of the solvent molecules with the atoms of the chain. Most of these models (cf. Verdier and Stockmayer) consider the chain to be composed of beads connected by bonds and restrict the positions of the beads to the sites of a lattice. For simplicity, we assume that the bond angles can be either ±90° or 180°. The idea is to begin with an allowed configuration of N beads (N − 1 bonds). A possible starting configuration can be generated by taking successive steps in the positive y and positive x directions. The dynamics of the Verdier–Stockmayer algorithm is summarized by the following steps:

1. Select at random a bead (occupied site) on the polymer chain. If the bead is not an end site, then the bead can move to a nearest neighbor site of another bead if this site is empty and if the new angle between adjacent bonds is either ±90° or 180°. For example, bead 4 in Figure 7.11 can move to position 4′, while bead 3 cannot move if selected. That is, a selected bead can move to a diagonally opposite unoccupied site only if the two bonds to which it is attached are mutually perpendicular.

2. If the selected bead is an end site, move it to one of two (maximum) possible unoccupied sites, so that the bond to which it is connected changes its orientation by ±90° (see Figure 7.11).

3. If the selected bead cannot move, retain the previous configuration.

Figure 7.11: Examples of possible moves of the simple polymer dynamics model considered in Problem 7.31. For this configuration, beads 2, 3, 5, and 6 cannot move, while beads 1, 4, 7, and 8 can move to the positions shown if they are selected. Only one bead can move at a time. This figure is adapted from the article by Verdier and Stockmayer.

The physical quantities of interest include $\langle R^2 \rangle$ and the mean square displacement of the center of mass of the chain, $\langle r^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2 + \langle y^2 \rangle - \langle y \rangle^2$, where x and y are the coordinates of the center of mass. The unit of time is the number of Monte Carlo steps per bead; in one Monte Carlo step per bead, each bead has one chance on the average to move to a different site. Another efficient method for simulating the dynamics of a polymer chain is the bond fluctuation model (see Carmesin and Kremer).

Problem 7.31. The dynamics of polymers in a dilute solution

(a) Consider a two-dimensional lattice and compute $\langle R^2 \rangle$ and $\langle r^2 \rangle$ for various values of N. How do these quantities depend on N? (The first published results for three dimensions were limited to 32 Monte Carlo steps per bead for N = 8, 16, and 32 and only 8 Monte Carlo steps per bead for N = 64.) Also compute the probability density P(R) that the end-to-end distance is R. How does this probability compare to a Gaussian distribution?

(b)∗ Two configurations are strongly correlated if they differ by the position of only one bead. Hence, it would be a waste of computer time to measure the end-to-end distance and the position of the center of mass after every single move.
Ideally, we wish to compute these quantities for configurations that are approximately statistically independent. Because we do not know a priori the mean number of Monte Carlo steps per bead needed to obtain configurations that are statistically independent, it is a good idea to estimate this time in our preliminary calculations. The correlation time τ is the time needed to obtain statistically independent configurations and can be obtained by computing the equilibrium averaged time-autocorrelation function for a chain of fixed N:

\[
C(t) = \frac{\langle R^2(t' + t)\, R^2(t') \rangle - \langle R^2 \rangle^2}{\langle R^4 \rangle - \langle R^2 \rangle^2}. \tag{7.48}
\]

C(t) is defined so that C(t = 0) = 1 and C(t) = 0 if the configurations are not correlated. Because the configurations will become uncorrelated if the time t between the configurations is sufficiently long, we expect that C(t) → 0 for $t \gg 1$. We expect that $C(t) \sim e^{-t/\tau}$; that is, C(t) decays exponentially with a decay or correlation time τ. Estimate τ from a plot of ln C(t) versus t. Another way of estimating τ is from the integral $\int_0^\infty C(t)\,dt$, where C(t) is normalized so that C(0) = 1. (Because we determine C(t) at discrete values of t, this integral is actually a sum.) How do your two estimates of τ compare? A more detailed discussion of the estimation of correlation times can be found in Section 15.7.

Another type of random walk that is less constrained than the self-avoiding random walk is the “true” self-avoiding walk. This walk describes the path of a random walker that avoids visiting a lattice site with a probability that is a function of the number of times the site has been visited already. This constraint leads to a reduced excluded volume interaction in comparison to the usual self-avoiding walk.

Problem 7.32. The true self-avoiding walk in one dimension

In one dimension the true self-avoiding walk corresponds to a walker that can jump to one of its two nearest neighbors with a probability that depends on the number of times these neighbors have already been visited. Suppose that the walker is at site i at step t. The probability that the walker will jump to site i + 1 at time t + 1 is given by

\[
p_{i+1} = \frac{e^{-g n_{i+1}}}{e^{-g n_{i+1}} + e^{-g n_{i-1}}}, \tag{7.49}
\]

where $n_{i\pm1}$ is the number of times that the walker has already visited site i ± 1. The probability of a jump to site i − 1 is $p_{i-1} = 1 - p_{i+1}$. The parameter g (g > 0) is a measure of the “desire” of the path to avoid itself. The first few steps of a typical true self-avoiding walk are shown in Figure 7.12.

Figure 7.12: Example of the evolution of the true self-avoiding walk with g = 1 (see (7.49)). The shaded site represents the location of the walker at time t. The number of visits to each site is given within each site, and the probability of a step to a nearest neighbor site is given below it. Note the use of periodic boundary conditions.

The main quantity of interest is the exponent ν. We know that g = 0 corresponds to the usual random walk with ν = 1/2 and that the limit g → ∞ corresponds to the self-avoiding walk. What is the value of ν for a self-avoiding walk in one dimension? Is the value of ν for any finite value of g different from these two limiting cases? Write a program to do a Monte Carlo simulation of the true self-avoiding walk in one dimension (a sketch follows). Use an array to record the number of visits to every site. At each step calculate the probability p of a jump to the right. Generate a random number r and compare it to p. If r ≤ p, move the walker to the right; otherwise, move the walker to the left. Compute $\langle \Delta x^2 \rangle$ as a function of the number of steps N, where x is the distance of the walker from the origin. Make a log-log plot of $\langle \Delta x^2 \rangle$ versus N and estimate ν. Can you distinguish ν from its random walk and self-avoiding walk values? Reasonable choices of parameters are g = 0.1 and $N \sim 10^3$. Averages over $10^3$ trials yield qualitative results. For comparison, published results are for $N \sim 10^4$ and $10^3$ trials; extended results for g = 2 are given for $N = 2 \times 10^5$ and $10^4$ trials (see Bernasconi and Pietronero).
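A minimal sketch of the one-dimensional true self-avoiding walk (the class name is ours); the array visits[] stores $n_i$, and the jump probability is computed from (7.49):

    // True self-avoiding walk in one dimension (Problem 7.32 sketch).
    public class TrueSAW {
      public static void main(String[] args) {
        int N = 1000, trials = 1000;
        double g = 0.1;
        double sumX = 0, sumX2 = 0;
        for(int trial = 0; trial<trials; trial++) {
          int[] visits = new int[2*N + 1];  // site i is stored at index i + N
          int x = 0;
          visits[N]++;                      // count the origin
          for(int step = 0; step<N; step++) {
            double right = Math.exp(-g*visits[x + 1 + N]);
            double left = Math.exp(-g*visits[x - 1 + N]);
            double p = right/(right + left); // equation (7.49)
            x += (Math.random()<p) ? 1 : -1;
            visits[x + N]++;
          }
          sumX += x;
          sumX2 += (double) x*x;
        }
        double mean = sumX/trials;
        System.out.println("<dx^2> = "+(sumX2/trials - mean*mean)
            +" for g = "+g);
      }
    }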
7.8 Diffusion-Controlled Chemical Reactions

Imagine a system containing particles of a single species A. The particles diffuse, and when two particles “collide,” a reaction occurs such that the two combine to form an inert species that is no longer involved in the reaction. We can represent this chemical reaction as

\[
A + A \to 0. \tag{7.50}
\]

If we ignore the spatial distribution of the particles, we can describe the kinetics by a simple rate equation:

\[
\frac{dA(t)}{dt} = -k A^2(t), \tag{7.51}
\]

where A is the concentration of A particles at time t and k is the rate constant. (In the chemical kinetics literature, it is traditional to use the term concentration rather than number density.) For simplicity, we assume that all reactants are entered into the system at t = 0 and that no reactants are added later (the system is closed). It is easy to show that the solution of the first-order differential equation (7.51) is

\[
A(t) = \frac{A(0)}{1 + ktA(0)}. \tag{7.52}
\]

Hence, $A(t) \sim t^{-1}$ in the limit of long times.
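The solution (7.52) follows by separating variables in (7.51) and integrating:

\[
\frac{dA}{A^2} = -k\,dt \quad \Longrightarrow \quad \int_{A(0)}^{A(t)} \frac{dA}{A^2} = -k \int_0^t dt' \quad \Longrightarrow \quad \frac{1}{A(t)} - \frac{1}{A(0)} = kt,
\]

which rearranges to (7.52). This form also explains why the quantity $A(t)^{-1} - A(0)^{-1}$ is plotted in Problem 7.33b: the mean-field prediction is that it grows linearly in t with slope k.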
Another interesting case is the bimolecular reaction

\[
A + B \to 0. \tag{7.53}
\]

If we neglect spatial fluctuations in the concentrations as before (this neglect yields what is known as a mean-field approximation), we can write the corresponding rate equation as

\[
\frac{dA(t)}{dt} = \frac{dB(t)}{dt} = -k A(t) B(t). \tag{7.54}
\]

We also have

\[
A(t) - B(t) = \text{constant}, \tag{7.55}
\]

because each reaction leaves the difference between the concentrations of A and B particles unchanged. For the special case of equal initial concentrations, the solution of (7.54) with (7.55) is the same as (7.52). What is the solution for the case $A(0) \neq B(0)$?

This derivation of the time dependence of A for the kinetics of the one- and two-species annihilation processes is straightforward, but it is based on the assumption that the particles are distributed uniformly. In the following two problems, we simulate the kinetics of these processes and test this assumption.

Problem 7.33. Diffusion-controlled chemical reactions in one dimension

(a) Assume that N particles do a random walk on a one-dimensional lattice of length L with periodic boundary conditions. Every particle moves once in one unit of time. Use the array site[j] to record the label of the particle, if any, at site j. Because we are interested in the long time behavior of the system when the concentration A = N/L of particles is small, it is efficient to also maintain an array of particle positions x[i] such that site[x[i]] = i. For example, if particle 5 is located at site 12, then x[5] = 12 and site[12] = 5. We also need an array, newSite, to maintain the new positions of the walkers as they are moved one at a time. After each walker is moved, we check to see if two walkers have landed on the same position k. If they have, we set newSite[k] = −1 and the values of x[i] for these two walkers to −1. The value −1 indicates that no particle exists at the site. After all the walkers have moved, we let site = newSite for all sites and remove all the reacting particles in x that have values equal to −1. This operation can be accomplished by replacing any reacting particle in x by the last particle in the array. Begin with all sites occupied, A(t = 0) = 1.

(b) Make a log-log plot of the quantity $A(t)^{-1} - A(0)^{-1}$ versus the time t. The times should be separated by exponential intervals so that your data is equally spaced on a logarithmic plot. For example, you might include data at times equal to $2^p$ with p = 1, 2, 3, .... Does your log-log plot yield a straight line for long times? If so, calculate its slope. Is the mean-field approximation for A(t) valid in one dimension? You can obtain crude results for small lattices of order L = 100 and times of order $t = 10^2$. To obtain results to within 10%, you will need lattices of order $L = 10^4$ and times of order $t = 2^{13}$.

(c) More insight into the origin of the time dependence of A(t) can be gained from the behavior of the quantity P(r,t), the probability that the nearest neighbor distance is r at time t. The nearest neighbor distance of a given particle is defined as the minimum distance between it and all other particles. The distribution of these distances changes dramatically as the reaction proceeds, and this change can give information about the reaction mechanism. Place the particles at random on a one-dimensional lattice and verify that the most probable nearest neighbor distance is r = 1 (one lattice constant) for all concentrations. (This result is true in any dimension.) Then verify that the distribution of nearest neighbor distances on a one-dimensional lattice is given by

\[
P(r, t = 0) = 2A e^{-2A(r-1)} \qquad \text{(random distribution)}. \tag{7.56}
\]

Is the form (7.56) properly normalized? Start with A(t = 0) = 0.1 and find P(r,t) for t = 10, 100, and 1000. Average over all particles. How does P(r,t) change as the reaction proceeds? Does it retain the same form as the concentration decreases?

(d)∗ Compute the quantity D(t), the number of distinct sites visited by an individual walker. How does the time dependence of D(t) compare to the computed time dependence of $A(t)^{-1} - 1$?

(e)∗ Write a program to simulate the reaction A + B → 0. For simplicity, assume that multiple occupancy of the same site is not allowed; for example, an A particle cannot jump to a site already occupied by an A particle. The easiest procedure is to allow a walker to choose one of its nearest neighbor sites at random, but not to move the walker if the chosen site is already occupied by a particle of the same type. If the site is occupied by a walker of the other type, then the pair of reacting particles is annihilated. Keep separate arrays for the A and B particles, with the value of the array denoting the label of the particle as before.
One way to distinguish A and B walkers is to make the array element site[k] positive if the site is occupied by an A particle and negative if the site is occupied by a B particle. Start with equal concentrations of A and B particles and occupy the sites at random. Some of the interesting questions are similar to those that we posed in parts (b)–(d). Color code the particles and observe what happens to the relative positions of the particles.

∗Problem 7.34. Reaction diffusion in two dimensions

(a) Do a simulation similar to that in Problem 7.33 on a two-dimensional lattice for the reaction A + A → 0. In this case it is convenient to have one array for each dimension, for example, siteX and siteY, or to store the lattice as a one-dimensional array (see Section 12.2). Set A(t = 0) = 1 and choose L = 50. Show the walkers after each Monte Carlo step per walker and describe their distribution as they diffuse. Are the particles uniformly distributed throughout the lattice for all times? Calculate A(t) and compare your results for $A(t)^{-1} - A(0)^{-1}$ to the t dependence of D(t), the number of distinct lattice sites that are visited in time t. (In two dimensions, $D(t) \sim t/\log t$.) How well do the slopes compare? Do a similar simulation with A(t = 0) = 0.01. What slope do you obtain in this case? What can you conclude about the initial density dependence? Is the mean-field approximation valid in this case?

(b) Begin with A and B type random walkers initially segregated on the left and right halves (in the x direction) of a square lattice. The process A + B → C exhibits a reaction front where the production of particles of type C is nonzero. Some of the quantities of interest are the time dependence of the mean position $\langle x(t) \rangle$ and the width w(t) of the reaction front. The rules of this process are the same as in part (a) except that a particle of type C is added to a site when a reaction occurs. A particular site can be occupied by one particle of type A or type B as well as by any number of particles of type C. If n(x,t) is the number of particles of type C at a distance x from the initial boundary of the reactants, then $\langle x(t) \rangle$ and w(t) can be written as

\[
\langle x(t) \rangle = \frac{\sum_x x\, n(x,t)}{\sum_x n(x,t)}, \tag{7.57}
\]

\[
w(t)^2 = \frac{\sum_x \left[x - \langle x(t) \rangle\right]^2 n(x,t)}{\sum_x n(x,t)}. \tag{7.58}
\]

Choose lattice sizes of order 100 × 100 and average over at least 10 trials. The fluctuations in $\langle x(t) \rangle$ and w(t) can be reduced by averaging n(x,t) over the order of 100 time units centered about t. More details can be found in Jiang and Ebner. A sketch of the computation of (7.57) and (7.58) follows.
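The moments (7.57) and (7.58) are simple sums over the histogram n(x,t). A minimal sketch (the class name is ours; the histogram shown is a placeholder for data accumulated by your simulation):

    // Computing the front position and width from the histogram n(x,t)
    // of C particles, following equations (7.57) and (7.58).
    public class FrontMoments {
      // returns {<x>, w} for the histogram n
      static double[] moments(double[] n) {
        double norm = 0, mean = 0;
        for(int x = 0; x<n.length; x++) {
          norm += n[x];
          mean += x*n[x];
        }
        mean /= norm;
        double w2 = 0;
        for(int x = 0; x<n.length; x++) {
          w2 += (x - mean)*(x - mean)*n[x];
        }
        return new double[] {mean, Math.sqrt(w2/norm)};
      }

      public static void main(String[] args) {
        double[] n = {0, 1, 4, 6, 4, 1, 0};  // placeholder histogram
        double[] result = moments(n);
        System.out.println("<x> = "+result[0]+"  w = "+result[1]);
      }
    }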
One advantage of the linear congruential method is that it is very fast. For a given seed x_0, each number in the sequence is determined by the one-dimensional map

	x_n = (a x_{n-1} + c) mod m  (7.59)

where a, c, and m, as well as x_n, are integers. The notation y = z mod m means that m is subtracted from z until 0 ≤ y < m. The map (7.59) is characterized by three parameters: the multiplier a, the increment c, and the modulus m. Because m is the largest integer generated by (7.59), the maximum possible period is m. In general, the period depends on all three parameters. For example, if a = 3, c = 4, m = 32, and x_0 = 1, the sequence generated by (7.59) is 1, 7, 25, 15, 17, 23, 9, 31, 1, 7, 25, ..., and the period is 8, rather than the maximum possible value of m = 32. If we choose a, c, and m carefully such that the maximum period is obtained, then all possible integers between 0 and m − 1 occur in the sequence. Because we usually wish to have random numbers r in the unit interval 0 ≤ r < 1 rather than random integers, random number generators usually return the ratio x_n/m, which is always less than unity. Several rules have been developed (see Knuth) to obtain the longest period. Some of the properties of the linear congruential method are explored in Problem 7.35.

Another popular random number generator is the generalized feedback shift register method, which uses bit manipulation (see Sections 14.1 and 14.6). Every integer is represented as a series of 1s and 0s called bits. These bits can be shuffled by using the bitwise exclusive or (xor) operator ⊕, defined by a ⊕ b = 1 if a ≠ b, and a ⊕ b = 0 if a = b. The nth member of the sequence is given by

	x_n = x_{n-p} ⊕ x_{n-q}  (7.60)

where p > q, and p, q, and x_n are integers. The first p random integers must be supplied by another random number generator. As an example of how the operator ⊕ works, suppose that n = 6, p = 5, q = 3, x_3 = 11, and x_1 = 6. Then x_6 = x_1 ⊕ x_3 = 0110 ⊕ 1011 = 1101 = 2^3 + 2^2 + 2^0 = 8 + 4 + 1 = 13. Not all values of p and q lead to good results. Some common pairs are (p, q) = (31, 3), (250, 103), and (521, 168). In Java and C the exclusive or operation on the integers m and n is written as m^n.

The algorithm for producing the random numbers after the first p integers have been produced is as follows. Initially the index k can be set to 0.

1. If k < p − q, set j = k + q; otherwise, set j = k − p + q.
2. Set x_k = x_k ⊕ x_j; x_k is the desired random number for this iteration. If a random number between 0 and 1 is desired, divide x_k by the maximum possible integer that the computer can hold.
3. Increment k to (k + 1) mod p.

Because the exclusive or operator and bit manipulation are very fast, this random number generator is also very fast. However, the period may not be long enough for some applications, and the correlations between numbers might not be as good as needed. The shuffling algorithm discussed in Problem 7.36 should be used to improve this generator.

These two examples of random number generators illustrate their general nature: numbers in the sequence are used to find the succeeding ones according to a well-defined algorithm. The sequence is determined by the seed, that is, the first number of the sequence, or the first p members of the sequence for the generalized feedback shift register and related generators. Usually, the maximum possible period is related to the size of the computer word, for example, 32 or 64 bits.
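Because the map (7.59) is so simple, a linear congruential generator takes only a few lines of code. The following minimal sketch is our own (the class name is arbitrary); it uses the parameters a = 16807, c = 0, and m = 2^31 − 1 that appear in Problem 7.35 and returns the ratio x_n/m:

public class LCG {
  private long x; // current value x_n of the sequence
  private final long a = 16807, c = 0, m = 2147483647L; // m = 2^31 - 1

  public LCG(long seed) {
    x = seed; // the seed must not be 0 when c = 0
  }

  public double nextDouble() {
    x = (a*x+c)%m;        // the map (7.59); long arithmetic avoids overflow
    return x/(double) m;  // random number in the unit interval [0,1)
  }

  public static void main(String[] args) {
    LCG rng = new LCG(12);
    for(int i = 0; i<5; i++) {
      System.out.println(rng.nextDouble());
    }
  }
}

Note that the product a*x can be as large as roughly 3.6 × 10^13, which fits comfortably in a 64-bit long; with 32-bit int arithmetic the multiplication would overflow.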
The choice of the constants and the proper initialization of the sequence are very important, and thus these algorithms must be implemented with care. There is no necessary and sufficient test for the randomness of a finite sequence of numbers; the most that can be said about any finite sequence is that it is apparently random. Because no single statistical test is a reliable indicator, we need to consider several tests. Some of the best known tests are discussed in Problem 7.35. Many of these tests can be stated in terms of random walks.

Problem 7.35. Statistical tests of randomness

(a) Period. An obvious requirement for a random number generator is that its period be much greater than the number of random numbers needed in a specific calculation. One way to visualize the period of the random number generator is to use it to generate a plot of the displacement x of a random walker as a function of the number of steps N. When the period of the random number generator is reached, the plot will begin to repeat itself. Generate such a plot using (7.59) for a = 899, c = 0, and m = 32768, and for a = 16807, c = 0, and m = 2^31 − 1, with x_0 = 12. What are the periods of the corresponding random number generators? Obtain similar plots using different values for the parameters a, c, and m. Why is the seed value x_0 = 0 forbidden for the choice c = 0? Do some combinations of a, c, and m give longer periods than others?

(b) Uniformity. A random number sequence should contain numbers distributed in the unit interval with equal probability. The simplest test of uniformity is to divide this interval into M equal size subintervals or bins. For example, consider the first N = 10^4 numbers generated by (7.59) with a = 106, c = 1283, and m = 6075 (see Press et al.). Place each number into one of M = 100 bins. Is the number of entries in each bin approximately equal? What happens if you increase N?

(c) Chi-square test. Is the distribution of numbers in the bins of part (b) consistent with the laws of statistics? The most common test of this consistency is the chi-square or χ² test. Let y_i be the observed number in bin i and E_i be the expected value. The chi-square statistic is

	χ² = Σ_{i=1}^{M} (y_i − E_i)² / E_i.  (7.61)

(A sketch for computing χ² and the autocorrelation function of part (g) follows this problem.) For the example in part (b) with N = 10^4 and M = 100, we have E_i = 100. The magnitude of χ² is a measure of the agreement between the observed and expected distributions; χ² should be neither too big nor too small. In general, the individual terms in the sum (7.61) are expected to be of order one, and because there are M terms in the sum, we expect χ² ≈ M. As an example, we did five independent runs of a random number generator with N = 10^4 and M = 100 and found χ² ≈ 92, 124, 85, 91, and 99. These values of χ² are consistent with this expectation. Although we usually want χ² to be as small as possible, we would be suspicious if χ² ≈ 0, because such a small value suggests that N is a multiple of the period of the generator and that each value in the sequence appears an equal number of times.

(d) Filling sites. Although a random number sequence might be distributed in the unit interval with equal probability, consecutive numbers might be correlated in some way. One test of this correlation is to fill a square lattice of L² sites at random. Consider an array n(x, y) that is initially empty, where 1 ≤ x_i, y_i ≤ L. A site is selected randomly by choosing its two coordinates x_i and y_i from two consecutive numbers in the sequence.
If the site is empty, it is filled and n(x_i, y_i) = 1; otherwise it is not changed. This procedure is repeated t times, where t is the number of Monte Carlo steps per site. That is, the time is increased by 1/L² each time a pair of random numbers is generated. Because this process is analogous to the decay of radioactive nuclei, we expect that the fraction of empty lattice sites should decay as e^{-t}. Determine the fraction of unfilled sites using the random number generator that you have been using for L = 10, 15, and 20. Are your results consistent with the expected fraction? Repeat the same test using (7.59) with a = 231, c = 0, and m = 65,549. The existence of triplet correlations can be determined by a similar test on a simple cubic lattice by choosing the three coordinates x_i, y_i, and z_i from three consecutive random numbers.

(e) Parking lot test. Fill sites as in part (d) and draw the sites that have been filled. Do the filled sites look random, or are there stripes of filled sites? Try a = 65,549, c = 0, and m = 2^31.

(f) Hidden correlations. Another way of checking for correlations is to plot x_{i+k} versus x_i. If there are any obvious patterns in the plot, then there is something wrong with the generator. Use the generator (7.59) with a = 16,807, c = 0, and m = 2^31 − 1. Can you detect any structure in the plotted points for k = 1 to k = 5? Test the random number generator that you have been using. Do you see any evidence of lattice structure, for example, equidistant parallel lines? Is the logistic map x_{n+1} = 4x_n(1 − x_n) a suitable random number generator?

(g) Short-term correlations. Another measure of short-term correlations is the autocorrelation function

	C(k) = [⟨x_{i+k} x_i⟩ − ⟨x_i⟩²] / [⟨x_i²⟩ − ⟨x_i⟩²]  (7.62)

where x_i is the ith term in the sequence. We have used the fact that ⟨x_{i+k}⟩ = ⟨x_i⟩; that is, the choice of the origin of the sequence is irrelevant. The quantity ⟨x_{i+k} x_i⟩ is found for a particular choice of k by forming all the possible products x_{i+k} x_i and dividing by the number of products. If x_{i+k} and x_i are not correlated, then ⟨x_{i+k} x_i⟩ = ⟨x_{i+k}⟩⟨x_i⟩ and C(k) = 0. Is C(k) identically zero for any finite sequence? Compute C(k) for a = 106, c = 1283, and m = 6075.

(h) Random walk. A test based on the properties of random walks has been proposed by Vattulainen et al. Assume that a walker begins at the origin of the x-y plane and walks for N steps. Average over M walkers and count the number of walks that end in each quadrant q_i. Use the χ² test (7.61) with y_i → q_i, M = 4, and E_i = M/4. If χ² > 7.815 (a 5% probability if the random number generator is perfect), we say that the run fails. The random number generator fails if two out of three independent runs fail. The probability of a perfect generator failing two out of three runs is approximately 3 × 0.95 × (0.05)² ≈ 0.007. Test several random number generators.
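Both the chi-square statistic (7.61) of part (c) and the autocorrelation function (7.62) of part (g) are easy to compute from a stored sequence. The following minimal sketch is our own (the class and method names are arbitrary); any generator that returns numbers in the unit interval can be substituted for Math.random in the test driver:

public class RandomTests {
  // chi-square statistic (7.61) for the numbers x placed into M bins
  public static double chiSquare(double[] x, int M) {
    int[] y = new int[M];
    for(int i = 0; i<x.length; i++) {
      y[(int) (M*x[i])]++;            // bin index runs from 0 to M-1
    }
    double E = (double) x.length/M;   // expected number per bin
    double chi2 = 0;
    for(int i = 0; i<M; i++) {
      chi2 += (y[i]-E)*(y[i]-E)/E;
    }
    return chi2;
  }

  // estimate of the autocorrelation function C(k) defined in (7.62)
  public static double autocorrelation(double[] x, int k) {
    double sum = 0, sumSq = 0, sumProd = 0;
    for(int i = 0; i<x.length; i++) {
      sum += x[i];
      sumSq += x[i]*x[i];
    }
    for(int i = 0; i<x.length-k; i++) {
      sumProd += x[i+k]*x[i];         // all possible products x_{i+k} x_i
    }
    double mean = sum/x.length;
    double prodMean = sumProd/(x.length-k);
    return (prodMean-mean*mean)/(sumSq/x.length-mean*mean);
  }

  public static void main(String[] args) {
    double[] x = new double[10000];
    for(int i = 0; i<x.length; i++) {
      x[i] = Math.random();
    }
    System.out.println("chi^2 = "+chiSquare(x, 100));
    System.out.println("C(1) = "+autocorrelation(x, 1));
  }
}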
Problem 7.36. Improving random number generators

One way to reduce sequential correlation and to lengthen the period is to mix or shuffle the random numbers produced by a random number generator. A standard procedure is to begin with a list of N random numbers (between 0 and 1) using a given generator rng. The number N is arbitrary but should be less than the period of rng. Also generate one more random number, rextra. Then for each desired random number use the following procedure:

(i) Calculate the integer k given by (int)(N*rextra). Use the kth random number r_k from your list as the desired random number.

(ii) Set rextra equal to the random number r_k chosen in step (i).

(iii) Generate a new random number r from rng and use it to replace the number chosen in step (i); that is, r_k = r.

Consider a random number generator with a relatively short period and strong sequential correlation and show that this shuffling scheme improves the quality of the random number sequence.
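The shuffling procedure requires only an auxiliary array. The following minimal sketch is our own (the class name is arbitrary); it wraps any generator with a nextDouble method, such as the LCG sketch given earlier in this section:

public class ShuffledGenerator {
  private double[] r;    // list of N stored random numbers
  private double rExtra; // the extra random number
  private LCG rng;       // the underlying generator to be improved

  public ShuffledGenerator(LCG rng, int N) {
    this.rng = rng;
    r = new double[N];
    for(int i = 0; i<N; i++) {
      r[i] = rng.nextDouble();  // fill the initial list
    }
    rExtra = rng.nextDouble();
  }

  public double nextDouble() {
    int k = (int) (r.length*rExtra); // step (i): choose an entry at random
    double rk = r[k];                // the desired random number
    rExtra = rk;                     // step (ii)
    r[k] = rng.nextDouble();         // step (iii): refill the chosen slot
    return rk;
  }
}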
At least some of the statistical tests given in Problem 7.35 should be done whenever serious calculations are contemplated. However, even if a random number generator passes all these tests, there can still be problems in rare cases. Typically, these problems arise when a small number of events have a large weight. In these cases a very small bias in the random number generator might lead to systematic errors, and two generators that appear equally good as determined by various statistical tests might give statistically different results in a specific application (see Project 15.34). For this reason it is important that the particular random number generator used be reported along with the actual results. Confidence in the results can also be increased by repeating the calculation with another random number generator. Because all random number generators are based on a deterministic algorithm, it is always possible to construct a test generator for which a particular algorithm will fail. The success of a random number generator in passing various statistical tests is a necessary, but not a sufficient, condition for its use in all applications. In Project 15.34 we discuss an application of Monte Carlo methods to the Ising model for which some popular random number generators give incorrect results.

7.10 Variational Methods

Many problems in physics can be formulated in terms of a variational principle. In the following, we consider examples of variational principles in geometrical optics and classical mechanics. We then discuss how Monte Carlo methods can be applied to these problems. A more sophisticated application of Monte Carlo methods to a variational problem in quantum mechanics is discussed in Chapter 16.

Our everyday experience of light leads naturally to the concept of light rays. This description of light propagation, called geometrical or ray optics, is applicable when the wavelength of light is small compared to the linear dimensions of any obstacles or openings. The path of a light ray can be formulated in terms of Fermat's principle of least time: a ray of light follows the path between two points (consistent with any constraints) that requires the least amount of time. Fermat's principle can be adopted as the basis of geometrical optics. For example, Fermat's principle implies that light travels from a point A to a point B in a straight line in a homogeneous medium. Because the speed of light is constant along any path within the medium, the path of shortest time is the path of shortest distance, that is, a straight line from A to B. What happens if we impose the constraint that the light must strike a mirror before reaching B?

The speed of light in a medium can be expressed in terms of c, the speed of light in a vacuum, and the index of refraction n of the medium:

	v = c/n.  (7.63)

Suppose that a light ray in a medium with index of refraction n_1 passes into a second medium with index of refraction n_2, and that the two media are separated by a plane surface. We now show how we can use Fermat's principle and a simple Monte Carlo method to find the path of the light. The analytic solution to this problem using Fermat's principle is found in many texts (cf. Feynman et al.). Our strategy, as implemented in class Fermat, is to begin with a straight path and to make changes in the path at random. These changes are accepted only if they reduce the travel time of the light. Some of the features of Fermat and FermatApp include:

1. Light propagates from left to right through N regions. The index of refraction n[i] is uniform in each region i. The index i increases from left to right. We have chosen units such that the speed of light in vacuum equals unity.

2. Because the light propagates in a straight line in each medium, the path of the light is given by the coordinates y[i] at each boundary.

3. The coordinates of the light source and the detector are at (0, y[0]) and (N, y[N]), respectively, where y[0] and y[N] are fixed.

4. The path is the connection of the set of points at the boundary of each region.

5. The path of the light is found by choosing the boundary i at random and generating a trial value of y[i] that differs from its previous value by a random number between −dy and dy. If the trial value of y[i] yields a shorter travel time, this value becomes the new value for y[i].

6. The path is redrawn whenever it is changed.

Listing 7.6: Fermat class.

package org.opensourcephysics.sip.ch07;

public class Fermat {
  double y[];        // y coordinate of light ray, index is x coordinate
  double v[];        // light speed of ray for medium starting at index value
  int N;             // number of media
  double dn;         // change in index of refraction from one region to the next
  double dy = 0.1;   // maximum change in y position
  int steps;

  public void initialize() {
    y = new double[N+1];
    v = new double[N];
    double indexOfRefraction = 1.0;
    for(int i = 0; i<=N; i++) {
      y[i] = i;      // initial path is a straight line
    }
    // the remainder of the listing, which sets the speed v[i] in each
    // medium, is not reproduced here
  }
}

The change of length of the two objects can be accomplished with statements such as:

if(length[i]>1) {
  int partA = 1+(int) (Math.random()*(length[i]-1));
  int partB = length[i]-partA;
  length[i] = partA;
  length[numberOfObjects] = partB; // new object
  numberOfObjects++;
}

The main quantity of interest is the distribution of lengths P(L). Explore a variety of initial length distributions with a total mass of 5000 for which the distribution is peaked at about 20 mass units. Is the long time behavior of P(L) similar in shape for any initial distribution? Compute the total mass (the sum of the lengths) and output this value periodically. Although the total mass will fluctuate, it should remain approximately constant. Why?

(b) Collect data for three different initial distributions with the same number of objects N, and scale P(L) and L so that the three distributions roughly fall on the same curve. For example, you can scale P(L) so that the maximum of the three distributions has the same value. Then multiply each value of L by a factor so that the distributions overlap.

(c) The analytic results suggest that the universal behavior can be obtained by scaling L by the total mass raised to the 1/3 power. Is this prediction consistent with your results? Test this hypothesis by adjusting the initial distributions so that they all have the same total mass. Your results for the long time behavior of P(L) should fall on a universal curve. Why is this universality interesting? How can this result be used to analyze different systems? Would you need to do a new simulation for each value of L?
(d) What happens if step (iii) is done more or less often than each random change of length? Does the scaling change?

Project 7.41. Application of the pivot algorithm to self-avoiding walks

The algorithms that we have discussed for generating self-avoiding random walks are all based on making local deformations of the walk (polymer chain) for a given value of N, the number of bonds. As discussed in Problem 7.31, the time τ between statistically independent configurations is nonzero. The problem is that τ increases with N as some power, for example, τ ∼ N³. This power law dependence of τ on N is called critical slowing down and implies that it becomes increasingly more time consuming to generate long walks. We now discuss an example of a global algorithm that reduces the dependence of τ on N. Another example of a global algorithm that reduces critical slowing down is discussed in Project 15.32.

(a) Consider the walk shown in Figure 7.14a. Select a site at random and one of the four possible directions. The shorter portion of the walk is rotated (pivoted) to this new direction by treating the walk as a rigid structure. The new walk is accepted only if it is self-avoiding; otherwise, the old walk is retained. (The shorter portion of the walk is chosen to save computer time.) Some typical moves are shown in Figure 7.14. Note that if an end point is chosen, the previous walk is retained. Write a program to implement this algorithm (a sketch of a single pivot move follows this project) and compute the dependence of the mean square end-to-end distance ⟨R²⟩ on N. Consider values of N in the range 10 ≤ N ≤ 80. A discussion of the results and the implementation of the algorithm can be found in MacDonald et al. and Madras and Sokal, respectively.

Figure 7.14: Examples of the first several changes generated by the pivot algorithm for a self-avoiding walk of N = 10 bonds (11 sites). The open circle denotes the pivot point. This figure is adapted from the article by MacDonald et al.

(b) Compute the correlation time τ for different values of N using the approach discussed in Problem 7.31b.
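One possible structure for a single pivot move on a square lattice is sketched below. This is our own illustration of the logic, not the optimized implementation of Madras and Sokal; for simplicity it always rotates the portion of the walk beyond the pivot site rather than the shorter portion, and it uses a java.util.HashSet of encoded positions for the self-avoidance check:

import java.util.HashSet;

public class PivotMove {
  // attempts one pivot move on the walk (x[i], y[i]), i = 0 ... N;
  // returns true if the rotated walk was self-avoiding and accepted
  public static boolean pivot(int[] x, int[] y) {
    int N = x.length-1;                  // number of bonds
    int p = (int) (Math.random()*(N+1)); // random pivot site
    int rot = 1+(int) (Math.random()*3); // rotate by 90, 180, or 270 degrees
    int[] xt = x.clone(), yt = y.clone();
    for(int i = p+1; i<=N; i++) {        // rotate one arm about site p
      int dx = x[i]-x[p], dy = y[i]-y[p];
      for(int r = 0; r<rot; r++) {       // repeated 90-degree rotation
        int tmp = dx;
        dx = -dy;
        dy = tmp;
      }
      xt[i] = x[p]+dx;
      yt[i] = y[p]+dy;
    }
    HashSet<Long> occupied = new HashSet<Long>();
    for(int i = 0; i<=N; i++) {          // self-avoidance check
      long key = 100000L*xt[i]+yt[i];    // unique encoding for |x|, |y| < 50000
      if(!occupied.add(key)) {
        return false;                    // overlap: keep the old walk
      }
    }
    System.arraycopy(xt, 0, x, 0, N+1);  // accept the rotated walk
    System.arraycopy(yt, 0, y, 0, N+1);
    return true;
  }
}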
Project 7.42. Pattern formation

In Problem 7.34 we saw that simple patterns can develop as a result of random behavior. The phenomenon of pattern formation is of much interest in a variety of contexts ranging from the large scale structure of the universe to the roll patterns seen in convection (for example, smoke rings). In the following, we explore the patterns that can develop in a simple reaction diffusion model based on the reactions A + 2B → 3B and B → C, where C is inert. Such a reaction is called autocatalytic. In Problem 7.34 we considered chemical reactions in a closed system where the reactions can proceed to equilibrium. In contrast, open systems allow a continuous supply of fresh reactants and a removal of products. These two processes allow steady states to be realized and oscillatory conditions to be maintained indefinitely. In this problem we assume that A is added at a constant rate and that both A and B are removed by the feed process. Pearson (see the references) modeled these processes by two coupled reaction diffusion equations:

	∂A/∂t = D_A ∇²A − AB² + f(1 − A)  (7.67a)
	∂B/∂t = D_B ∇²B + AB² − (f + k)B.  (7.67b)

The AB² term represents the reaction A + 2B → 3B. This term is negative in (7.67a) because the reactant A decreases and is positive in (7.67b) because B increases. The term +f represents the constant addition of A, and the terms −fA and −fB represent the removal process; the term −kB represents the reaction B → C. All the quantities in (7.67) are dimensionless. We assume that the diffusion coefficients are D_A = 2 × 10^{-5} and D_B = 10^{-5}; the behavior of the system is determined by the values of the rate constant k and the feed rate f.

(a) We first consider the behavior of the reaction kinetics that results when the diffusion terms in (7.67) are neglected. It is clear from (7.67) that there is a trivial steady state solution with A = 1, B = 0. Are there other solutions, and if so, are they stable? The steady state solutions can be found by solving (7.67) with ∂A/∂t = ∂B/∂t = 0. To determine the stability, we can add a perturbation and determine whether the perturbation grows or not. However, without the diffusion terms, it is more straightforward to solve (7.67) numerically using a simple Euler algorithm (see the sketch at the end of this project). Choose a time step equal to unity and let A = 0.1 and B = 0.5 at t = 0. Determine the steady state values for 0 < f ≤ 0.3 and 0 < k ≤ 0.07 in increments of ∆f = 0.02 and ∆k = 0.005. Record the steady state values of A and B. Then repeat this exercise for the initial values A = 0.5 and B = 0.1. You should find that for some values of f and k, only one steady state solution is obtained for the two initial conditions, and for other values of f and k there are two steady state solutions. Try other initial conditions. If you obtain a new solution, change the initial A or B slightly to see if your new solution is stable. On an f versus k plot, indicate where there are two solutions and where there is one. In this way you can determine the approximate phase diagram for this process.

(b) There is a small region in f-k space where one of the steady state solutions becomes unstable and periodic solutions occur (the mechanism is known as a Hopf bifurcation). Try f = 0.009, k = 0.03, and set A = 0.1 and B = 0.5 at t = 0. Plot the values of A and B versus the time t. Are they periodic? Try other values of f and k and estimate where the periodic solutions occur.

Figure 7.15: Evolution of the pattern starting from the initial conditions suggested in Project 7.42c.

(c) Numerical solutions of the full equation with diffusion (7.67) can be found by making a finite difference approximation to the spatial derivatives as in (3.16) and using a simple Euler algorithm for the time integration. Adopt periodic boundary conditions. Although it is straightforward to write a program to do the numerical integration, an exploration of the dynamics of this system requires considerable computer resources. However, we can find some preliminary results with a small system and a coarse grid. Consider a 0.5 × 0.5 system with a spatial mesh of 128 × 128 grid points on a square lattice. Choose f = 0.18, k = 0.057, and ∆t = 0.1. Let the entire system be in the initial trivial state (A = 1, B = 0) except for a 20 × 20 grid located at the center of the system where the sites are A = 1/2, B = 1/4 with a ±1% random noise. The effect of the noise is to break the square symmetry. Let the system evolve for approximately 80,000 time steps and look at the patterns that develop. Color code the grid according to the concentration of A, with red representing A = 1 and blue representing A ≈ 0.2, and with several intermediate colors. Very interesting patterns have been found by Pearson (see Figure 7.15).
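For part (a) the Euler iteration of (7.67) without the diffusion terms is only a few lines. The following minimal sketch is our own (the class name and output format are arbitrary); it uses the time step of unity suggested in the text:

public class ReactionKinetics {
  public static void main(String[] args) {
    double f = 0.009, k = 0.03; // feed rate and rate constant
    double A = 0.1, B = 0.5;    // initial concentrations
    double dt = 1.0;            // time step suggested in part (a)
    for(int t = 0; t<10000; t++) {
      double dA = -A*B*B+f*(1-A);  // reaction terms of (7.67a)
      double dB = A*B*B-(f+k)*B;   // reaction terms of (7.67b)
      A += dA*dt;                  // Euler step
      B += dB*dt;
      if(t%100==0) {
        System.out.println(t+" "+A+" "+B);
      }
    }
  }
}

The same update, applied at every grid point together with a finite difference Laplacian, is the basis of the full simulation in part (c).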
Appendix 7A: Random Walks and the Diffusion Equation

To gain some insight into the relation between random walks and the diffusion equation, we first show that the latter implies that ⟨x(t)⟩ is zero and ⟨x²(t)⟩ is proportional to t. We rewrite the diffusion equation (7.27) here for convenience:

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x².  (7.68)

To derive the t dependence of ⟨x(t)⟩ and ⟨x²(t)⟩ from (7.68), we write the average of any function of x as

	⟨f(x,t)⟩ = ∫_{-∞}^{∞} f(x) P(x,t) dx.  (7.69)

The average displacement is given by

	⟨x(t)⟩ = ∫_{-∞}^{∞} x P(x,t) dx.  (7.70)

To do the integral on the right-hand side of (7.70), we multiply both sides of (7.68) by x and formally integrate over x:

	∫_{-∞}^{∞} x [∂P(x,t)/∂t] dx = D ∫_{-∞}^{∞} x [∂²P(x,t)/∂x²] dx.  (7.71)

The left-hand side can be expressed as

	∫_{-∞}^{∞} x [∂P(x,t)/∂t] dx = (∂/∂t) ∫_{-∞}^{∞} x P(x,t) dx = d⟨x⟩/dt.  (7.72)

The right-hand side of (7.71) can be written in the desired form by an integration by parts:

	D ∫_{-∞}^{∞} x [∂²P(x,t)/∂x²] dx = D [x ∂P(x,t)/∂x]_{x=-∞}^{x=∞} − D ∫_{-∞}^{∞} [∂P(x,t)/∂x] dx.  (7.73)

The first term on the right-hand side of (7.73) is zero because P(x = ±∞, t) = 0 and all the spatial derivatives of P at x = ±∞ are zero. The second term is also zero because it integrates to D[P(x = ∞, t) − P(x = −∞, t)]. Hence, we find that

	d⟨x⟩/dt = 0,  (7.74)

that is, ⟨x⟩ is a constant, independent of time. Because x = 0 at t = 0, we conclude that ⟨x⟩ = 0 for all t.

To calculate ⟨x²(t)⟩, we can use a similar procedure and perform two integrations by parts. The result is

	d⟨x²(t)⟩/dt = 2D  (7.75)

or

	⟨x²(t)⟩ = 2Dt.  (7.76)

We see that the random walk and the diffusion equation have the same time dependence. In d-dimensional space, 2D is replaced by 2dD. The solution of the diffusion equation shows that the time dependence of ⟨x²(t)⟩ is equivalent to the long time behavior of a simple random walk on a lattice.

In the following, we show directly that the continuum limit of the one-dimensional random walk model is a diffusion equation. If there is an equal probability of taking a step to the right or left, the random walk can be written in terms of the simple master equation

	P(i, N) = ½[P(i + 1, N − 1) + P(i − 1, N − 1)]  (7.77)

where P(i, N) is the probability that the walker is at site i after N steps. To obtain a differential equation for the probability density P(x,t), we identify t = Nτ, x = ia, and P(i, N) = aP(x,t), where τ is the time between steps and a is the lattice spacing. This association allows us to rewrite (7.77) in the equivalent form

	P(x,t) = ½[P(x + a, t − τ) + P(x − a, t − τ)].  (7.78)

We rewrite (7.78) by subtracting P(x, t − τ) from both sides of (7.78) and dividing by τ:

	(1/τ)[P(x,t) − P(x,t − τ)] = (a²/2τ)[P(x + a, t − τ) − 2P(x, t − τ) + P(x − a, t − τ)]/a².  (7.79)

If we expand P(x, t − τ) and P(x ± a, t − τ) in a Taylor series and take the limit a → 0 and τ → 0 with the ratio D ≡ a²/2τ finite, we obtain the diffusion equation

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x².  (7.80a)

The generalization of (7.80a) to three dimensions is

	∂P(x,y,z,t)/∂t = D ∇² P(x,y,z,t)  (7.80b)

where ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² is the Laplacian operator. Equation (7.80) is known as the diffusion equation and is frequently used to describe the dynamics of fluid molecules. The direct numerical solution of the prototypical parabolic partial differential equation (7.80) is a nontrivial problem in numerical analysis (cf. Press et al. or Koonin and Meredith).
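The equivalence of the two descriptions is easy to check numerically: generate many one-dimensional random walks and compare the measured ⟨x²(t)⟩ with 2Dt, where D = a²/2τ = 1/2 in units where the step length and the time between steps are unity. A minimal sketch (our own):

public class WalkDiffusionCheck {
  public static void main(String[] args) {
    int walkers = 10000, steps = 100;
    double x2Sum = 0;
    for(int w = 0; w<walkers; w++) {
      int x = 0;
      for(int n = 0; n<steps; n++) {
        x += (Math.random()<0.5) ? 1 : -1; // unbiased step
      }
      x2Sum += x*x;
    }
    // the diffusion equation predicts <x^2> = 2Dt with D = 1/2
    System.out.println("<x^2> = "+x2Sum/walkers+", 2Dt = "+steps);
  }
}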
An indirect method of solving (7.80) numerically is to use a Monte Carlo method; that is, replace the partial differential equation (7.80) by a corresponding random walk on a lattice with discrete time steps. Because the asymptotic behavior of the partial differential equation and the random walk model are equivalent, this approach uses the Monte Carlo technique as a method of numerical analysis. In contrast, if our goal is to understand a random walk lattice model directly, the Monte Carlo technique is a simulation method. The difference between simulation and numerical analysis is sometimes in the eyes of the beholder.

Problem 7.43. Biased random walk

Show that the form of the differential equation satisfied by P(x,t) corresponding to a random walk with a drift, that is, a walk with p ≠ q, is

	∂P(x,t)/∂t = D ∂²P(x,t)/∂x² − v ∂P(x,t)/∂x.  (7.81)

How is v related to p and q?

References and Suggestions for Further Reading

Daniel J. Amit, G. Parisi, and L. Peliti, "Asymptotic behavior of the 'true' self-avoiding walk," Phys. Rev. B 27, 1635–1645 (1983).

Panos Argyrakis, "Simulation of diffusion-controlled chemical reactions," Computers in Physics 6, 525–579 (1992).

G. T. Barkema, Parthapratim Biswas, and Henk van Beijeren, "Diffusion with random distribution of static traps," Phys. Rev. Lett. 87, 170601 (2001).

J. M. Bernardo and A. F. M. Smith, Bayesian Theory (John Wiley & Sons, 1994). Bayes' theorem is stated concisely on page 2.

J. Bernasconi and L. Pietronero, "True self-avoiding walk in one dimension," Phys. Rev. B 29, 5196–5198 (1984). The authors present results for the exponent ν accurate to 1%.

Philip R. Bevington and D. Keith Robinson, Data Reduction and Error Analysis for the Physical Sciences, 3rd ed. (McGraw-Hill, 2003).

I. Carmesin and Kurt Kremer, "The bond fluctuation model: A new effective algorithm for the dynamics of polymers in all spatial dimensions," Macromolecules 21, 2819–2823 (1988). The bond fluctuation model is an efficient method for simulating the dynamics of polymer chains and would be the basis of an excellent project.

S. Chandrasekhar, "Stochastic problems in physics and astronomy," Rev. Mod. Phys. 15, 1–89 (1943). This article is reprinted in N. Wax, Selected Papers on Noise and Stochastic Processes (Dover, 1954).

William S. Cleveland and Robert McGill, "Graphical perception and graphical methods for analyzing scientific data," Science 229, 828–833 (1985). There is more to analyzing data than least squares fits.

Mohamed Daoud, "Polymers," Chapter 6 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

Roan Dawkins and Daniel ben-Avraham, "Computer simulations of diffusion-limited reactions," Comput. Sci. Eng. 3 (1), 72–76 (2001).

R. Everaers, I. S. Graham, and M. J. Zuckermann, "End-to-end distance and asymptotic behavior of self-avoiding walks in two and three dimensions," J. Phys. A 28, 1271–1293 (1995).

Jesper Ferkinghoff-Borg, Mogens H. Jensen, Joachim Mathiesen, Poul Olesen, and Kim Sneppen, "Competition between diffusion and fragmentation: An important evolutionary process of nature," Phys. Rev. Lett. 91, 266103 (2003). The results of the model were compared with experimental data on ice crystal sizes and the length distribution of α helices in proteins.

Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics (Addison-Wesley, 1963). See Vol. 1, Chapter 26 for a discussion of the principle of least time and Vol.
2, Chapter 19 for a discussion of the principle of least action.

Pierre-Gilles de Gennes, Scaling Concepts in Polymer Physics (Cornell University Press, 1979). A difficult but important text.

Peter Grassberger, "Pruned-enriched Rosenbluth method: Simulations of θ polymers of chain length up to 1 000 000," Phys. Rev. E 56, 3682–3693 (1997).

Shlomo Havlin and Daniel ben-Avraham, "Diffusion in disordered media," Adv. Phys. 36, 695 (1987). Section 7 of this review article discusses trapping and diffusion-limited reactions. Also see Daniel ben-Avraham and Shlomo Havlin, Diffusion and Reactions in Fractals and Disordered Systems (Cambridge University Press, 2001).

Shlomo Havlin, George H. Weiss, James E. Kiefer, and Menachem Dishon, "Exact enumeration of random walks with traps," J. Phys. A: Math. Gen. 17, L347 (1984). The authors discuss a method based on exact enumeration for calculating the survival probability of random walkers on a lattice with randomly distributed traps.

Brian Hayes, "How to avoid yourself," Am. Scientist 86 (4), 314–319 (1998).

Z. Jiang and C. Ebner, "Simulation study of reaction fronts," Phys. Rev. A 42, 7483–7486 (1990).

Peter R. Keller and Mary M. Keller, Visual Cues (IEEE Press, 1993). A well-illustrated book on data visualization techniques.

Donald E. Knuth, Seminumerical Algorithms, 2nd ed., Vol. 2 of The Art of Computer Programming (Addison-Wesley, 1981). The standard reference on random number generators.

Bruce MacDonald, Naeem Jan, D. L. Hunter, and M. O. Steinitz, "Polymer conformations through 'wiggling'," J. Phys. A 18, 2627–2631 (1985). A discussion of the pivot algorithm summarized in Project 7.41. Also see Tom Kennedy, "A faster implementation of the pivot algorithm for self-avoiding walks," J. Stat. Phys. 106, 407–429 (2002).

Vishal Mehra and Peter Grassberger, "Trapping reaction with mobile traps," Phys. Rev. E 65, 050101(R) (2002). This paper discusses the model of a single walker moving on a lattice of traps.

Elliott W. Montroll and Michael F. Shlesinger, "On the wonderful world of random walks," in Nonequilibrium Phenomena II: From Stochastics to Hydrodynamics, J. L. Lebowitz and E. W. Montroll, editors (North-Holland Press, 1984). The first part of this delightful review article chronicles the history of the random walk.

M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999). This book has a good section on random number generators.

Daniele Passerone and Michele Parrinello, "Action-derived molecular dynamics in the study of rare events," Phys. Rev. Lett. 87, 108302 (2001). This paper describes a deterministic algorithm for finding extrema of the action. Also see D. Passerone, M. Ceccarelli, and M. Parrinello, J. Chem. Phys. 118, 2025–2032 (2003).

John E. Pearson, "Complex patterns in a simple system," Science 261, 189–192 (1993), or arXiv:patt-sol/9304003. See also P. Gray and S. K. Scott, "Sustained oscillations and other exotic patterns of behavior in isothermal reactions," J. Phys. Chem. 89, 22–32 (1985).

Thomas Prellberg, "Scaling of self-avoiding walks and self-avoiding trails in three dimensions," J. Phys. A 34, L599–L602 (2001). The author estimates that ν ≈ 0.5874(2) for the self-avoiding walk in three dimensions.

Thomas Prellberg and Jaroslaw Krawczyk, "Flat histogram version of the pruned and enriched Rosenbluth method," Phys. Rev. Lett. 92, 120602 (2004). The authors discuss an improved algorithm for simulating self-avoiding walks.

William H. Press, Saul A.
Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). This classic book is available online. See Chapter 15 for a general discussion of the modeling of data, including general linear least squares and nonlinear fits, and Chapter 19 for a discussion of the Crank-Nicolson method for solving diffusion-type partial differential equations.

Sidney Redner, A Guide to First-Passage Processes (Cambridge University Press, 2001).

Sidney Redner and Francois Leyvraz, "Kinetics and spatial organization of competitive reactions," Chapter 7 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1965). This popular text on statistical physics has a good discussion on random walks (Chapter 1) and diffusion (Chapter 12).

F. Reif, Statistical Physics, Berkeley Physics Course, Vol. 5 (McGraw-Hill, 1965). Chapter 2 introduces random walks.

Marshall N. Rosenbluth and Arianna W. Rosenbluth, "Monte Carlo calculation of the average extension of molecular chains," J. Chem. Phys. 23, 356–359 (1955). One of the first Monte Carlo calculations of the self-avoiding walk.

Joseph Rudnick and George Gaspari, Elements of the Random Walk (Cambridge University Press, 2004). A graduate level text, but parts are accessible to undergraduates.

David Ruelle, Chance and Chaos (Princeton University Press, 1991). A nontechnical introduction to chaos theory that discusses the relation of chaos to randomness.

Charles Ruhla, The Physics of Chance (Oxford University Press, 1992). A delightful book on probability in many contexts.

Andreas Ruttor, Georg Reents, and Wolfgang Kinzel, "Synchronization of random walks with reflecting boundaries," J. Phys. A: Math. Gen. 37, 8609–8618 (2004).

G. L. Squires, Practical Physics, 4th ed. (Cambridge University Press, 2001). An excellent text on the design of experiments and the analysis of data.

John R. Taylor, An Introduction to Error Analysis, 2nd ed. (University Science Books, 1997).

Edward R. Tufte, The Visual Display of Quantitative Information (Graphics Press, 1983) and Envisioning Information (Graphics Press, 1990). Also see Tufte's Web site.

I. Vattulainen, T. Ala-Nissila, and K. Kankaala, "Physical tests for random numbers in simulations," Phys. Rev. Lett. 73, 2513 (1994) and "Physical models as tests of randomness," Phys. Rev. E 52, 3205–3214 (1995). Also see Vattulainen's Web site, which has some useful programs.

Peter H. Verdier and W. H. Stockmayer, "Monte Carlo calculations on the dynamics of polymers in dilute solution," J. Chem. Phys. 36, 227–235 (1962).

Frederick T. Wall and Frederic Mandel, "Macromolecular dimensions obtained by an efficient Monte Carlo method without sample attrition," J. Chem. Phys. 63, 4592–4595 (1975). An exposition of the reptation method.

George H. Weiss, "A primer of random walkology," Chapter 5 in Armin Bunde and Shlomo Havlin, editors, Fractals in Science (Springer-Verlag, 1994).

George H. Weiss and Shlomo Havlin, "Trapping of random walks on the line," J. Stat. Phys. 37, 17–25 (1984). The authors discuss an analytic approach to the asymptotic behavior of one-dimensional random walkers with randomly placed traps.

George H. Weiss and Robert J. Rubin, "Random walks: Theory and selected applications," Adv. Chem. Phys. 52, 363–503 (1983).
In spite of its research orientation, much of this review article can be understood by well-motivated students.

Charles A. Whitney, Random Processes in Physical Systems (John Wiley and Sons, 1990). An excellent introduction to random processes with many applications to astronomy.

Robert S. Wolff and Larry Yaeger, Visualization of Natural Phenomena (Springer-Verlag, 1993).

Chapter 8
The Dynamics of Many-Particle Systems

We simulate the dynamical behavior of many-particle systems, such as dense gases, liquids, and solids, and observe their qualitative features. Some of the basic ideas of equilibrium statistical mechanics and kinetic theory are introduced.

8.1 Introduction

Given our knowledge of the laws of physics at the microscopic level, how can we understand the behavior of gases, liquids, and solids and more complex systems such as polymers and proteins? For example, consider two cups of water prepared under similar conditions. Each cup contains approximately 10^{25} molecules which mutually interact and, to a good approximation, move according to the laws of classical physics. Although the intermolecular forces produce a complicated trajectory for each molecule, the observable properties of the water in each cup are indistinguishable and are easy to describe. For example, the temperature of the water in each cup is independent of time even though the positions and velocities of the individual molecules are changing continually.

One way to understand the behavior of a classical many-particle system is to simulate the trajectory of each particle. This approach, known as molecular dynamics, has been applied to systems of up to 10^9 particles and has given us much insight into a variety of systems in which the particles obey the laws of classical dynamics.

A calculation of the trajectories of many particles would not be very useful unless we knew the right questions to ask. Saving these trajectories would quickly fill up any storage medium, and we do not usually care about the trajectory of any particular particle. What are the useful quantities needed to describe these many-particle systems? What are the essential characteristics and regularities they exhibit? Questions such as these are addressed by statistical mechanics, and some of the ideas of statistical mechanics are discussed in this chapter. However, the only background needed for this chapter is a knowledge of Newton's laws of motion.

8.2 The Intermolecular Potential

The first step is to specify the model system we wish to simulate. We assume that the dynamics can be treated classically, that the molecules are spherical and chemically inert and their internal structure can be ignored, and that the interaction between any pair of particles depends only on the distance between them. In this case the total potential energy U is a sum of two-particle interactions:

	U = u(r_{12}) + u(r_{13}) + ··· + u(r_{23}) + ··· = Σ_{i=1}^{N-1} Σ_{j=i+1}^{N} u(r_{ij})  (8.1)

where u(r_{ij}) depends only on the magnitude of the distance r_{ij} between particles i and j. The pairwise interaction form (8.1) is appropriate for simple liquids such as liquid argon. The form of u(r) for electrically neutral molecules can be constructed by a first principles quantum mechanical calculation. Such a calculation is very difficult, and it is usually sufficient to choose a simple phenomenological form for u(r). The most important features of u(r) are a strong repulsion for small r and a weak attraction at large r.
The repulsion for small r is a consequence of the Pauli exclusion principle. That is, the electron wave functions of two molecules must distort to avoid overlap, causing some of the electrons to be in different quantum states. The net effect is an increase in kinetic energy and an effective repulsive interaction between the electrons, known as core repulsion. The weak attraction that dominates at larger r is due to the mutual polarization of each molecule; the resultant attractive potential is called the van der Waals potential. One of the most common phenomenological forms of u(r) is the Lennard-Jones potential:

	u(r) = 4ε[(σ/r)^{12} − (σ/r)^6].  (8.2)

A plot of the Lennard-Jones potential is shown in Figure 8.1. The r^{-12} form of the repulsive part of the interaction was chosen for convenience only and has no fundamental significance. The attractive 1/r^6 behavior at large r corresponds to the van der Waals interaction. The Lennard-Jones potential is parameterized by a length σ and an energy ε. Note that u(r) = 0 at r = σ and that u(r) is close to zero for r > 2.5σ. The parameter ε is the depth of the potential at the minimum of u(r); the minimum occurs at a separation r = 2^{1/6}σ.

Problem 8.1. Qualitative properties of the Lennard-Jones interaction

Write a short program to plot the Lennard-Jones potential (8.2) and the magnitude of the corresponding force:

	f(r) = −∇u(r) = (24ε/r)[2(σ/r)^{12} − (σ/r)^6] r̂.  (8.3)

(A sketch of such a computation in reduced units is given at the end of Section 8.3.) At what value of r is the force equal to zero? For what values of r is the force repulsive? What is the value of u(r) for r = 0.8σ? How much does u increase if r is decreased to r = 0.72σ, a 10% change in r? What is the value of u at r = 2.5σ?

8.3 Units

As usual, it is convenient to choose units so that the computed quantities are neither too small nor too large. Because the values of the distance and the energy associated with typical liquids are very small in SI units, we choose the Lennard-Jones parameters σ and ε as the units of distance and energy, respectively. We also choose the unit of mass to be the mass of one atom, m. We can express all other quantities in terms of σ, ε, and m. For example, we measure velocities in units of (ε/m)^{1/2} and time in units of σ(m/ε)^{1/2}. The values of σ, ε, and m for argon are given in Table 8.1. If we use these values, we find that the unit of time is 2.17 × 10^{-12} s. The units of some of the other physical quantities of interest are also shown in Table 8.1.

All program variables are in reduced units; for example, the time in our molecular dynamics program is expressed in units of σ(m/ε)^{1/2}. Suppose that we run our molecular dynamics program for 2000 time steps with a time step ∆t = 0.01. The total time of our run is 2000 × 0.01 = 20 in reduced units, or 4.34 × 10^{-11} s for argon (see Table 8.1). The duration of a typical molecular dynamics simulation is in the range of 10–10^4 in reduced units, corresponding to a duration of approximately 10^{-11}–10^{-8} s. The longest practical runs are of the order of 10^{-6} s.
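The evaluation of (8.2) and (8.3) in reduced units (σ = ε = 1) is shown in the following minimal sketch, which can be used for Problem 8.1. The class and method names are our own choices:

public class LennardJones {
  // potential energy u(r) of (8.2) in reduced units (sigma = epsilon = 1)
  public static double potential(double r) {
    double r6i = 1.0/(r*r*r*r*r*r); // (sigma/r)^6
    return 4*(r6i*r6i-r6i);
  }

  // magnitude of the force (8.3) in reduced units
  public static double force(double r) {
    double r6i = 1.0/(r*r*r*r*r*r);
    return 24*(2*r6i*r6i-r6i)/r;
  }

  public static void main(String[] args) {
    for(double r = 0.8; r<=2.5; r += 0.1) {
      System.out.println(r+" "+potential(r)+" "+force(r));
    }
  }
}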
8.4 The Numerical Algorithm

Now that we have specified the interaction between the particles, we need to introduce a numerical method for computing the trajectory of each particle. As we have learned, the criteria for a good numerical integration method include that it conserve the phase-space volume and be consistent with the known conservation laws, that it be time reversible, and that it be accurate for relatively large time steps so as to reduce the CPU time needed for the total time of the simulation. These requirements mean that we should use a symplectic algorithm for the relatively long times of interest in molecular dynamics simulations. We adopt the commonly used second-order algorithm:

	x_{n+1} = x_n + v_n ∆t + ½ a_n (∆t)²  (8.4a)
	v_{n+1} = v_n + ½ (a_{n+1} + a_n) ∆t.  (8.4b)

	Quantity      Unit            Value for Argon
	length        σ               3.4 × 10^{-10} m
	energy        ε               1.65 × 10^{-21} J
	mass          m               6.69 × 10^{-26} kg
	time          σ(m/ε)^{1/2}    2.17 × 10^{-12} s
	velocity      (ε/m)^{1/2}     1.57 × 10^2 m/s
	force         ε/σ             4.85 × 10^{-12} N
	pressure      ε/σ²            1.43 × 10^{-2} N·m^{-1}
	temperature   ε/k             120 K

Table 8.1: The system of units used in the molecular dynamics simulations of particles interacting via the Lennard-Jones potential. The numerical values of σ, ε, and m are for argon. The quantity k is Boltzmann's constant and has the value k = 1.38 × 10^{-23} J/K. The unit of pressure is for a two-dimensional system.

To simplify the notation, we have written the algorithm for only one component of the particle's motion. The new position is used to find the new acceleration a_{n+1}, which is used together with a_n to obtain the new velocity v_{n+1}. The algorithm represented by (8.4) is known as the Verlet (or sometimes the velocity Verlet) algorithm (see Appendix 3A). We will use the Verlet implementation of the ODESolver interface to implement the algorithm. Thus, the x, vx, y, and vy values for the ith particle will be stored in the state array at state[4*i], state[4*i+1], state[4*i+2], and state[4*i+3], respectively.
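Written out for a single degree of freedom, one step of (8.4) looks as follows. This sketch is our own; it uses a harmonic oscillator acceleration a(x) = −x as a stand-in, whereas in the molecular dynamics program the accelerations of all particles are computed together and the OSP Verlet class handles these details:

public class VerletStep {
  static double x = 1, v = 0;   // position and velocity

  static double a(double x) {   // example: harmonic oscillator
    return -x;
  }

  public static void main(String[] args) {
    double dt = 0.01;
    for(int n = 0; n<1000; n++) {
      double aOld = a(x);
      x += v*dt+0.5*aOld*dt*dt; // (8.4a): new position
      double aNew = a(x);       // acceleration at the new position
      v += 0.5*(aOld+aNew)*dt;  // (8.4b): new velocity
    }
    System.out.println("x = "+x+", v = "+v);
  }
}

Note that the acceleration is evaluated only once per step: the value aNew computed here becomes aOld on the next iteration.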
8.5 Periodic Boundary Conditions

A useful simulation must incorporate as many of the relevant features of the physical system of interest as possible. Usually we want to simulate a gas, liquid, or solid in the bulk, that is, systems of at least N ∼ 10^{23} particles. In such systems the fraction of particles near the walls of the container is negligibly small. The number of particles that can be studied in a molecular dynamics simulation is typically 10^3–10^5, although we can simulate on the order of 10^9 particles using clusters of computers. For these relatively small systems, the fraction of particles near the walls of the container is significant, and hence the behavior of such a system would be dominated by surface effects.

The most common way of minimizing surface effects and of simulating more closely the properties of a bulk system is to use what are known as periodic boundary conditions, although the minimum image approximation would be a more accurate name. This boundary condition is familiar to anyone who has played the Pac-Man computer game. Consider N particles that are constrained to move on a line of length L. The application of periodic boundary conditions is equivalent to considering the line to be a circle, and hence the maximum separation between any two particles is L/2 (see Figure 8.2). The generalization of periodic boundary conditions to two dimensions is equivalent to imagining a box with opposite edges joined so that the box becomes the surface of a torus (the shape of a doughnut or a bagel). The three-dimensional version of periodic boundary conditions cannot be visualized easily, but the same methods can be used.

Figure 8.2: (a) Two particles at x = 0 and x = 3 on a line of length L = 4; the distance between the particles is 3. (b) The application of periodic boundary conditions for short range interactions is equivalent to thinking of the line as forming a circle of circumference L. In this case the minimum distance between the two particles is 1.

The implementation of periodic boundary conditions is straightforward. If a particle leaves the box by crossing a boundary in a particular direction, we add or subtract the length L of the box in that direction to the position. One simple way is to use an if-else statement as shown:

Listing 8.1: Calculation of the position of a particle in the central cell.

private double pbcPosition(double s, double L) {
  if(s>L) {
    s -= L;
  } else if(s<0) {
    s += L;
  }
  return s;
}

To compute the minimum distance ds in a particular direction between two particles, we can use the method pbcSeparation (see Figure 8.2):

Listing 8.2: Calculation of the minimum separation.

private double pbcSeparation(double ds, double L) {
  if(ds>0.5*L) {
    ds -= L;
  } else if(ds<-0.5*L) {
    ds += L;
  }
  return ds;
}

The equivalent static methods, PBC.position and PBC.separation, in the Open Source Physics numerics package can also be used.

Exercise 8.2. Use of the % operator

(a) Another way to compute the position of a particle in the central cell is to use the % (modulus) operator. For example, 17 % 5 equals 2 because 17 divided by 5 leaves a remainder of 2. The % operator can also be used with floating point numbers. For example, 10.2 % 5 = 0.2. Write a little test program to see how the % operator works and determine the result of 10.2 % 3.3, -10.2 % 3.3, 10.2 % -3.3, and -10.2 % -3.3. In what way does % act like a remainder operator?

(b) From the results of part (a) we might consider writing x = x % L as an alternative to Listing 8.1. What about negative values of x? In this case -17 % 5 = -2. Because we want the resultant position to be positive, we write

	return x<0 ? x%L+L : x%L;

Explain this syntax and write a program to test whether this statement works as claimed.

(c) Write a simple program to determine whether the % operator is faster than the if-else construction in Listing 8.1. Write another program that compares the speed of calling the PBC.position method to that of inlining the PBC code; in other words, replace the method call by the above statement.

We now discuss the nature of periodic boundary conditions. Imagine a set of N particles in a two-dimensional box or cell. The use of periodic boundary conditions implies that the central cell is duplicated an infinite number of times to fill the space. Figure 8.3 shows the first several image cells for N = 2. The shape of the central cell must be such that the cell fills space under successive translations. Each image cell contains the original particles in the same relative positions as in the central cell. That is, periodic boundary conditions yield an infinite system, although the positions of the particles in the image cells are identical to the positions of the particles in the central cell. These boundary conditions also imply that every point in the cell is equivalent and that there is no surface. As a particle moves in the original cell, its periodic images move in the image cells. Hence, only the motion of the particles in the central cell needs to be followed.
When a particle enters or leaves the central cell, the move is accompanied by an image of that particle leaving or entering a neighboring cell through the opposite face. The total force on a given particle i is due to the force from every other particle j within the central cell and from the periodic images of particle j. That is, if particle i interacts with particle j in the central cell, then particle i interacts with all the periodic replicas of particle j. Hence, in general, there are an infinite number of contributions to the force on any given particle. For long-range interactions such as the Coulomb potential, these contributions have to be included using special methods. For short-range interactions, we can reduce the number of contributions by adopting the minimum image approximation, which assumes that particle i in the central cell interacts only with the nearest image of particle j; the interaction is set equal to zero if the distance of the image from particle i is greater than L/2. An example of the minimum image approximation is shown in Figure 8.3.

Figure 8.3: Example of the minimum image approximation in two dimensions. The minimum image distance convention implies that the separation between particles 1 and 2 is given by the lesser of the two distances shown.

8.6 A Molecular Dynamics Program

In this section we develop a molecular dynamics program to simulate a two-dimensional system of particles interacting via the Lennard-Jones potential. We choose two rather than three dimensions because it is easier to visualize the results and the calculations are not as time consuming. In principle, we could define a class for a particle and instantiate an object for each particle. However, this use would be very inefficient and would take up more memory and CPU time than using one class to represent all N particles. Instead we will store the x- and y-components of the positions and velocities in the state array and store the accelerations of the particles in a separate array. As usual, we will develop two classes, LJParticles and LJParticlesApp.

Because the system is deterministic, the nature of the motion is determined by the initial conditions. An appropriate choice of the initial conditions is more difficult than might first appear. For example, how can we choose the initial configuration (a set of positions and velocities) to correspond to a liquid at a desired temperature? According to the equipartition theorem, the mean kinetic energy of a particle per degree of freedom is kT/2, where k is Boltzmann's constant and T is the temperature. We can generalize this relation to define the temperature at time t by

	kT(t) = (2/d) K(t)/N = (1/Nd) Σ_{i=1}^{N} m_i v_i(t) · v_i(t)  (8.5)

where K is the total kinetic energy of the system, v_i is the velocity of particle i with mass m_i, and d is the spatial dimension of the system. We can use (8.5) to choose an initial set of velocities. The following method gives the particles a random set of velocities, sets the total velocity (momentum) to zero, and then rescales the velocities so that the desired initial kinetic energy is achieved.

Listing 8.3: Method for choosing the initial velocities.

public void setVelocities() {
  double vxSum = 0.0;
  double vySum = 0.0;
  // the remainder of the listing is not reproduced here
}
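A minimal sketch of the missing body of this method, following the description just given (assign random velocities, subtract the center of mass velocity so the total momentum is zero, then rescale to the desired kinetic energy per particle), is shown below. This is our own sketch and is not necessarily identical to the class used in the text; it assumes unit masses and separate velocity arrays:

public class VelocityInitializer {
  // random velocities, zero total momentum, then rescaling to the desired
  // kinetic energy per particle (m = 1 in reduced units)
  public static void setVelocities(double[] vx, double[] vy, double kePerParticle) {
    int N = vx.length;
    double vxSum = 0, vySum = 0;
    for(int i = 0; i<N; i++) {  // assign random initial velocities
      vx[i] = Math.random()-0.5;
      vy[i] = Math.random()-0.5;
      vxSum += vx[i];
      vySum += vy[i];
    }
    double vxCM = vxSum/N;      // center of mass velocity components
    double vyCM = vySum/N;
    double v2Sum = 0;
    for(int i = 0; i<N; i++) {  // zero the total momentum
      vx[i] -= vxCM;
      vy[i] -= vyCM;
      v2Sum += vx[i]*vx[i]+vy[i]*vy[i];
    }
    // kinetic energy per particle is v2Sum/(2N) before rescaling
    double rescale = Math.sqrt(2*N*kePerParticle/v2Sum);
    for(int i = 0; i<N; i++) {
      vx[i] *= rescale;
      vy[i] *= rescale;
    }
  }
}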
println ( "Heat capacity = " +decimalFormat . format (md. getHeatCapacity ( ) ) ) ; CHAPTER 8. THE DYNAMICS OF MANY-PARTICLE SYSTEMS 268 control . println ( " = " +decimalFormat . format (md. getMeanPressure ( ) ) ) ; } public void startRunning ( ) { md. dt = control . getDouble ( "dt" ) ; double Lx = control . getDouble ( "Lx" ) ; double Ly = control . getDouble ( "Ly" ) ; i f ( ( Lx!=md. Lx ) | | ( Ly!=md. Ly ) ) { md. Lx = Lx ; md. Ly = Ly ; md. computeAcceleration ( ) ; display . setPreferredMinMax (0 , Lx , 0 , Ly ) ; resetData ( ) ; } } public void reset ( ) { control . setValue ( "nx" , 8 ) ; control . setValue ( "ny" , 8 ) ; control . setAdjustableValue ( "Lx" , 2 0 . 0 ) ; control . setAdjustableValue ( "Ly" , 1 5 . 0 ) ; control . setValue ( "initial kinetic energy per particle" , 1 . 0 ) ; control . setAdjustableValue ( "dt" , 0 . 0 1 ) ; control . setValue ( "initial configuration" , "rectangular" ) ; enableStepsPerDisplay ( true ) ; / / draw c o n f i g u r a t i o n s every 10 s t e p s super . setStepsPerDisplay ( 1 0 ) ; / / so p a r t i c l e s w i l l appear as c i r c u l a r d i s k s display . setSquareAspect ( true ) ; } public void resetData ( ) { md. resetAverages ( ) ; / / c l e a r s old data from the p l o t frames GUIUtils . clearDrawingFrameData ( false ) ; } public s t a t i c XML. ObjectLoader getLoader ( ) { return new LJParticlesLoader ( ) ; } public s t a t i c void main ( String [ ] args ) { SimulationControl control = SimulationControl . createApp ( new LJParticlesApp ( ) ) ; control . addButton ( "resetData" , "Reset Data" ) ; } } Listing 8.12: The LJParticlesLoader class for saving configurations. package org . opensourcephysics . sip . ch08 .md; import org . opensourcephysics . controls . ; import org . opensourcephysics . display . GUIUtils ; CHAPTER 8. THE DYNAMICS OF MANY-PARTICLE SYSTEMS 269 public class LJParticlesLoader implements XML. ObjectLoader { public Object createObject ( XMLControl element ) { return new LJParticlesApp ( ) ; } public void saveObject ( XMLControl control , Object obj ) { LJParticlesApp model = ( LJParticlesApp ) obj ; control . setValue ( "initial_configuration" , model .md. initialConfiguration ) ; control . setValue ( "state" , model .md. s t a t e ) ; } public Object loadObject ( XMLControl control , Object obj ) { / / GUI has been loaded with the saved values ; now r e s t o r e the LJ s t a t e LJParticlesApp model = ( LJParticlesApp ) obj ; / / reads values from the GUI into the LJ model model . i n i t i a l i z e ( ) ; model .md. initialConfiguration = control . getString ( "initial_configuration" ) ; model .md. s t a t e = ( double [ ] ) control . getObject ( "state" ) ; int N = ( model .md. s t a t e . length −1)/4; model .md. ax = new double [N] ; model .md. ay = new double [N] ; model .md. computeAcceleration ( ) ; model .md. resetAverages ( ) ; / / c l e a r s old data from the p l o t frames GUIUtils . clearDrawingFrameData ( false ) ; return obj ; } } Problem 8.3. Approach to equilibrium (a) Consider N = 64 particles interacting via the Lennard–Jones potential in a square central cell of linear dimension L = 10. Start the system on a square lattice with an initial temperature corresponding to T = 1.0. Let ∆t = 0.01 and run the application to make sure that it is working properly. The total energy should be approximately conserved and the trajectories of all 64 particles should be seen on the screen. (b) The kinetic temperature of the system is given by (8.5). 
View the evolution of the temperature of the system starting from the initial temperature. Does the temperature reach an equilibrium value? That is, does it eventually fluctuate about some mean value? What is the mean value of the temperature for the given total energy of the system?

(c) Modify method setRectangularLattice so that all the particles are initially on the left side of a box of linear dimensions Lx = 20 and Ly = 10. Does the system become more or less random as time increases?

(d) Modify the program so that it computes n(t), the number of particles in the left half of the cell, and plot n(t) as a function of t (a counting routine is sketched after Problem 8.4). What is the qualitative behavior of n(t)? What is the mean number of particles in the left half after the system has reached equilibrium? Compare your qualitative results with the results you found in Problem 7.2.

Figure 8.4: Example of a special initial condition; the arrows represent the magnitude and the direction of each particle's velocity.

Problem 8.4. Sensitivity to initial conditions

(a) Modify your program to consider the following initial condition corresponding to N = 11 particles moving in the same direction with the same velocity (see Figure 8.4). Choose Lx = Ly = 10 and Δt = 0.01.

for(int i = 0; i<N; i++) {
  x[i] = Lx/2;
  y[i] = (i-0.5)*Ly/N;
  vx[i] = 1;
  vy[i] = 0;
}

Does the system eventually reach equilibrium? Why or why not?

(b) Change the velocity of particle 6 so that vx(6) = 0.99999 and vy(6) = 0.00001. Is the behavior of the system qualitatively different than in part (a)? Does the system eventually reach equilibrium? Are the trajectories of the particles sensitive to the initial conditions? Explain why this behavior implies that almost all initial states lead to the same qualitative behavior (for a given total energy).

(c) Modify LJParticlesApp so that the application runs for a predetermined time interval, such as 100 time steps, and then continues with the time-reversed process, that is, the motion that would occur if the direction of time were reversed. This reversal is equivalent to letting v → −v for all particles or letting Δt → −Δt. Do the particles return to their original positions? What happens if you reverse the velocities at a later time? What happens if you choose a smaller value of Δt?

(d) Explain why you can conclude that the system is chaotic. Are the computed trajectories the same as the "true" trajectories?
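For Problem 8.3(d), the following is a minimal sketch of a method that counts the particles in the left half of the cell. It assumes the state-array layout of LJParticles used in this chapter (the x coordinate of particle i is state[4i]); the method name is illustrative and is not part of the book's code.

public int numberInLeftHalf() {
  // counts particles whose x coordinate is in the left half of the cell
  int n = 0;
  for(int i = 0; i<N; i++) {
    if(state[4*i]<0.5*Lx) { // x coordinate of particle i
      n++;
    }
  }
  return n;
}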
From Problems 8.3 and 8.4, we see that from the microscopic point of view, the trajectories appear rather complex. In contrast, from the macroscopic point of view, the system can be described more simply. For example, in Problem 8.3 we described the approach of the system to equilibrium by specifying n(t), the number of particles in the left half of the cell at time t. Your observations of the macroscopic variable n(t) should be consistent with the following two general properties of systems of many particles:

1. After the removal of an internal constraint, an isolated system changes in time from a "less random" to a "more random" state.

2. A system whose macroscopic state is independent of time is said to be in equilibrium. The equilibrium macroscopic state is characterized by relatively small fluctuations about a mean that is independent of time. The relative fluctuations become smaller as the number of particles becomes larger.

In Problems 8.3b and 8.3c we found that the particles filled the box and did not return to their initial configuration. Hence, we were able to define a direction of time. This direction becomes better defined if we consider more particles. Note that Newton's laws of motion are time reversible, and there is no a priori reason that gives the time a preferred direction.

Before we consider other macroscopic quantities, we need to monitor the total energy and verify our claim that the Verlet algorithm maintains conservation of energy with a reasonable choice of Δt. We also introduce a check for momentum conservation.

Problem 8.5. Tests of the Verlet algorithm

(a) One essential check of a molecular dynamics program is that the total energy be conserved to the desired accuracy. Determine the value of Δt necessary for the total energy to be conserved to a given accuracy over a time interval of t = 2. One way is to compute ΔE_max(t), the maximum value of the difference |E(t) − E(0)| over the time interval t, where E(0) is the initial total energy and E(t) is the total energy at time t. Verify that ΔE_max(t) decreases when Δt is made smaller for fixed t. If your application is working properly, ΔE_max(t) should decrease as approximately (Δt)² because the Verlet algorithm is a second-order algorithm.

(b) A simple way of monitoring how well the program is conserving the total energy is to use a least squares fit of the time series of E(t) to a straight line. The slope of the line can be interpreted as the drift, and the root mean square deviation from the straight line can be interpreted as the noise (σ_y in the notation of Section 7.6). How do the drift and the noise depend on Δt for a fixed time interval t? Most research applications conserve the energy to 1 part in 10⁴ or better over the duration of the run.

(c) Because of the use of periodic boundary conditions, all points in the central cell are equivalent and the system is translationally invariant. As you might have learned, translational invariance implies that the total linear momentum is conserved. However, floating point error and the truncation error associated with a finite difference algorithm can cause the total linear momentum to drift. Programming errors might also be detected by checking for conservation of momentum. Hence, it is a good idea to monitor the total linear momentum at regular intervals and reset the total momentum equal to zero if necessary. The method setVelocities in Listing 8.3 chooses the velocities so that the total momentum is initially zero. Add a method that resets the total momentum to zero and call it at regular intervals, for example, every 1000–10,000 time steps. How well does class LJParticles conserve the total linear momentum for Δt = 0.01?

8.7 Thermodynamic Quantities

In the following, we discuss how some of the macroscopic quantities of interest, such as the temperature and the pressure, can be related to time averages over the phase space trajectories of the particles.

We have already introduced the definition of the kinetic temperature in (8.5). The temperature that we measure in a laboratory experiment is the mean temperature, which corresponds to the time average of T(t) over many configurations of the particles. For two dimensions (d = 2), we write the mean temperature $\overline{T}$ as

\[ k\overline{T} = \frac{1}{2N}\sum_{i=1}^{N} \overline{m_i\,\mathbf{v}_i(t)\cdot\mathbf{v}_i(t)} \qquad \text{(two dimensions)} \tag{8.6} \]

where $\overline{X}$ denotes the time average of X(t).
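In a program, such a time average is typically accumulated as a running sum over the configurations generated during the run. A minimal sketch, with illustrative accumulator names:

// accumulate the time (step) average of the kinetic temperature in Eq. (8.6);
// temperatureSum and steps are hypothetical accumulator fields, and
// getInstantaneousTemperature() implements Eq. (8.5)
temperatureSum += getInstantaneousTemperature();
steps++;
double meanTemperature = temperatureSum/steps; // estimate of the mean temperature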
The relation (8.6) is an example of the relation of a macroscopic quantity (the mean temperature) to a time average over the trajectories of the particles. (This definition of temperature is not adequate for particles moving relativistically or if quantum mechanics is important.)

The relation (8.5) holds only if the momentum of the center of mass of the system is zero; we do not want the motion of the center of mass to change the temperature. In a laboratory system, the walls of the container ensure that the center of mass motion is zero (if the mean momentum of the walls is zero). In our simulation, we impose the constraint that the center of mass momentum (in each of the d directions) be zero. Consequently, the system has dN − d independent velocity components rather than dN components, and we should replace (8.6) by

\[ k\overline{T} = \frac{1}{(N-1)d}\sum_{i=1}^{N} \overline{m_i\,\mathbf{v}_i(t)\cdot\mathbf{v}_i(t)} \qquad \text{(correction for fixed center of mass)}. \tag{8.7} \]

The presence of the factor (N − 1)d rather than Nd in (8.7) is an example of a finite size correction that becomes unimportant for large N. We shall ignore this correction in the following.

Another macroscopic quantity of interest is the mean pressure. The pressure is related to the force per unit area normal to an imaginary surface in the system. By Newton's second law, this force is related to the momentum that crosses the surface per unit time. We could use this relation to determine the pressure, but this relation uses information only from the fraction of particles that are crossing an arbitrary surface at a given time. Instead, we will use the relation of the pressure to the virial, which involves all the particles in the system. In general, the momentum flux across a surface has two contributions. The first contribution, NkT/V, where V is the volume (area) of the system, is due to the motion of the particles and is derived in many texts using simple kinetic theory arguments. The other contribution to the momentum flux arises from the momentum transferred across the surface due to the forces between particles on different sides of the surface. It can be shown that the instantaneous pressure at time t, including both contributions to the momentum flux, is given by

\[ P(t)V = NkT(t) + \frac{1}{d}\sum_{i<j} \mathbf{r}_{ij}\cdot\mathbf{F}_{ij} \tag{8.8} \]

where F_ij is the force on particle i due to particle j.

Consider the behavior of the radial distribution function g(r) for a dense liquid (ρ > 0.6, T ≈ 1.0) with N ≥ 64. How many peaks in g(r) can you observe? In what ways do they change as the density is increased? How does the behavior of g(r) for a dense liquid compare to that of a dilute gas and a solid?

8.9 Hard Disks

How can we understand the temperature and density dependence of the equation of state and the structure of a dense liquid? One way to gain more insight into this dependence is to modify the interaction and see how the properties of the system change. In particular, we would like to understand the relative role of the repulsive and attractive parts of the interaction. For this reason, we consider an idealized system of hard disks for which the interaction u(r) is purely repulsive:

\[ u(r) = \begin{cases} +\infty, & r < \sigma \\ 0, & r \ge \sigma. \end{cases} \tag{8.19} \]

The length σ is the diameter of the hard disks (see Figure 8.7). In three dimensions the interaction (8.19) describes the interaction of hard spheres (billiard balls); in one dimension (8.19) describes the interaction of hard rods. Because the interaction u(r) between hard disks is a discontinuous function of r, the dynamics of hard disks is qualitatively different than it is for a continuous interaction such as the Lennard–Jones potential.
For hard disks the particles move in straight lines at constant speed between collisions and change their velocities instantaneously when a collision occurs. Hence the problem becomes finding the next collision and computing the change in the velocities of the colliding pair. The dynamics is event driven and can be computed exactly in principle; in practice, it is limited only by roundoff errors.

The dynamics of a system of hard disks can be treated as a sequence of two-body elastic collisions. The idea is to consider all pairs of particles i and j and to find the collision time t_ij for their next collision, ignoring the presence of all other particles. In many cases the particles will be moving away from each other and the collision time is infinite. From the collection of collision times for all pairs of particles, we find the minimum collision time. We then move all the particles forward in time until the collision occurs and calculate the postcollision velocities of the colliding pair. The main problem is dealing with the large number of possible collision events.

We first determine the particle velocities of the colliding pair. Consider a collision between particles i and j. Let $\mathbf{v}_i$ and $\mathbf{v}_j$ be their velocities before the collision and $\mathbf{v}_i'$ and $\mathbf{v}_j'$ be their velocities after the collision. Because the particles have equal mass, it follows from conservation of energy and linear momentum that

\[ v_i'^2 + v_j'^2 = v_i^2 + v_j^2 \tag{8.20} \]
\[ \mathbf{v}_i' + \mathbf{v}_j' = \mathbf{v}_i + \mathbf{v}_j. \tag{8.21} \]

From (8.21) we have

\[ \Delta\mathbf{v}_i = \mathbf{v}_i' - \mathbf{v}_i = -(\mathbf{v}_j' - \mathbf{v}_j) = -\Delta\mathbf{v}_j. \tag{8.22} \]

When two hard disks collide, the force is exerted along the line connecting their centers, $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$. Hence, the components of the velocities parallel to $\mathbf{r}_{ij}$ are exchanged, and the perpendicular components of the velocities are unchanged. It is convenient to write the velocity of particles i and j as a vector sum of their components parallel and perpendicular to the unit vector $\hat{\mathbf{r}}_{ij} = \mathbf{r}_{ij}/|\mathbf{r}_{ij}|$. We write the velocity of particle i as

\[ \mathbf{v}_i = \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\perp} \tag{8.23} \]

where $\mathbf{v}_{i,\parallel} = (\mathbf{v}_i\cdot\hat{\mathbf{r}}_{ij})\hat{\mathbf{r}}_{ij}$, and

\[ \mathbf{v}_{i,\parallel}' = \mathbf{v}_{j,\parallel} \qquad \mathbf{v}_{j,\parallel}' = \mathbf{v}_{i,\parallel} \tag{8.24a} \]
\[ \mathbf{v}_{i,\perp}' = \mathbf{v}_{i,\perp} \qquad \mathbf{v}_{j,\perp}' = \mathbf{v}_{j,\perp}. \tag{8.24b} \]

Hence, we can write $\mathbf{v}_i'$ as

\[ \mathbf{v}_i' = \mathbf{v}_{i,\parallel}' + \mathbf{v}_{i,\perp}' = \mathbf{v}_{j,\parallel} + \mathbf{v}_{i,\perp} = \mathbf{v}_{j,\parallel} - \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\parallel} + \mathbf{v}_{i,\perp} = \big[(\mathbf{v}_j - \mathbf{v}_i)\cdot\hat{\mathbf{r}}_{ij}\big]\hat{\mathbf{r}}_{ij} + \mathbf{v}_i. \tag{8.25} \]

The change in the velocity of particle i at a collision is given by

\[ \Delta\mathbf{v}_i = \mathbf{v}_i' - \mathbf{v}_i = -\big[(\mathbf{v}_i - \mathbf{v}_j)\cdot\hat{\mathbf{r}}_{ij}\big]\hat{\mathbf{r}}_{ij} \tag{8.26} \]

or

\[ \Delta\mathbf{v}_i = -\Delta\mathbf{v}_j = -\frac{b_{ij}}{\sigma^2}\,\mathbf{r}_{ij}\Big|_{\text{contact}} \tag{8.27} \]

where $b_{ij} = \mathbf{v}_{ij}\cdot\mathbf{r}_{ij}$, $\mathbf{v}_{ij} = \mathbf{v}_i - \mathbf{v}_j$, and we have used the fact that $|\mathbf{r}_{ij}| = \sigma$ at contact.

Exercise 8.14. Velocity distribution of hard rods

Use (8.20) and (8.21) to show that $\mathbf{v}_i' = \mathbf{v}_j$ and $\mathbf{v}_j' = \mathbf{v}_i$ in one dimension; that is, two colliding hard rods of equal mass exchange velocities. If you start a system of hard rods with velocities chosen from a uniform random distribution, will the velocity distribution approach the equilibrium Maxwell–Boltzmann distribution?
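The collision rule (8.27) is simple to implement. The following is a minimal sketch of the velocity update for a colliding pair; the array names and the explicit arguments are illustrative, units with σ = 1 are assumed, and periodic image corrections to r_ij are omitted.

public void contact(int i, int j) {
  // post-collision velocities for two equal-mass hard disks, Eq. (8.27);
  // assumes x, y, vx, vy arrays and sigma = 1, and that the pair is at contact
  double rx = x[i]-x[j], ry = y[i]-y[j];       // r_ij at contact
  double dvx = vx[i]-vx[j], dvy = vy[i]-vy[j]; // v_ij
  double factor = (dvx*rx+dvy*ry);             // b_ij = v_ij · r_ij, divided by sigma^2 = 1
  vx[i] -= factor*rx; vy[i] -= factor*ry;      // Δv_i = -(b_ij/σ²) r_ij
  vx[j] += factor*rx; vy[j] += factor*ry;      // Δv_j = -Δv_i
}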
We now consider the criteria for a collision to occur. Consider disks i and j at positions $\mathbf{r}_i$ and $\mathbf{r}_j$ at t = 0. If they collide at a time $t_{ij}$ later, their centers will be separated by a distance σ:

\[ |\mathbf{r}_i(t_{ij}) - \mathbf{r}_j(t_{ij})| = \sigma. \tag{8.28} \]

During the time $t_{ij}$, the disks move with constant velocities. Hence, we have

\[ \mathbf{r}_i(t_{ij}) = \mathbf{r}_i(0) + \mathbf{v}_i(0)\,t_{ij} \tag{8.29} \]

and

\[ \mathbf{r}_j(t_{ij}) = \mathbf{r}_j(0) + \mathbf{v}_j(0)\,t_{ij}. \tag{8.30} \]

If we substitute (8.29) and (8.30) into (8.28), we find

\[ [\mathbf{r}_{ij} + \mathbf{v}_{ij}t_{ij}]^2 = \sigma^2 \tag{8.31} \]

where $\mathbf{r}_{ij} = \mathbf{r}_i(0) - \mathbf{r}_j(0)$, $\mathbf{v}_{ij} = \mathbf{v}_i(0) - \mathbf{v}_j(0)$, and

\[ t_{ij} = \frac{-\mathbf{v}_{ij}\cdot\mathbf{r}_{ij} \pm \big[(\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2)\big]^{1/2}}{v_{ij}^2}. \tag{8.32} \]

Because $t_{ij} > 0$ for a collision to occur, we see from (8.32) that the condition

\[ \mathbf{v}_{ij}\cdot\mathbf{r}_{ij} < 0 \tag{8.33} \]

must be satisfied. That is, if $\mathbf{v}_{ij}\cdot\mathbf{r}_{ij} > 0$, the particles are moving away from each other and there is no possibility of a collision. If the condition (8.33) is satisfied, then the discriminant in (8.32) must satisfy the condition

\[ (\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2) \ge 0. \tag{8.34} \]

If the condition (8.34) is satisfied, then the quadratic in (8.32) has two roots. The smaller root corresponds to the physically significant collision because the disks are impenetrable. Hence, the physically significant solution for the time of a collision $t_{ij}$ for particles i and j is given by

\[ t_{ij} = \frac{-b_{ij} - \big[b_{ij}^2 - v_{ij}^2\,(r_{ij}^2 - \sigma^2)\big]^{1/2}}{v_{ij}^2}. \tag{8.35} \]

Exercise 8.15. Calculation of collision times

Write a short program that determines the collision times (if any) of the following pairs of particles. It would be a good idea to draw the trajectories to confirm your results. Consider the cases: r1 = (2,1), v1 = (−1,−2), r2 = (1,3), v2 = (1,1); r1 = (4,3), v1 = (2,−3), r2 = (3,1), v2 = (−1,−1); and r1 = (4,2), v1 = (−2, 1/2), r2 = (3,1), v2 = (−1,1). As usual, choose units so that σ = 1.

Our hard disk program implements the following steps. We first find the collision times and the collision partners for all pairs of particles i and j. We then do the following:

1. locate the minimum collision time t_min;

2. advance all particles using a straight line trajectory until the collision occurs; that is, displace particle i by v_i t_min and update its next collision time;

3. compute the postcollision velocities of the colliding pair nextCollider and nextPartner;

4. calculate the physical quantities of interest and accumulate data;

5. update the collision partners of the colliding pair, nextCollider and nextPartner, and all other particles that were to collide with either nextCollider or nextPartner if nextCollider and nextPartner had not collided first;

6. repeat steps 1–5 indefinitely.

Methods for carrying out these steps are listed in the following:

Listing 8.16: Methods for each step of the hard disk system.

public void step() {
  minimumCollisionTime();     // finds minimum collision time from list of collision times
  move();                     // moves particles for time equal to minimum collision time
  t += timeToCollision;
  contact();                  // changes velocities of the two colliding particles
  setDefaultCollisionTimes(); // sets collision times to bigTime for those particles set
                              // to collide with the two colliding particles
  newCollisionTimes();        // finds new collision times between all particles and the
                              // two colliding particles
  numberOfCollisions++;
}

public void minimumCollisionTime() {
  timeToCollision = bigTime; // set very large so that the minimum can be found
  for(int k = 0; k<N; k++) {
    if(collisionTime[k]<timeToCollision) {
      timeToCollision = collisionTime[k];
      nextCollider = k;
    }
  }
  nextPartner = partner[nextCollider];
}

public void checkCollision(int i, int j) {
  // computes the pair collision time t_ij from Eq. (8.35) with sigma = 1 and
  // records it if it is the earliest collision found so far for particle i;
  // the complete version also checks the periodic images of particle j
  double dx = x[i]-x[j];
  double dy = y[i]-y[j];
  double dvx = vx[i]-vx[j];
  double dvy = vy[i]-vy[j];
  double bij = dx*dvx+dy*dvy; // v_ij · r_ij
  if(bij<0) {                 // particles are approaching, Eq. (8.33)
    double r2 = dx*dx+dy*dy;
    double v2 = dvx*dvx+dvy*dvy;
    double discriminant = bij*bij-v2*(r2-1.0); // Eq. (8.34)
    if(discriminant>0) {
      double tij = (-bij-Math.sqrt(discriminant))/v2; // Eq. (8.35)
      if(tij<collisionTime[i]) {
        collisionTime[i] = tij;
        partner[i] = j;
      }
    }
  }
}

We now return to the simulation of systems with continuous potentials. Because the Lennard–Jones force is negligible at large separations, it is customary to cut off the potential at a distance r = r_c and to take u(r) = 0 for r > r_c. However, this use of a cutoff implies that u(r) has a discontinuity at r = r_c, which means that whenever a particle pair "crosses" the cutoff distance, the energy jumps, thus affecting the apparent energy conservation. To avoid this problem, it is a good idea to modify the potential so as to eliminate the discontinuity in both u(r) and the force −du/dr.
Hence, we write

\[ \tilde{u}(r) = u(r) - u(r_c) - \frac{du(r)}{dr}\Big|_{r=r_c}(r - r_c) \tag{8.42} \]

where u(r) is the usual Lennard–Jones potential. The use of the interparticle potential (8.42) to calculate the force and the energy requires considering only those pairs of particles whose separation is less than r_c. Because testing whether each pair satisfies this criterion is an order N² calculation, we have to limit the number of pairs tested. One way is to divide the box into small cells and to compute the distance only between particles that are in the same cell or in nearby cells. Another method is to maintain a list for each particle of its neighbors whose separation is less than a distance r_n, where r_n is chosen to be slightly greater than r_c. The idea is to use the same list of neighbors for several time steps (usually 10–20) so that the time consuming job of updating the list of neighbors does not have to be done too often. The cell method and the neighbor list method do not become efficient until N is approximately a few hundred particles. Usually, the neighbor list leads to the consideration of fewer particle pairs in the force calculation than the cell list. We provide a method to compute the neighbor list below. A more efficient approach is to use cells to construct the neighbor list.

public void computeNeighborList() {
  for(int i = 0; i<N-1; i++) {
    numberInList[i] = 0;
    for(int j = i+1; j<N; j++) {
      double dx = separation(x[i]-x[j], Lx);
      double dy = separation(y[i]-y[j], Ly);
      double r2 = dx*dx+dy*dy;
      if(r2<r2ListCutoff) {
        list[i][numberInList[i]] = j;
        numberInList[i]++;
      }
    }
  }
}

To use this list in method computeAcceleration, we replace the for loops by

for(int i = 0; i<N-1; i++) {
  for(int k = 0; k<numberInList[i]; k++) {
    int j = list[i][k];
    // compute the force between particles i and j as before
  }
}

The method computeNeighborList should be called again before any particle could have moved a distance equal to the difference r_n − r_c since the last update. This time depends on the density and the temperature. For dense systems a reasonable value for r_n is 2.7σ. Simulations of small systems can be used to determine the time between calls of computeNeighborList.

Note that in method computeNeighborList, only particles j > i are included in list[i]. In Section 15.10 we will consider Monte Carlo simulations where a particle is chosen at random, and its potential energy of interaction must be computed. In this case we cannot take advantage of Newton's third law, and a neighbor list must be created for all particles that are within a distance r_n of particle i.
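Problem 8.21(b) below uses the modified potential (8.42). A minimal sketch of the corresponding shifted force in reduced units (σ = ε = 1); the method names are illustrative.

double forceLJ(double r) {
  // -du/dr for the unmodified Lennard-Jones potential: 24(2/r^13 - 1/r^7)
  double ri = 1.0/r;
  double ri3 = ri*ri*ri;
  double ri6 = ri3*ri3;
  return 24.0*ri*ri6*(2.0*ri6-1.0);
}

double shiftedForce(double r, double rc) {
  // -d(u~)/dr from Eq. (8.42); the constant shift makes the force
  // continuous (zero) at the cutoff r = rc
  if(r>=rc) {
    return 0;
  }
  return forceLJ(r)-forceLJ(rc);
}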
∗Problem 8.21. Neighbor lists

(a) Simulate a system of N = 64 Lennard–Jones particles in a square cell with L = 10 at a temperature T = 2.0. After the system has reached equilibrium, determine the shortest time for any particle to move a distance equal to 0.2. Use half this time in the rest of the program as the time between updates of the neighbor list.

(b) Run your simulation with and without the neighbor list starting from identical initial configurations. Choose r_c = 2.3 and use the modified potential given in (8.42). Calculate g(r), the pressure, the heat capacity (see Problem 8.8), and the temperature. Make sure your results are identical. Compare the amount of CPU time with and without the use of the neighbor list.

(c) Repeat part (b) with N = 256 but with the same density and total energy. You can adjust the total energy by scaling the initial velocities. Increase N until the CPU time for the neighbor list version is faster.

(d) Continue increasing the number of particles by a factor of four, but only use the program with the neighbor list. Determine the CPU time required for one time step as a function of N.

So far we have discussed molecular dynamics simulations at fixed energy, volume, and number of particles. Molecular dynamics simulations at fixed temperature are discussed in Project 8.24. It is also possible to modify the dynamics so as to do molecular dynamics simulations at constant pressure and to do simulations in which the shape of the cell is determined by the dynamics, rather than imposed by the program. Such a simulation is essential for the study of solid-to-solid transitions where the major change is the shape of the crystal. In addition to these algorithmic advances, there is much more to learn about the properties of the system. For example, how are transport properties such as the viscosity and the thermal conductivity related to the trajectories? We have also not discussed one of the most fundamental properties of a many-body system, namely, its entropy. In brief, not all macroscopic properties of a many-body system, including the entropy, can be defined as a time average over some function of the phase space coordinates of the particles (but see Ma). However, changes in the entropy can be computed by using thermodynamic integration.

The fundamental limitation of molecular dynamics is the existence of multiple time scales. We must choose the time step Δt to be smaller than any physical time scale in the system. For a solid, the smallest time scale is the period of the oscillatory motion of individual particles about their equilibrium positions. If we want to know how the solid responds to the addition of an interstitial particle or a vacancy, we would have to run for millions of small time steps for the vacancy to move several interparticle distances. Although this particular problem can be overcome by using a faster computer, there are many problems for which no imaginable supercomputer would be sufficient. One of the biggest current challenges is the protein folding problem. The biological function of a protein is determined by its three-dimensional structure, which is encoded by the sequence of amino acids in the protein. At present, we know little about how the protein forms its three-dimensional structure. Such formidable computational challenges remind us that we cannot simply put a problem on a computer and let the computer tell us the answer. In particular, for many problems, molecular dynamics methods need to be complemented by other simulation methods, especially Monte Carlo methods (see Chapter 15).

The emphasis in current applications of molecular dynamics is shifting from studies of simple equilibrium fluids to studies of more complex fluids and nonequilibrium systems. For example, how does a solid form when the temperature of a liquid is lowered quickly? How does a crack propagate in a brittle solid? What is the nature of the glass transition? Molecular dynamics and related methods will play an important role in aiding our understanding of these and many other problems.

8.12 Projects

Many of the pioneering applications of molecular dynamics were done on relatively small systems. It is interesting to peruse the research literature of the past three decades to see how much physical insight was obtained from these simulations.
Many research-level problems can be generated by first reproducing previously published work and then extending the work to larger systems or longer run times to obtain better statistics. Many related projects are discussed in Chapter 15.

Project 8.22. The classical Heisenberg model of magnetism

Magnetism is intrinsically a quantum phenomenon. One common model of magnetism is the Heisenberg model, which is defined by the Hamiltonian or energy function:

\[ H = -J\sum_{\langle ij\rangle} \mathbf{S}_i\cdot\mathbf{S}_j \tag{8.43} \]

where $\mathbf{S}_i$ is the spin operator at the ith lattice site. The sum is over nearest neighbor sites of the lattice, and the (positive) coupling constant J is a measure of the strength of the interaction between spins. The negative sign indicates that the lowest energy state is ferromagnetic. The magnetic moment of a particle on a site is proportional to the particle's spin, and the proportionality constant is absorbed into the constant J. For many models of magnetism, such as the Ising model (see Section 15.5), there is no obvious dynamics. However, for the Heisenberg model we can motivate a dynamics using the standard rule for the time evolution of an operator given in quantum mechanics texts. For simplicity, we will consider a one-dimensional lattice. The equation for the time development becomes (see Slanič et al.)

\[ \frac{d\mathbf{S}_i}{dt} = J\,\mathbf{S}_i\times(\mathbf{S}_{i-1}+\mathbf{S}_{i+1}). \tag{8.44} \]

In general, S in (8.44) is an operator. However, if the magnitude of the spin is sufficiently large, the system can be treated classically, and S can be interpreted as a three-dimensional unit vector. The dynamics in (8.44) conserves the total energy given in (8.43) and the total magnetization, $\mathbf{M} = \sum_i \mathbf{S}_i$. We can simulate the classical Heisenberg magnet using an ODE solver to solve the first-order differential equation (8.44).

(a) Explain why there is no obvious way to determine the mean temperature of this system.

(b) Write a program to simulate the Heisenberg model on a one-dimensional lattice using periodic boundary conditions (a sketch of the rate computation for (8.44) is given after this project). Choose J = 1 and N ≥ 100. Use the RK4 ODE solver, and plot the energy and magnetization as a function of time. These two quantities should be constant within the accuracy of the ODE solver. Also, plot each component of the spin versus position or draw a three-dimensional representation of the spin at each site so that you can visualize the state of the system.

(c) Begin with all spins in the positive z direction, except for one spin pointing in the negative z direction. Use N ≥ 1000. Define the energy of spin i as $\epsilon_i = -\mathbf{S}_i\cdot(\mathbf{S}_{i-1}+\mathbf{S}_{i+1})/2$. Plot the local energy as a function of i. Describe how the local energy diffuses. What patterns do you observe? Do the locations of the peaks in the local energy move with a constant speed?

(d) One of the interesting dynamical phenomena we can explore is that of spin waves. Begin with all $S_{z,i} = 1$ except for a group of 20 spins, where $S_{x,i} = A\cos ki$, $S_{y,i} = A\sin ki$, and $S_{z,i} = [1-(S_{x,i}^2+S_{y,i}^2)]^{1/2}$. Choose A = 0.2 and k = 1. Describe the motion of the spins. Compute the mean position of this spin wave defined by $\bar{x} = \sum_i i\,(1-S_{z,i})$. Show that $\bar{x}$ changes linearly with time, indicating a constant spin wave velocity. Vary k and A to determine what effect their values have on the speed of the spin wave.

(e) Read about symplectic algorithms in the article by Tsai, Lee, and Landau and write your own ODE solver for one of them. Compare your results to the results you found for the RK4 algorithm. Is the total energy better conserved for the same value of the time step?
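A minimal sketch of the rate computation for (8.44), written for the Open Source Physics ODE interface. The state layout (three spin components per site plus the time in the last element), J = 1, and periodic boundary conditions are assumptions of this sketch.

public void getRate(double[] state, double[] rate) {
  // state = {Sx_0, Sy_0, Sz_0, Sx_1, ..., t}; implements Eq. (8.44) with J = 1
  int N = (state.length-1)/3;
  for(int i = 0; i<N; i++) {
    int left = 3*((i-1+N)%N), right = 3*((i+1)%N); // periodic neighbors
    double hx = state[left]+state[right];   // effective field S_{i-1} + S_{i+1}
    double hy = state[left+1]+state[right+1];
    double hz = state[left+2]+state[right+2];
    double sx = state[3*i], sy = state[3*i+1], sz = state[3*i+2];
    rate[3*i] = sy*hz-sz*hy;   // (S_i x h)_x
    rate[3*i+1] = sz*hx-sx*hz; // (S_i x h)_y
    rate[3*i+2] = sx*hy-sy*hx; // (S_i x h)_z
  }
  rate[state.length-1] = 1; // dt/dt
}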
Project 8.23. Single particle metrics and ergodic behavior

As mentioned in Section 8.7, the quasi-ergodic hypothesis assumes that time averages and ensemble averages are identical for a system in thermodynamic equilibrium. The assumption is that if we run a molecular dynamics simulation for a sufficiently long time, then the dynamical trajectory will fill the accessible phase space. One way to confirm the quasi-ergodic hypothesis is to compute an ensemble average by simulating many independent copies of the system of interest using different initial configurations. Another way is to simulate a very large system and compare the behavior of different parts. A more direct measure of ergodicity (see Thirumalai and Mountain) is based on a comparison of the time average $\overline{f_i}(t)$ of a quantity $f_i$ for particle i to its average for all the other particles. If the system is ergodic, then all particles see the same average environment, and the time average $\overline{f_i}(t)$ for each particle will be the same if t is sufficiently long. Note that $\overline{f_i}(t)$ is the average of the quantity $f_i$ over the time interval t and not the value of $f_i$ at time t. The time average of $f_i$ is defined as

\[ \overline{f_i}(t) = \frac{1}{t}\int_0^t f_i(t')\,dt' \tag{8.45} \]

and the average of $\overline{f_i}(t)$ over all particles is written as

\[ \overline{f}(t) = \frac{1}{N}\sum_{i=1}^{N} \overline{f_i}(t). \tag{8.46} \]

One of the physical quantities of interest is the energy of a particle $e_i$ defined as

\[ e_i = \frac{p_i^2}{2m_i} + \frac{1}{2}\sum_{j\ne i} u(r_{ij}). \tag{8.47} \]

The factor of 1/2 is included in the potential energy term in (8.47) because the interaction energy is shared between pairs of particles. The above considerations lead us to define the energy metric, $\Omega_e(t)$, as

\[ \Omega_e(t) = \frac{1}{N}\sum_{i=1}^{N} \big[\overline{e_i}(t) - \overline{e}(t)\big]^2. \tag{8.48} \]

(a) Compute $\Omega_e(t)$ for a system of Lennard–Jones particles at a relatively high temperature (a sketch of the metric computation is given after this project). Determine $\overline{e_i}(t)$ at time intervals of 0.5 or less and average $\Omega_e$ over as many time origins as possible. If the system is ergodic over the time interval t, then it can be shown that $\Omega_e(t)$ decreases as 1/t. Plot $1/\Omega_e(t)$ versus t. Do you find that $1/\Omega_e(t)$ eventually behaves linearly with t? Nonergodic behavior might be found by rapidly reducing the kinetic energy (a temperature quench) and obtaining an amorphous solid or glass rather than a crystalline solid. However, it would be necessary to consider three-dimensional rather than two-dimensional systems because the latter system forms a crystalline solid very quickly.

(b) Another quantity of interest is the velocity metric $\Omega_v$:

\[ \Omega_v(t) = \frac{1}{dN}\sum_{i=1}^{N} \big[\overline{\mathbf{v}_i}(t) - \overline{\mathbf{v}}(t)\big]^2. \tag{8.49} \]

The factor of 1/d in (8.49) is included because the velocity is a vector with d components. If we choose the total momentum of the system to be zero, then $\overline{\mathbf{v}}(t) = 0$, and we can write (8.49) as

\[ \Omega_v(t) = \frac{1}{dN}\sum_{i=1}^{N} \overline{\mathbf{v}_i}(t)\cdot\overline{\mathbf{v}_i}(t). \tag{8.50} \]

As we will see, the time dependence of $\Omega_v(t)$ is not a good indicator of ergodicity, but it can be used to determine the diffusion coefficient D. We write

\[ \overline{\mathbf{v}_i}(t) = \frac{1}{t}\int_0^t \mathbf{v}_i(t')\,dt' = \frac{1}{t}\big[\mathbf{r}_i(t) - \mathbf{r}_i(0)\big]. \tag{8.51} \]

If we substitute (8.51) into (8.50), we can express the velocity metric in terms of the mean square displacement:

\[ \Omega_v(t) = \frac{1}{dNt^2}\sum_{i=1}^{N} \big[\mathbf{r}_i(t) - \mathbf{r}_i(0)\big]^2 = \frac{\langle R^2(t)\rangle}{d\,t^2}. \tag{8.52} \]

The average in (8.52) is over all particles. If the particles are diffusing during the time interval t, then $\langle R^2(t)\rangle = 2dDt$ and

\[ \Omega_v(t) = 2D/t. \tag{8.53} \]

From (8.53) we see that $\Omega_v(t)$ goes to zero as 1/t as claimed in part (a). However, if the particles are localized (as in a crystalline solid and a glass), then $\langle R^2\rangle$ is bounded for all t, and $\Omega_v(t)\sim 1/t^2$. Because a crystalline solid is ergodic and a glass is not, the velocity metric is not a good measure of the lack of ergodicity. Use the t-dependence of $\Omega_v(t)$ in (8.53) to determine D for the same configurations as in Problem 8.19.
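A minimal sketch of the energy metric (8.48); eBar[i] is assumed to hold the running time average of the single-particle energy (8.47), and the names are illustrative.

public double energyMetric(double[] eBar) {
  // Omega_e(t) from Eq. (8.48); eBar[i] is the time average of e_i up to time t
  int N = eBar.length;
  double mean = 0;
  for(int i = 0; i<N; i++) {
    mean += eBar[i];
  }
  mean /= N; // average of the single-particle averages, Eq. (8.46)
  double omega = 0;
  for(int i = 0; i<N; i++) {
    double diff = eBar[i]-mean;
    omega += diff*diff;
  }
  return omega/N;
}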
Project 8.24. Constant temperature molecular dynamics

In the molecular dynamics simulations we have discussed so far, the energy is constant up to truncation, discretization, and floating point errors, and the temperature fluctuates. However, sometimes it is more convenient to do simulations at constant temperature. In Chapter 15 we will see how to simulate systems at constant T, V, and N (the canonical ensemble) by using Monte Carlo methods. However, we can also do constant temperature simulations by modifying the dynamics.

A crude way of maintaining a constant temperature is to rescale the velocities after every time step to keep the mean kinetic energy per particle constant. This approach is equivalent to a constant temperature simulation when N → ∞. However, the fluctuations of the kinetic energy can be non-negligible in small systems. For such systems keeping the total kinetic energy constant in this way is not equivalent to a constant temperature simulation.

A better way of maintaining a constant temperature is based on imagining that every particle in the system is connected to a much larger system called a heat bath. The heat bath is sufficiently large so that it has a constant temperature even if it loses or gains energy. The particles in the system of interest occasionally collide with particles in this heat bath. The effect of these collisions is to give the particles random velocities with the desired probability distribution (see Problem 8.6). We first list the algorithm and give its rationale later. Add the following statements to method step after all the particles have been moved.

Listing 8.22: Andersen thermostat.

for(int i = 0; i<N; i++) {
  if(Math.random()<collisionProbability) {
    double r1 = Math.random();
    double r2 = Math.random()*2.0*Math.PI;
    state[4*i+1] = Math.sqrt(-2.0*temperature*Math.log(r1))*Math.cos(r2); // vx
    state[4*i+3] = Math.sqrt(-2.0*temperature*Math.log(r1))*Math.sin(r2); // vy
  }
}

The parameter collisionProbability is much less than unity and determines how often there is a collision with the heat bath. This way of maintaining constant temperature is known as the Andersen thermostat.

(a) Do a constant energy simulation as before, using an initial configuration for which the desired temperature is equal to 1.0. Make sure the total momentum is zero. Choose N = 64 and place the particles initially on a triangular lattice with Lx = 10 and Ly = √3 Lx/2. Plot the instantaneous temperature defined as in (8.5) and compute the average temperature. Estimate the magnitude of the temperature fluctuations. Repeat your simulation for some other initial configurations.

(b) Modify your program to use the Andersen thermostat at a constant temperature set equal to 1.0. Set collisionProbability = 0.0001. Repeat the calculations of part (a) and compare them. Discuss the differences. Do the results change significantly?

(c) Modify your program to do a simple constant kinetic energy ensemble where the velocities are rescaled after every time step so that the total kinetic energy does not change. What is the final temperature now? How do your results compare with parts (a) and (b)?
Are the differences in the computed thermodynamic averages statistically significant?

(d) Compute the velocity probability distribution for each case. How do they compare? Consider collisionProbability = 0.001 and 0.00001.

(e) A deterministic algorithm for constant temperature molecular dynamics is the Nosé–Hoover thermostat. The idea is to introduce an additional degree of freedom s that plays the role of the heat bath. The derivation of the appropriate equations of motion is an excellent example of the Lagrangian formulation of mechanics. The equations of motion of Nosé–Hoover dynamics are

\[ \frac{d\mathbf{p}_i}{dt} = \mathbf{F}_i(t) - s\,\mathbf{p}_i \tag{8.54} \]
\[ \frac{ds}{dt} = \frac{1}{M}\Big[\sum_i \frac{p_i^2}{m_i} - dNkT\Big] \tag{8.55} \]

where T is the desired temperature, and M is a parameter that can be interpreted as the mass associated with the extra degree of freedom. Equation (8.54) is similar to Newton's equations of motion with an additional friction term. However, the coefficient s can be positive or negative. Equation (8.55) defines the way s is changed to control the temperature. Apply the Nosé–Hoover algorithm to simulate a simple harmonic oscillator at constant temperature (a sketch of the corresponding rate equations is given after this project). Plot the phase space trajectory. If the energy were constant, the trajectory would be an ellipse. How does the shape of the trajectory depend on M? Choose M so that the period of any oscillations due to the finite value of M is much longer than the period of the system.
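For part (e), a minimal sketch of the Nosé–Hoover equations (8.54) and (8.55) for a harmonic oscillator with m = k = 1, written for the Open Source Physics ODE interface; kT and M are assumed fields, and the state layout {x, p, s, t} is an illustrative choice.

public void getRate(double[] state, double[] rate) {
  double x = state[0], p = state[1], s = state[2];
  rate[0] = p;          // dx/dt = p/m with m = 1
  rate[1] = -x-s*p;     // dp/dt = F - s p, with F = -kx and k = 1, Eq. (8.54)
  rate[2] = (p*p-kT)/M; // ds/dt = (p²/m - dNkT)/M with dN = 1, Eq. (8.55)
  rate[3] = 1;          // time
}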
Project 8.25. Simulations on the surface of a sphere

Because of the long-range nature of the Coulomb potential, we have to sum over all the periodic images of the particles to compute the force on a given particle. Although there are special methods to do these sums so that they converge quickly (Ewald sums), the simulation of systems of charged particles is more difficult than systems with short-range potentials. An alternative approach that avoids periodic boundary conditions is to not have any boundaries at all. For example, if we wish to simulate a two-dimensional system, we can consider the motion of the particles on the surface of a sphere. If the radius of the sphere is sufficiently large, the curvature of the surface can be neglected. Of course, there is a price: the coordinate system is no longer Cartesian. Although this approach can also be applied to systems with short-range interactions, it is more interesting to apply it to charged particles.

The simplest system of interest is a model of charged particles moving in a uniform background of opposite charge to ensure overall charge neutrality, the one-component plasma (OCP). In two dimensions this system is a simplified model of electrons on the surface of liquid helium. The properties of the OCP are determined by the dimensionless parameter Γ given by the ratio of the potential energy between nearest neighbor particles to the mean kinetic energy of a particle, $\Gamma = (e^2/a)/kT$, where $\rho\pi a^2 = 1$ and ρ is the number density. Systems with Γ ≫ 1 are called strongly coupled. For Γ ∼ 100 in two dimensions, the system forms a solid. Strongly coupled one-component plasmas in three dimensions are models of dense astrophysical matter.

Assume that the origin of the coordinate system is at the center of the sphere and that $\mathbf{u}_i$ is a unit vector from the origin to the position of particle i on the sphere. Then $R\theta_{ij}$ is the length of the chord joining particles i and j, where $\cos\theta_{ij} = \mathbf{u}_i\cdot\mathbf{u}_j$. Newton's equation of motion for the ith electron has the form

\[ m\ddot{\mathbf{u}}_i = -\frac{e^2}{R^2}\sum_{j\ne i}\frac{1}{\theta_{ij}^2\sin\theta_{ij}}\big[\mathbf{u}_j - (\cos\theta_{ij})\mathbf{u}_i\big]. \tag{8.56} \]

Note that the unit vector $\mathbf{w}_{ij} = [\mathbf{u}_j - (\cos\theta_{ij})\mathbf{u}_i]/\sin\theta_{ij}$ is orthogonal to $\mathbf{u}_i$. In addition, we must take into account that the particles must stay on the surface of the sphere, so there is an additional force on particle i toward the center of magnitude $m|\dot{\mathbf{u}}_i|^2/R$.

(a) What are the appropriate units for length, time, and the self-diffusion constant?

(b) Write a program to compute the velocity correlation function given by

\[ C(t) = \frac{1}{v_0^2}\,\big\langle \dot{\mathbf{u}}(t)\cdot\dot{\mathbf{u}}(0)\big\rangle \tag{8.57} \]

where $v_0^2 = \langle\dot{\mathbf{u}}(0)\cdot\dot{\mathbf{u}}(0)\rangle$. To compute the self-diffusion constant D, we let $\cos\theta(t) = \mathbf{u}(t)\cdot\mathbf{u}(0)$, so that Rθ is the circular arc from the initial position of a particle to its position on the sphere at time t. We then define

\[ D(t) = \frac{1}{a^2}\,\frac{\langle\theta^2(t)\rangle}{4t} \tag{8.58} \]

where D and t are dimensionless variables. The self-diffusion constant D corresponds to the limit t → ∞. Choose N = 104 and a radius R corresponding to Γ ≈ 36 as in the original simulations by Hansen et al., and then consider bigger systems. Can you conclude that self-diffusion exists for the two-dimensional OCP?

(c)∗ Use a similar procedure to compute the velocity autocorrelation function and the self-diffusion constant D for a two-dimensional system of Lennard–Jones particles. Can you conclude that self-diffusion exists for this two-dimensional system?

Project 8.26. Granular matter

Recently, physicists have become very interested in granular matter such as sand. The key difference between molecular systems and granular systems is that the interparticle interactions in the latter are inelastic. The lost energy goes into the internal degrees of freedom of a grain and ultimately is dissipated. From the point of view of the motion of the granular particles, the energy is lost. Experimentalists have studied models of granular material composed of small steel balls or glass beads using sophisticated imaging techniques that can track the motion of individual particles. There have also been many complementary computer simulation studies.

What are some of the interesting properties of granular matter? Because the interactions are inelastic, granular particles will ultimately come to rest unless there is an external source of energy, usually a vibrating wall or gravity (for example, the fall of particles through a funnel). When granular particles come to rest, they can form a granular solid that is different than molecular solids. One difference is that there frequently exists a complex network of force lines within the solid. In addition, unlike ordinary liquids, the pressure does not increase with depth because the walls of the container help support the grains. As a consequence, sand flowing out of an aperture flows at a constant rate independent of the height of the sand above the aperture. For this reason sand is used in hourglasses. Another interesting property is that under some conditions, the large grains in a mixture of large and small grains can move to the top while the container is being vibrated, the "Brazil nut" effect. Under other conditions, the large grains might move to the bottom. What happens depends on the size and density of the large grains compared to the small grains (see Sanders et al.). It is also known that there is a critical angle for the slope of a sand pile, above which the sand pile is unstable. This slope is called the angle of repose. These and many other effects have been studied using theoretical, computational, and experimental techniques.
The first step in simulating granular matter is to determine the effective force law between particles. For granular gases the details of the force do not influence the qualitative results, as long as the force is purely repulsive and short range and there is some mechanism for dissipating energy. Common examples of force laws are spring-like forces with stiff spring constants and hard disks with inelastic collisions. For simplicity, we will consider the Lennard–Jones potential with a cutoff at $r_c = 2^{1/6}$ so that the force is always repulsive. To remove energy during a collision, we will introduce a viscous damping force given by

\[ \mathbf{f}_{ij} = -\gamma\,(\mathbf{v}_{ij}\cdot\mathbf{r}_{ij})\,\frac{\mathbf{r}_{ij}}{r_{ij}^2} \tag{8.59} \]

where the viscous damping coefficient γ equals 100 in reduced units. A more realistic force model necessary for granular flow problems is given in Hirchfeld et al.

(a) Modify class LJParticles so that the cutoff is at $2^{1/6}$. Is the total energy conserved? Include a viscous damping force as in (8.59) (a sketch is given after this project), and plot the kinetic energy per particle versus time. We will define the kinetic temperature to be the mean kinetic energy per particle. Why does this definition of temperature not have the same significance as the temperature in molecular systems in which the energy is conserved? Choose N = 64, L = 20, and Δt = 0.001. Begin with a random configuration and an initial kinetic temperature equal to 10. How long does it take for the kinetic temperature to decrease to 10% of its initial value? Describe the spatial distribution of the particles at this time.

(b) Compute the mean kinetic temperature versus time averaged over three runs. What functional form describes your results for the mean kinetic temperature at long times?

(c) To prevent "granular collapse," where the particles ultimately come to rest, we need to add energy to the system. The simplest way of doing so is to give random kicks to randomly selected particles. You can use the same algorithm we used to set the initial velocities in LJParticles:

int i = (int) (N*Math.random()); // selects a random particle
// generate a Gaussian distribution using the Box-Muller method
double r = Math.random();
double a = -Math.log(r);
double theta = 2.0*Math.PI*Math.random();
// assign velocities according to the Maxwell-Boltzmann distribution
state[4*i+1] = Math.sqrt(2.0*desiredKE*a)*Math.cos(theta); // vx
state[4*i+3] = Math.sqrt(2.0*desiredKE*a)*Math.sin(theta); // vy

(The Box–Muller method is described in Section 11.5.) Assume that at each time step one particle is chosen at random and receives a random kick. Adjust desiredKE so that the mean kinetic energy per particle remains roughly constant at about 5.0. Compute the velocity distribution function for each component of the velocity. Compare this distribution on the same plot to the Gaussian distribution:

\[ p(v_x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(v_x-\langle v_x\rangle)^2/2\sigma^2} \tag{8.60} \]

where $\sigma^2 = \langle v_x^2\rangle - \langle v_x\rangle^2$. Is the velocity distribution function of the system a Gaussian? If not, give a physical explanation for the difference.
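For part (a), a minimal sketch of the damping force (8.59) to be added to the pair force computation; dx, dy and dvx, dvy are assumed to hold the components of r_ij and v_ij, gamma is the damping coefficient, and periodic boundary corrections are omitted.

// viscous damping force of Eq. (8.59) between particles i and j;
// add these components to the repulsive Lennard-Jones force on particle i
double r2 = dx*dx+dy*dy;
double vDotR = dvx*dx+dvy*dy;     // v_ij · r_ij
double fOverR2 = -gamma*vDotR/r2; // -γ (v_ij · r_ij)/r_ij²
double fxDamp = fOverR2*dx;       // x component of f_ij
double fyDamp = fOverR2*dy;       // y component of f_ij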
Appendix 8A: Reading and Saving Configurations

For most of the problems in this chapter, qualitative results can be obtained fairly quickly. However, in research applications, the time for running a simulation is likely to be much longer than a few minutes, and runs that require days or even months are not uncommon. In such cases it is important to be able to save the intermediate configurations to prevent the potential loss of data in the case of a computer crash or power failure. Also, in many cases it is easier to save the configurations periodically and then use a separate program to analyze the configurations and compute the quantities of interest. In addition, if we wish to compute averages as a function of a parameter, such as the temperature, it is convenient to make small changes in the temperature and use the last configuration from the previous run as the initial configuration for the simulation at the new temperature.

The standard Java API has methods for reading and writing files. The usual way of saving a configuration is to use these methods to simply write all the positions and velocities as numbers into a file. Additional simulation parameters and information about the configuration would be saved using a custom format. Although this approach is the traditional one for data storage, the use of a custom format means that you might not remember the format later, and sharing data between programs and other users becomes more difficult.

An alternative is to use a more structured and widely shared format for storing data. The Open Source Physics library has support for the Extensible Markup Language (XML). The XML format offers a number of advantages for computational physics: clear markup of input data and results, standardized data formats, and easier exchange and archival stability of data. In simple terms, the main advantage of XML is that it is a human readable format; just by looking at an XML file you can get an idea of the nature of the data.

The XML classes in the Open Source Physics library can be understood by reading the ExampleXMLApp example. The XML API is very similar to the control API. For example, we use setValue to add data to an XML control, and we use getInt, getDouble, and getString to read data. We start by importing the necessary definitions from the controls package and defining the main method for the ExampleXMLApp class. Note that XMLControl defines an interface and XMLControlElement defines an implementation of this interface.

import org.opensourcephysics.controls.XMLControl;
import org.opensourcephysics.controls.XMLControlElement;

public class ExampleXMLApp {
  public static void main(String[] args) {
    ...
  }
}

The following Java statements are placed in the body of the main method. An empty XML document is created using an XMLControl object by calling the XMLControlElement constructor without any parameters.

XMLControl xmlOut = new XMLControlElement();

Invoking the control's setValue method creates an XML element consisting of a tag and a data value. The tag is the first parameter, and the data to be stored is the second. Data that can be stored includes numbers, number arrays, and strings. Because the tag is unique, the data can later be retrieved from the control using the appropriate get method.

xmlOut.setValue("comment", "An XML description of an array.");
xmlOut.setValue("x positions", new double[] {1, 3, 4});
xmlOut.setValue("x velocities", new double[] {0, -1, 1});

Once the data has been stored in an XMLControl object, it can be exported to a file by calling the write method. In this example, the name of the file is MDconfiguration.xml.

xmlOut.write("MDconfiguration.xml");

An XMLControl can also be used to read XML data from a file.
In the next example, we will read from the file that we just saved. We start by instantiating a new XMLControl named xmlIn.

XMLControl xmlIn = new XMLControlElement("MDconfiguration.xml");

The new XMLControl object xmlIn contains the same data as the object we saved, xmlOut. Its data can be accessed using a tag name. Note that the getObject method returns a generic Object and must be cast to the appropriate data type.

System.out.println(xmlIn.getString("comment"));
double[] xPos = (double[]) xmlIn.getObject("x positions");
double[] xVel = (double[]) xmlIn.getObject("x velocities");
for(int i = 0; i<xPos.length; i++) {
  System.out.println("x[i] = "+xPos[i]+" vx[i] = "+xVel[i]);
}

Exercise 8.27. Saving XML data

(a) Combine the above statements to create a working ExampleXMLApp class. Examine the saved data using a text editor. Describe how the parameters are stored.

(b) Run HardDisksApp and save the control's configuration using the Save As item under the File menu in the toolbar. Examine the saved file using a text editor and describe how this file is different from the file you generated in part (a).

(c) What is the minimum amount of information that must be stored in a configuration file to specify the current HardDisks state?

(d) Add custom buttons to HardDisksApp to store and load the current HardDisks state. Test your code by showing that quantities, such as the temperature, remain the same if a configuration is stored and reloaded.

Open Source Physics user interfaces, such as a SimulationControl, store a program's configuration in two steps. During the first step, parameters from the graphical user interface are stored. During the second step, the model is given the opportunity to store runtime data using an ObjectLoader. Study the LJParticlesLoader class and note how storing and loading are done in the saveObject and loadObject methods, respectively. You will adapt this ObjectLoader to store HardDisks data in Problem 8.28. Additional information about how Open Source Physics applications store XML-based configurations is provided in the Open Source Physics Users Guide.

Problem 8.28. Hard disk configuration

(a) Create a HardDisksLoader class that stores the HardDisks runtime data.

(b) Add the getLoader method to HardDisksApp and test the loader.

public static XML.ObjectLoader getLoader() {
  return new HardDisksLoader();
}

The getLoader method allows the SimulationControl to obtain the HardDisksLoader, which will be used to store the runtime data. Data written by the loader's saveObject method will be included in the output file when the user saves a program configuration. Describe how the initialization parameters and the runtime data are separated in the XML file.

Because XML allows for the creation of custom tags, various companies and professional organizations have defined other XML grammars, such as MathML, and there are other examples of the use of XML in computational physics.

References and Suggestions for Further Reading

One of the best ways of testing your programs is by comparing your results with known results. A website maintained by the National Institute of Standards and Technology (U.S.) provides some useful benchmark results for the Lennard–Jones fluid.

Farid F. Abraham, "Computational statistical mechanics: Methodology, applications and supercomputing," Adv. Phys. 35, 1–111 (1986).
The author discusses both molecular dynamics and Monte Carlo techniques.

B. J. Alder and T. E. Wainwright, "Phase transition for a hard sphere system," J. Chem. Phys. 27, 1208–1209 (1957).

M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon Press, 1987). A classic text on molecular dynamics and Monte Carlo methods.

Jean–Louis Barrat and Jean–Pierre Hansen, Basic Concepts for Simple and Complex Liquids (Cambridge University Press, 2003). Also see Jean–Pierre Hansen and Ian R. McDonald, Theory of Simple Liquids, 2nd ed. (Academic Press, 1986). Excellent graduate level texts that derive most of the theoretical results mentioned in this chapter.

Kurt Binder, Jürgen Horbach, Walter Kob, Wolfgang Paul, and Fathollah Varnik, "Molecular dynamics simulations," J. Phys.: Condens. Matter 16, S429–S453 (2004).

R. P. Bonomo and F. Riggi, "The evolution of the speed distribution for a two-dimensional ideal gas: A computer simulation," Am. J. Phys. 52, 54–55 (1984). The authors consider a system of hard disks and show that the system evolves to the Maxwell–Boltzmann distribution.

J. P. Boon and S. Yip, Molecular Hydrodynamics (Dover Publications, 1991). Their discussion of transport properties is an excellent supplement to our brief discussion.

Giovanni Ciccotti and William G. Hoover, editors, Molecular-Dynamics Simulation of Statistical-Mechanics Systems (North–Holland, 1986).

Giovanni Ciccotti, Daan Frenkel, and Ian R. McDonald, editors, Simulation of Liquids and Solids (North–Holland, 1987). A collection of reprints on the simulation of many body systems. Of particular interest are B. J. Alder and T. E. Wainwright, "Phase transition in elastic disks," Phys. Rev. 127, 359–361 (1962) and earlier papers by the same authors; A. Rahman, "Correlations in the motion of atoms in liquid argon," Phys. Rev. 136, A405–A411 (1964), the first application of molecular dynamics to systems with continuous potentials; and Loup Verlet, "Computer 'experiments' on classical fluids. I. Thermodynamical properties of Lennard–Jones molecules," Phys. Rev. 159, 98–103 (1967).

Daan Frenkel and Berend Smit, Understanding Molecular Simulation: From Algorithms to Applications, 2nd ed. (Academic Press, 2002). This monograph is one of the best on molecular dynamics and Monte Carlo simulations. It is particularly strong on simulations in various ensembles and on methods for computing free energies.

J. M. Haile, Molecular Dynamics Simulation (John Wiley & Sons, 1992). A derivation of the mean pressure using periodic boundary conditions is given in Appendix B.

J. P. Hansen, D. Levesque, and J. J. Weis, "Self-diffusion in the two-dimensional, classical electron gas," Phys. Rev. Lett. 43, 979–982 (1979).

D. Hirchfeld, Y. Radzyner, and D. C. Rapaport, "Molecular dynamics studies of granular flow through an aperture," Phys. Rev. E 56, 4404–4415 (1997).

W. G. Hoover, Molecular Dynamics (Springer–Verlag, 1986) and W. G. Hoover, Computational Statistical Mechanics (Elsevier, 1991).

K. Kadau, T. C. Germann, and P. S. Lomdahl, "Large-scale molecular-dynamics simulation of 19 billion particles," Int. J. Mod. Phys. C 15, 193–201 (2004).

J. Krim, "Friction at macroscopic and microscopic length scales," Am. J. Phys. 70, 890–897 (2002).

J. Kushick and B. J. Berne, "Molecular dynamics methods: Continuous potentials" in Statistical Mechanics Part B: Time-Dependent Processes, Bruce J. Berne, editor (Plenum Press, 1977). Also see the article by Jerome J.
Erpenbeck and William Wood on "Molecular dynamics techniques for hard-core systems" in the same volume.

Shang–keng Ma, "Calculation of entropy from data of motion," J. Stat. Phys. 26, 221 (1981). Also see Chapter 25 of Ma's graduate level text, Statistical Mechanics (World Scientific, 1985). Ma discusses a novel approach for computing the entropy directly from the trajectories. Note that the coincidence rate in Ma's approach is related to the recurrence time for a finite system to return to an arbitrarily small neighborhood of almost any given initial state. The approach is intriguing, but is practical only for small systems.

A. McDonough, S. P. Russo, and I. K. Snook, "Long-time behavior of the velocity autocorrelation function for moderately dense, soft-repulsive, and Lennard–Jones fluids," Phys. Rev. E 63, 026109-1–9 (2001).

S. Ranganathan, G. S. Dubey, and K. N. Pathak, "Molecular-dynamics study of two-dimensional Lennard–Jones fluids," Phys. Rev. A 45, 5793–5797 (1992).

Dennis Rapaport, The Art of Molecular Dynamics Simulation, 2nd ed. (Cambridge University Press, 2004). The most complete text on molecular dynamics, written by one of its leading practitioners.

John R. Ray and H. W. Graben, "Direct calculation of fluctuation formulae in the microcanonical ensemble," Mol. Phys. 43, 1293 (1981).

F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw–Hill, 1965). An intermediate level text on statistical physics with a more thorough discussion of kinetic theory than found in most undergraduate texts. Statistical Physics, Vol. 5 of the Berkeley Physics Course (McGraw–Hill, 1965), by Reif was one of the first texts to use computer simulations to illustrate the approach of macroscopic systems to equilibrium.

Marco Ronchetti and Gianni Jacucci, editors, Simulation Approach to Solids (Kluwer Academic Publishers, 1990). Another excellent collection of classic reprints.

James Ringlein and Mark O. Robbins, "Understanding and illustrating the atomic origins of friction," Am. J. Phys. 72 (7), 884–891 (2004). A very readable paper on the microscopic origins of sliding friction.

Duncan A. Sanders, Michael R. Swift, R. M. Bowley, and P. J. King, "Are Brazil nuts attractive?," Phys. Rev. Lett. 93, 208002 (2004). An example of a simulation of granular matter.

Tamar Schlick, Molecular Modeling and Simulation (Springer–Verlag, 2002). Although the book is at the graduate level, it is an accessible introduction to computational molecular biology.

Leonardo E. Silbert, Deniz Ertas, Gary S. Grest, Thomas C. Halsey, Dov Levine, and Steven J. Plimpton, "Granular flow down an inclined plane: Bagnold scaling and rheology," Phys. Rev. E 64, 051302-1–14 (2001). This paper discusses the contact force model, which captures the major features of granular interactions.

R. M. Sperandeo Mineo and R. Madonia, "The equation of state of a hard-particle system: A model experiment on a microcomputer," Eur. J. Phys. 7, 124–129 (1986).

D. Thirumalai and Raymond D. Mountain, "Ergodic convergence properties of supercooled liquids and glasses," Phys. Rev. A 42, 4574–4587 (1990).

Shan-Ho Tsai, H. K. Lee, and D. P. Landau, "Molecular and spin dynamics simulations using modern integration methods," Am. J. Phys. 73, 615–624 (2005).

James H. Williams and Glenn Joyce, "Equilibrium properties of a one-dimensional kinetic system," J. Chem. Phys. 59, 741–750 (1973). Simulations in one dimension are even easier than in two.
Zoran Slanič, Harvey Gould, and Jan Tobochnik, "Dynamics of the classical Heisenberg chain," Computers in Physics 5 (6), 630–635 (1991).

Chapter 9

Normal Modes and Waves

We discuss the physics of wave phenomena and the motivation and use of Fourier transforms.

9.1 Coupled Oscillators and Normal Modes

Terms such as period, amplitude, and frequency are used to describe both waves and oscillatory motion. To understand the relation between waves and oscillatory motion, consider a flexible rope that is under tension with one end fixed. If we flip the free end, a pulse propagates along the rope with a speed that depends on the tension and on the inertial properties of the rope. At the macroscopic level, we observe a transverse wave that moves along the length of the rope. In contrast, at the microscopic level we see discrete particles undergoing oscillatory motion in a direction perpendicular to the motion of the wave. One goal of this chapter is to use simulations to understand the relation between the microscopic dynamics of a simple mechanical model and the macroscopic wave motion that the model can support.

For simplicity, we first consider a one-dimensional chain of N particles each of mass m. The particles are coupled by massless springs with force constant k. The equilibrium separation between the particles is a. We denote the displacement of particle j from its equilibrium position at time t by u_j(t) (see Figure 9.1). For many purposes the most realistic boundary conditions are to attach particles j = 1 and j = N to springs which are attached to fixed walls. We denote the walls by j = 0 and j = N + 1 and require that u_0(t) = u_{N+1}(t) = 0.

The force on an individual particle is determined by the compression or extension of its adjacent springs. The equation of motion of particle j is given by

$$m\frac{d^2u_j(t)}{dt^2} = -k\,[u_j(t)-u_{j+1}(t)] - k\,[u_j(t)-u_{j-1}(t)] = -k\,[2u_j(t)-u_{j+1}(t)-u_{j-1}(t)]. \tag{9.1}$$

Equation (9.1) couples the motion of particle j to its two nearest neighbors and describes longitudinal oscillations; that is, motion along the length of the system. It is straightforward to show that identical equations hold for the transverse oscillations of N identical mass points equally spaced on a stretched massless string (cf. French).

Because the equations of motion (9.1) are linear, that is, only terms proportional to the displacements appear, it is straightforward to obtain analytic solutions of (9.1).

Figure 9.1: A one-dimensional chain of N particles of mass m coupled by massless springs with force constant k. The first and last particles (0 and N + 1) are attached to fixed walls. The top chain shows the oscillators in equilibrium. The bottom chain shows the oscillators displaced from equilibrium.

We first discuss these solutions because they will help us interpret the nature of the numerical solutions. To find the normal modes, we look for oscillatory solutions for which the displacement of each particle is proportional to sin ωt or cos ωt. We write

$$u_j(t) = u_j\cos\omega t \tag{9.2}$$

where u_j is the amplitude of the displacement of the jth particle. If we substitute the form (9.2) into (9.1), we obtain

$$-\omega^2 u_j = -\frac{k}{m}\,[2u_j - u_{j+1} - u_{j-1}]. \tag{9.3}$$

We next assume that the amplitude u_j depends sinusoidally on the distance ja:

$$u_j = C\sin qja \tag{9.4}$$

where the constants q and C will be determined. If we substitute (9.4) into (9.3), we find the following condition for ω:

$$-\omega^2\sin qja = -\frac{k}{m}\,[2\sin qja - \sin q(j-1)a - \sin q(j+1)a]. \tag{9.5}$$
We write sin q(j ± 1)a = sin qja cos qa ± cos qja sin qa and find that (9.4) is a solution if

$$\omega^2 = \frac{2k}{m}\,(1-\cos qa). \tag{9.6}$$

We need to find the values of the wavenumber q that satisfy the boundary conditions u_0 = 0 and u_{N+1} = 0. The former condition is automatically satisfied by assuming a sine instead of a cosine solution in (9.4). The latter boundary condition implies that

$$q = q_n = \frac{\pi n}{a(N+1)} \qquad \text{(fixed boundary conditions)} \tag{9.7}$$

where n = 1, ..., N. The corresponding possible values of the wavelength λ are related to q by q = 2π/λ, and the corresponding values of the angular frequencies are given by

$$\omega_n^2 = \frac{2k}{m}\,[1-\cos q_na] = \frac{4k}{m}\,\sin^2\frac{q_na}{2}, \tag{9.8}$$

or

$$\omega_n = 2\sqrt{\frac{k}{m}}\,\sin\frac{q_na}{2}. \tag{9.9}$$

The relation (9.9) between ω_n and q_n is known as a dispersion relation. A particular value of the integer n corresponds to the nth normal mode. We write the (time-independent) normal mode solutions as

$$u_{j,n} = C\sin q_nja. \tag{9.10}$$

The linear nature of the equation of motion (9.1) implies that the time dependence of the displacement of the jth particle can be written as a superposition of normal modes:

$$u_j(t) = C\sum_{n=1}^{N}\,\bigl(A_n\cos\omega_nt + B_n\sin\omega_nt\bigr)\sin q_nja. \tag{9.11}$$

The coefficients A_n and B_n are determined by the initial conditions:

$$u_j(t=0) = C\sum_{n=1}^{N} A_n\sin q_nja \tag{9.12a}$$

$$v_j(t=0) = C\sum_{n=1}^{N} \omega_nB_n\sin q_nja. \tag{9.12b}$$

To solve (9.12) for A_n and B_n, we note that the normal mode solutions u_{j,n} are orthogonal; that is, they satisfy the condition

$$\sum_{j=1}^{N} u_{j,n}\,u_{j,m} \propto \delta_{n,m}. \tag{9.13}$$

The Kronecker δ symbol δ_{n,m} = 1 if n = m and is zero otherwise. It is convenient to normalize the u_{j,n} so that they are orthonormal; that is,

$$\sum_{j=1}^{N} u_{j,n}\,u_{j,m} = \delta_{n,m}. \tag{9.14}$$

It is easy to show that the choice C = 1/\sqrt{(N+1)/2} in (9.4) and (9.10) ensures that (9.14) is satisfied.

We now use the orthonormality condition (9.14) to determine the A_n and B_n coefficients. If we multiply both sides of (9.12) by C sin q_mja, sum over j, and use the orthogonality condition (9.14), we obtain

$$A_n = C\sum_{j=1}^{N} u_j(0)\sin q_nja \tag{9.15a}$$

$$B_n = C\sum_{j=1}^{N}\,\bigl(v_j(0)/\omega_n\bigr)\sin q_nja. \tag{9.15b}$$

For example, if the initial displacement of every particle is zero, and the initial velocity of every particle is zero except for v_1(0) = 1, we find A_n = 0 for all n, and

$$B_n = \frac{C}{\omega_n}\,\sin q_na. \tag{9.16}$$

The corresponding solution for u_j(t) is

$$u_j(t) = \frac{2}{N+1}\sum_{n=1}^{N}\frac{1}{\omega_n}\,\sin\omega_nt\,\sin q_na\,\sin q_nja. \tag{9.17}$$

What is the solution if the particles start in a normal mode; that is, u_j(t = 0) ∝ sin q_2ja?
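The sums in (9.15) are easy to evaluate numerically. The following is a minimal, self-contained sketch (ours, not part of the book's classes; the method and array names are our own) that computes the coefficients A_n and B_n from given initial displacements and velocities, assuming k/m = 1:

// Sketch (ours): normal mode coefficients from (9.15) for a chain with
// fixed walls. u0[j] and v0[j] hold the initial displacement and velocity
// of particle j for j = 1,...,N (index 0 is unused); a is the spacing.
public static double[][] modeCoefficients(double[] u0, double[] v0, double a) {
   int N = u0.length-1;
   double C = Math.sqrt(2.0/(N+1)); // normalization from (9.14)
   double[] A = new double[N+1], B = new double[N+1];
   for(int n = 1; n<=N; n++) {
      double qn = Math.PI*n/(a*(N+1));   // wavenumber (9.7)
      double omega = 2*Math.sin(qn*a/2); // dispersion relation (9.9), k/m = 1
      for(int j = 1; j<=N; j++) {
         A[n] += C*u0[j]*Math.sin(qn*j*a);            // (9.15a)
         B[n] += C*(v0[j]/omega)*Math.sin(qn*j*a);    // (9.15b)
      }
   }
   return new double[][] {A, B};
}

Substituting the returned coefficients into (9.11) reproduces the displacement of every particle at any later time.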
The Oscillators class in Listing 9.1 displays the analytic solution (9.11) of the oscillator displacements. The draw method uses a single circle that is repeatedly set equal to the appropriate world coordinates. The initial positions are calculated and stored in the x array in the Oscillators constructor. When an oscillator is drawn, the position array is multiplied by the given mode's sinusoidal (phase) factor to produce a time-dependent displacement.

Listing 9.1: The Oscillators class models the time evolution of a normal mode of a chain of coupled oscillators.

package org.opensourcephysics.sip.ch09;
import java.awt.Graphics;
import org.opensourcephysics.display.*;

public class Oscillators implements Drawable {
   OscillatorsMode normalMode;
   Circle circle = new Circle();
   double[] x; // drawing positions
   double[] u; // displacements
   double time = 0;

   public Oscillators(int mode, int N) {
      u = new double[N+2]; // includes the two ends of the chain
      x = new double[N+2]; // includes the two ends of the chain
      normalMode = new OscillatorsMode(mode, N);
      double xi = 0;
      for(int i = 0; i<x.length; i++) {
…

… or a stand-alone program such as Octave can be used to obtain the solutions.

9.2 Numerical Solutions

Because we are also interested in the effects of nonlinear forces between the particles, for which the matrix approach is inapplicable, we study the numerical solution of the equations of motion (9.1) directly.

To use the ODE interface, we need to remember that the ordering of the variables in the coupled oscillator state array is important because the implementations of some ODE solvers, such as Verlet and Euler–Richardson, make explicit assumptions about the ordering. Our standard ordering is to follow a variable by its derivative. For example, the state vector of an N oscillator chain is ordered as {u_0, v_0, u_1, v_1, ..., u_N, v_N, u_{N+1}, v_{N+1}, t}. Note that the state array includes variables for the chain's end points although the velocity rate corresponding to the end points is always zero. We include the time as the last variable because we will sometimes model time-dependent external forces. With this ordering, the getRate method is implemented as follows:

static final double OMEGA_SQUARED = 1; // equals k/m

public void getRate(double[] state, double[] rate) {
   for(int i = 1, N = x.length-1; i<N; i++) {
      rate[2*i] = state[2*i+1]; // du/dt = v
      rate[2*i+1] = OMEGA_SQUARED*(state[2*i+2]-2*state[2*i]+state[2*i-2]);
   }
   rate[0] = rate[1] = 0;                         // fixed end points
   rate[state.length-3] = rate[state.length-2] = 0;
   rate[state.length-1] = 1;                      // time rate
}

… frequencies with k > N/2 are greater than the Nyquist frequency ω_Q. We can interpret the frequencies for k > N/2 as negative frequencies equal to (k − N)ω_0 (see Problem 9.13). The occurrence of negative frequency components is a consequence of the use of the exponential functions rather than sines and cosines. Note that f(t) is real if g(−ω_k) = g(ω_k)* because the sin ω_k t terms in (9.37) cancel due to symmetry.

The calculation of a single Fourier coefficient using (9.30) requires approximately O(N) multiplications. Because the complete Fourier transform contains N complex coefficients, the calculation requires O(N²) multiplications and may require hours to complete if the sample contains just a few megabytes of data. Because many of the calculations are redundant, it is possible to organize the calculation so that the computational time is order N log N. Such an algorithm is called a fast Fourier transform (FFT) and is discussed in Appendix 9B. The improvement in speed is dramatic. A dataset containing 10^6 points requires ≈ 6 × 10^6 multiplications rather than ≈ 10^12.

Because we will use this algorithm to study diffraction and other phenomena, and because coding this algorithm is nontrivial, we have provided an implementation of the FFT in the Open Source Physics numerics package. We can use this FFT class to transform between time and frequency or position and wavenumber. The FFTApp program shows how the FFT class is used.

Listing 9.7: The FFTApp program computes the fast Fourier transform of a function and displays the coefficients.

package org.opensourcephysics.sip.ch09;
import java.text.DecimalFormat;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.numerics.FFT;
public class FFTApp extends AbstractCalculation {
   public void calculate() {
      DecimalFormat decimal = new DecimalFormat("0.0000"); // output format
      int N = 8;                         // number of Fourier coefficients
      double[] z = new double[2*N];      // array that will be transformed
      FFT fft = new FFT(N);              // FFT implementation for N points
      int mode = control.getInt("mode"); // mode or harmonic of e^(i x)
      double x = 0, delta = 2*Math.PI/N;
      for(int i = 0; i<N; i++) {
…

… ω_Qb > ω_Qa and the corresponding sampling times Δ_b < Δ_a. The calculation with Δ = Δ_b is more accurate because the sampling time is smaller. Suppose that this calculation of the spectrum yields the result that P(ω > ω_a) > 0. What happens if we compute the power spectrum using Δ = Δ_a? The power associated with ω > ω_a must be "folded" back into the ω < ω_a frequency components. For example, the frequency component at ω + ω_a is added to the true value at ω − ω_a to produce an incorrect value at ω − ω_a in the computed power spectrum. This phenomenon is called aliasing and leads to spurious results. Aliasing occurs in calculations of P(ω) if the latter does not vanish above the Nyquist frequency. To avoid aliasing, it is necessary to sample more frequently or to remove the high frequency components from the signal before sampling the data.

Although the power spectrum can be computed by a simple modification of AnalyzeApp, it is a good idea to use the FFT for many of the following problems. A sketch of a direct power spectrum computation follows Problem 9.19.

Problem 9.18. Aliasing

Sample the sinusoidal function sin 2πt and display the resulting power spectrum using sampling frequencies above and below the Nyquist frequency. Start with a sampling time of Δ = 0.1 and increase the time until Δ = 10.0.

(a) Is the power spectrum sharp? That is, is all the power located in a single frequency? Does your answer depend on the ratio of the period to the sampling time?

(b) Explain the appearance of the power spectrum for Δ = 1.25, Δ = 1.75, and Δ = 2.5.

(c) What is the power spectrum if you sample at the Nyquist frequency or twice the Nyquist frequency?

Problem 9.19. Examples of power spectra

(a) Create a data set with N points corresponding to f(t) = 0.3 cos(2πt/T) + r, where r is a uniform random number between 0 and 1 and T = 4. Plot f(t) versus t in time intervals of Δ = 4T/N for N = 128. Can you visually detect the periodicity? Compute the power spectrum using the same sampling interval Δ = 4T/N. Does the frequency dependence of the power spectrum indicate that there are any special frequencies? Repeat with T = 16. Are high or low frequency signals easier to pick out from the random background?

(b) Simulate a one-dimensional random walk and compute the time series x²(t), where x(t) is the distance from the origin of the walk after t steps. Average x²(t) over several trials. Compute the power spectrum for a walk of t ≤ 256. In this case Δ = 1, the time between steps. Do you observe any special frequencies?

(c) Let f_n be the nth member of a random number sequence. As in part (b), Δ = 1. Compute the power spectrum of the random number generator. Do you detect any periodicities? If so, is the random number generator acceptable?
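For Problems 9.18 and 9.19 it may help to see the power spectrum written out explicitly. The following self-contained sketch (ours) uses a direct O(N²) discrete Fourier sum rather than the FFT class, which is slower but has no dependencies; the method name and normalization are our own choices:

// Sketch (ours): power spectrum of sin(2 pi t) sampled at interval delta.
public static double[] powerSpectrum(double delta, int N) {
   double[] f = new double[N];
   for(int i = 0; i<N; i++) {
      f[i] = Math.sin(2*Math.PI*i*delta); // sample sin(2 pi t) at t = i*delta
   }
   double[] P = new double[N];
   for(int k = 0; k<N; k++) {             // frequency omega_k = 2 pi k/(N*delta)
      double re = 0, im = 0;
      for(int i = 0; i<N; i++) {
         double arg = 2*Math.PI*k*i/N;
         re += f[i]*Math.cos(arg);
         im -= f[i]*Math.sin(arg);
      }
      P[k] = (re*re+im*im)/N;             // components with k > N/2 are the
   }                                      // negative frequency components
   return P;
}

If Δ is larger than half the signal's period, the true frequency lies above ω_Q, and its peak appears folded back at a lower frequency, which is the aliasing explored in Problem 9.18.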
Problem 9.20. Power spectrum of coupled oscillators

(a) Modify your program developed in Problem 9.2 so that the power spectrum of the position of one of the N particles is computed at the end of the simulation. Set Δ = 0.1 so that the Nyquist frequency is ω_Q = π/Δ ≈ 31.4. Choose the time of the simulation equal to T = 25.6 and let k/m = 1. Plot the power spectrum P(ω) at frequency intervals equal to Δω = ω_0 = 2π/T. First choose N = 2 and choose the initial conditions so that the system is in a normal mode. What do you expect the power spectrum to look like? What do you find? Then choose N = 10 and choose initial conditions corresponding to various normal modes. Is the power spectrum the same for all particles?

(b) Repeat part (a) for N = 2 and N = 10 with random initial particle displacements between −0.5 and +0.5 and zero initial velocities. Can you detect all the normal modes in the power spectrum? Repeat for a different set of random initial displacements.

(c) Repeat part (a) for initial displacements corresponding to the equal sum of two normal modes. Does the power spectrum show two peaks? Are these peaks of equal height?

(d) Recompute the power spectrum for N = 10 with T = 6.4. Is this time long enough? How can you tell?

Problem 9.21. Quasiperiodic power spectra

(a) Write a program to compute the power spectrum of the circle map (6.62). Begin by exploring the power spectrum for K = 0. Plot ln P(ω) versus ω, where P(ω) is proportional to the modulus squared of the Fourier transform of x_n. Begin with 256 iterations. How do the power spectra differ for rational and irrational values of the parameter Ω? How are the locations of the peaks in the power spectra related to the value of Ω?

(b) Set K = 1/2 and compute the power spectra for 0 < Ω < 1. Do the power spectra differ from the spectra found in part (a)?

(c) Set K = 1 and compute the power spectra for 0 < Ω < 1. How do the power spectra compare to those found in parts (a) and (b)?

In Problem 9.20 we found that the peaks in the power spectrum yield information about the normal mode frequencies. In Problems 9.22 and 9.23 we compute the power spectra for a system of coupled oscillators with disorder. Disorder can be generated by having random masses or random spring constants (or both). We will see that one effect of disorder is that the normal modes are no longer simple sinusoidal functions. Instead, some of the modes are localized, meaning that only some of the particles move significantly while the others remain essentially at rest. This effect is known as Anderson localization. Typically, we find that modes above a certain frequency are localized, and those below this threshold frequency are extended. The threshold frequency is well defined for large systems. All states are localized in the limit of an infinite chain with any amount of disorder. The dependence of localization on disorder in systems of coupled oscillators in higher dimensions is more complicated.

Problem 9.22. Localization with a single defect

(a) Modify your program developed in Problem 9.2 so that the mass of one oscillator is equal to one fourth that of the others. Set N = 20 and use fixed boundary conditions. Compute the power spectrum over a time T = 51.2 using random initial displacements between −0.5 and +0.5 and zero initial velocities. Sample the data at intervals of Δ = 0.1. The normal mode frequencies correspond to the well-defined peaks in P(ω). Consider at least three different sets of random initial displacements to ensure that you find all the normal mode frequencies.

(b) Apply an external force F_e = 0.3 sin ωt to each particle; a sketch of the modified rate equation follows Problem 9.23. (The steady-state behavior occurs sooner if we apply an external force to each particle instead of just one particle.)
Because the external force pumps energy into the system, it is necessary to add a damping force to prevent the oscillator displacements from becoming too large. Add a damping force equal to −γv_i to all the oscillators with γ = 0.1. Choose random initial displacements and zero initial velocities and use the frequencies found in part (a) as the driving frequencies ω. Describe the motion of the particles. Is the system driven to a normal mode? Take a "snapshot" of the particle displacements after the system has run for a sufficiently long time, so that the patterns repeat themselves. Are the particle displacements simple sinusoidal functions? Sketch the approximate normal mode patterns for each normal mode frequency. Which of the modes appear localized and which modes appear to be extended? What is the approximate cutoff frequency that separates the localized from the extended modes?

Problem 9.23. Localization in a disordered chain of oscillators

(a) Modify your program so that the spring constants can be varied by the user. Set N = 10 and use fixed boundary conditions. Consider the following set of 11 spring constants: 0.704, 0.388, 0.707, 0.525, 0.754, 0.721, 0.006, 0.479, 0.470, 0.574, 0.904. To help you determine all the normal modes, we provide two of the normal mode frequencies: ω ≈ 0.28 and 1.15. Find the power spectrum using the procedure outlined in Problem 9.22a.

(b) Apply an external force F_e = 0.3 sin ωt to each particle and find the normal modes as outlined in Problem 9.22b.

(c) Repeat parts (a) and (b) for another set of random spring constants for N = 40. Discuss the nature of the localized modes in terms of the specific values of the spring constants. For example, is the edge of a localized mode at a spring that has a relatively large or small spring constant?

(d) Repeat parts (a) and (b) for uniform spring constants but random masses between 0.5 and 1.5. Is there a qualitative difference between the two types of disorder?
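The modification needed in Problems 9.22b and 9.23b amounts to adding the damping and driving terms to the acceleration in the rate equation. The following sketch is ours and assumes the state ordering {u_0, v_0, u_1, v_1, ..., u_{N+1}, v_{N+1}, t} introduced in Section 9.2, with uniform k/m = 1; for Problems 9.22 and 9.23 you would also divide the spring force by the appropriate mass or use the appropriate spring constants:

// Sketch (ours): getRate with damping -gamma*v and drive f0*sin(omega*t),
// for the state ordering {u0, v0, u1, v1, ..., uN+1, vN+1, t}.
double gamma = 0.1, f0 = 0.3, omega = 1.0; // damping, drive amplitude, frequency

public void getRate(double[] state, double[] rate) {
   double t = state[state.length-1];
   int last = (state.length-1)/2-1;          // index of wall particle N+1
   for(int i = 1; i<last; i++) {             // interior particles 1,...,N
      rate[2*i] = state[2*i+1];              // du_i/dt = v_i
      rate[2*i+1] = state[2*i+2]-2*state[2*i]+state[2*i-2] // spring force, k/m = 1
                    -gamma*state[2*i+1]                    // damping
                    +f0*Math.sin(omega*t);                 // external drive
   }
   rate[0] = rate[1] = rate[2*last] = rate[2*last+1] = 0;  // walls do not move
   rate[state.length-1] = 1;                               // dt/dt = 1
}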
In 1955 Fermi, Pasta, and Ulam used the Maniac I computer at Los Alamos to study a chain of oscillators. Their surprising discovery might have been the first time a qualitatively new result, instead of a more precise number, was found from a simulation. To understand their results, we need to discuss an idea from statistical mechanics that was discussed in Project 8.23. A fundamental assumption of statistical mechanics is that an isolated system of particles is quasi-ergodic; that is, the system will evolve through all configurations consistent with the conservation of energy. A system of linearly coupled oscillators is not quasi-ergodic, because if the system is initially in a normal mode, it stays in that normal mode forever. Before 1955 it was believed that if the interaction between the particles is weakly nonlinear (and the number of particles is sufficiently large), the system would be quasi-ergodic and evolve through the different normal modes of the linear system. In Problem 9.24 we will find, as did Fermi, Pasta, and Ulam, that the behavior of the system is much more complicated. The question of ergodicity in this system is known as the FPU problem.

Problem 9.24. Nonlinear oscillators

(a) Modify your program so that cubic forces between the particles are added to the linear spring forces. That is, let the force on particle i due to particle j be

$$F_{ij} = -(u_i-u_j) - \alpha\,(u_i-u_j)^3 \tag{9.48}$$

where α is the amplitude of the nonlinear term. Choose the masses of the particles to be unity. Consider N = 10 and choose initial displacements corresponding to a normal mode of the linear (α = 0) system. Compute the power spectrum over a time T = 51.2 with Δ = 0.1 for α = 0, 0.1, 0.2, and 0.3. For what value of α does the system become ergodic; that is, for what value of α are the heights of all the normal mode peaks approximately the same?

(b) Repeat part (a) for the case where the displacements of the particles are initially random. Use the same set of random displacements for each value of α.

(c)* We now know that the number of oscillators is not as important as the magnitude of the nonlinear interaction. Repeat parts (a) and (b) for N = 20 and 40 and discuss the effect of increasing the number of particles.

9.7 Wave Motion

Our simulations of coupled oscillators have shown that the microscopic motion of the individual oscillators leads to macroscopic wave phenomena. To understand the transition between microscopic and macroscopic phenomena, we reconsider the oscillations of a linear chain of N particles with equal spring constants k and equal masses m. As we found in Section 9.1, the equations of motion of the particles can be written as [see (9.1)]

$$\frac{d^2u_j(t)}{dt^2} = -\frac{k}{m}\,[2u_j(t)-u_{j+1}(t)-u_{j-1}(t)] \qquad (j = 1,\ldots,N). \tag{9.49}$$

We consider the limits N → ∞ and a → 0 with the length of the chain Na fixed. We will find that the discrete equations of motion (9.49) can be replaced by the continuous wave equation

$$\frac{\partial^2u(x,t)}{\partial t^2} = c^2\,\frac{\partial^2u(x,t)}{\partial x^2} \tag{9.50}$$

where c has the dimension of velocity. We obtain the wave equation (9.50) as follows. First we replace u_j(t), where j is a discrete variable, by the function u(x,t), where x is a continuous variable, and rewrite (9.49) in the form

$$\frac{\partial^2u(x,t)}{\partial t^2} = \frac{ka^2}{m}\,\frac{1}{a^2}\,\bigl[u(x+a,t)-2u(x,t)+u(x-a,t)\bigr]. \tag{9.51}$$

We have written the time derivative as a partial derivative because the function u depends on two variables. If we use the Taylor series expansion

$$u(x\pm a) = u(x) \pm a\,\frac{du}{dx} + \frac{a^2}{2}\,\frac{d^2u}{dx^2} + \ldots \tag{9.52}$$

it is easy to show that as a → 0, the quantity

$$\frac{1}{a^2}\,\bigl[u(x+a,t)-2u(x,t)+u(x-a,t)\bigr] \to \frac{\partial^2u(x,t)}{\partial x^2}. \tag{9.53}$$

(We have written a spatial derivative as a partial derivative for the same reason as before.) The wave equation (9.50) is obtained by substituting (9.53) into (9.51) with c² = ka²/m. If we introduce the linear mass density µ = m/a and the tension T = ka, we can express c in terms of µ and T and obtain the familiar result c² = T/µ.

It is straightforward to show that any function of the form f(x ± ct) is a solution to (9.50). Among these many solutions to the wave equation are the familiar forms:

$$u(x,t) = A\cos\frac{2\pi}{\lambda}(x\pm ct) \tag{9.54a}$$

$$u(x,t) = A\sin\frac{2\pi}{\lambda}(x\pm ct). \tag{9.54b}$$

Because the wave equation is linear and hence satisfies a superposition principle, we can understand the behavior of a wave of arbitrary shape by representing its shape as a sum of sinusoidal waves.

One way to solve the wave equation (9.50) numerically is to retrace our steps back to the discrete equations (9.49) to find a discrete form of the wave equation that is convenient for numerical calculations. The conversion of a continuum equation to a physically motivated discrete form frequently leads to useful numerical algorithms. From (9.53) we see how to approximate the second derivative by a finite difference. If we replace a by Δx and take Δt to be the time step, we can rewrite (9.49) by

$$\frac{1}{(\Delta t)^2}\,\bigl[u(x,t+\Delta t)-2u(x,t)+u(x,t-\Delta t)\bigr] = \frac{c^2}{(\Delta x)^2}\,\bigl[u(x+\Delta x,t)-2u(x,t)+u(x-\Delta x,t)\bigr]. \tag{9.55}$$

The quantity Δx is the spatial interval. The result of solving (9.55) for u(x, t+Δt) is

$$u(x,t+\Delta t) = 2(1-b)\,u(x,t) + b\,\bigl[u(x+\Delta x,t)+u(x-\Delta x,t)\bigr] - u(x,t-\Delta t) \tag{9.56}$$

where b ≡ (cΔt/Δx)². Equation (9.56) expresses the displacements at time t + Δt in terms of the displacements at the current time t and at the previous time t − Δt.
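A direct translation of (9.56) into code uses three arrays for the three time levels. The following minimal sketch is ours (the array names are our own; periodic boundary conditions are imposed here with modular indexing rather than extra array elements):

// Sketch (ours): one time step of the discrete wave equation (9.56)
// with periodic boundary conditions; b = (c*dt/dx)^2.
public static void step(double[] unext, double[] u, double[] uprev, double b) {
   int N = u.length;
   for(int j = 0; j<N; j++) {
      int left = (j-1+N)%N, right = (j+1)%N; // periodic neighbors
      unext[j] = 2*(1-b)*u[j]+b*(u[right]+u[left])-uprev[j];
   }
}

After each step the arrays are cycled: uprev takes the values of u, and u those of unext. For b = 1 the first term vanishes and the update simplifies considerably, a point explored in Problem 9.25.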
Problem 9.25. Solution of the discrete wave equation

(a) Write a program to compute the numerical solutions of the discrete wave equation (9.56). Three spatial arrays corresponding to u(x) at times t + Δt, t, and t − Δt are needed. Denote the displacement u(jΔx) by the array element u[j] where j = 0, ..., N+1. Use periodic boundary conditions so that u_0 = u_N and u_1 = u_{N+1}. Draw lines between the displacements at neighboring values of x. Note that the initial conditions require the specification of u at t = 0 and at t = −Δt. Let the waveform at t = 0 and t = −Δt be u(x, t = 0) = e^{−(x−10)²} and u(x, t = −Δt) = e^{−(x−10+cΔt)²}, respectively. What is the direction of motion implied by these initial conditions?

(b) Our first task is to determine the optimum value of the parameter b. Let Δx = 1 and N ≥ 100 and try the following combinations of c and Δt: c = 1, Δt = 0.1; c = 1, Δt = 0.5; c = 1, Δt = 1; c = 1, Δt = 1.5; c = 2, Δt = 0.5; and c = 2, Δt = 1. Verify that the value b = (cΔt)² = 1 leads to the best results; that is, for this value of b, the initial form of the wave is preserved.

(c) It is possible to show that the discrete form of the wave equation with b = 1 is exact up to numerical roundoff error (cf. DeVries). Hence, we can replace (9.56) by the simpler algorithm

$$u(x,t+\Delta t) = u(x+\Delta x,t) + u(x-\Delta x,t) - u(x,t-\Delta t). \tag{9.57}$$

That is, the solutions of (9.57) are equivalent to the solutions of the original partial differential equation (9.50). Try several different initial waveforms and show that if the displacements have the form f(x ± ct), the waveform maintains its shape with time. For the remaining problems, we will use (9.57) corresponding to b = 1. Unless otherwise specified, choose c = 1, Δx = Δt = 1, and N ≥ 100 in the following problems.

Problem 9.26. Velocity of waves

(a) Use the waveform given in Problem 9.25a and verify that the speed of the wave is unity by determining the distance traveled in a given amount of time. Because we have set Δx = Δt = 1 and b = 1, the speed c = 1. (A way of incorporating different values of c is discussed in Problem 9.27c.)

(b) Replace the waveform considered in part (a) by a sinusoidal wave that fits exactly; that is, choose u(x,t) = sin(qx − ωt) such that sin q(N+1) = 0. Measure the period T of the wave by measuring the time it takes for successive maxima to pass a given point. What is the wavelength λ of the wave? Does it depend on the value of q? The frequency of the wave is given by f = 1/T. Verify that λf = c.

Problem 9.27. Reflection of waves

(a) Consider a wave of the form u(x,t) = e^{−(x−10−ct)²}. Use fixed boundary conditions so that u_0 = u_{N+1} = 0. What happens to the reflected wave?

(b) Modify your program so that free boundary conditions are incorporated: u_0 = u_1 and u_N = u_{N+1}. Compare the phase of the reflected wave to your result from part (a).

(c) What happens to a pulse at the boundary between two media? Set c = 1 and Δt = 1 on the left side of your grid and c = 2 and Δt = 0.5 on the right side. These choices of c and Δt imply that b = 1 on both sides, but that the right side is updated twice as often as the left side. What happens to a pulse that begins on the left side and moves to the right?
Is there both a reflected and transmitted wave at the boundary between the two media? What is their relative phase? Find a relation between the amplitude of the incident pulse and the amplitudes of the reflected and transmitted pulses. Repeat for a pulse starting from the right side.

Problem 9.28. Superposition of waves

(a) Consider the propagation of the wave determined by u(x, t = 0) = sin(4πx/N). What must u(x, −Δt) be so that the wave moves in the positive x direction? Test your answer by doing a simulation. Use periodic boundary conditions. Repeat for a wave moving in the negative x direction.

(b) Simulate two waves moving in opposite directions each with the same spatial dependence given by u(x,0) = sin(4πx/N). Describe the resultant wave pattern. Repeat the simulation for u(x,0) = sin(8πx/N).

(c) Assume that u(x,0) = sin q₁x + sin q₂x with q₁ = 10π/N and q₂ = 12π/N. Describe the qualitative form of u(x,t) for fixed t. What is the distance between modulations of the amplitude? Estimate the wavelength associated with the fine ripples of the amplitude. Estimate the wavelength of the envelope of the wave. Find a simple relation for these two wavelengths in terms of the wavelengths of the two sinusoidal terms. This phenomenon is known as beats.

(d) Consider the motion of the two Gaussian pulses moving in opposite directions, u₁(x,0) = e^{−(x−10)²} and u₂(x,0) = e^{−(x−90)²}. Choose the array at t = −Δt as in Problem 9.25. What happens to the two pulses when they overlap or partially overlap? Do they maintain their shape? While they are going through each other, is the displacement u(x,t) given by the sum of the displacements of the individual pulses?

Problem 9.29. Standing waves

(a) In Problem 9.28b we considered a standing wave, the continuum analog of a normal mode of a system of coupled oscillators. As is the case for normal modes, each point of the wave has the same time dependence. For fixed boundary conditions, the displacement is given by u(x,t) = sin qx cos ωt, where ω = cq, and the wavenumber q is chosen so that sin qN = 0. Choose an initial condition corresponding to a standing wave for N = 100. Describe the motion of the particles and compare it with your observations of standing waves on a rope.

(b) Establish a standing wave by displacing one end of a system periodically. The other end is fixed. Let u(x,0) = u(x,−Δt) = 0, and u(x = 0, t) = A sin ωt with A = 0.1. How long must the simulation run before you observe standing waves? How large is the standing wave amplitude?

We have seen that the wave equation can support pulses that propagate indefinitely without distortion. In addition, because the wave equation is linear, the sum of any two solutions is also a solution, and the principle of superposition is satisfied. As a consequence, we know that two pulses can pass through each other unchanged. We have also seen that similar phenomena exist in the discrete system of linearly coupled oscillators. What happens if we create a pulse in a system of nonlinear oscillators? As an introduction to nonlinear wave phenomena, we consider a system of N coupled oscillators with the potential energy of interaction given by

$$V = \frac{1}{2}\sum_{j=1}^{N}\,\bigl[e^{-(u_j-u_{j-1})}-1\bigr]^2. \tag{9.58}$$

This form of the interaction is known as the Morse potential. All parameters in the potential (such as the overall strength of the potential) have been set to unity. The force on the jth particle is

$$F_j = -\frac{\partial V}{\partial u_j} = Q_{j+1}(1-Q_{j+1}) - Q_j(1-Q_j) \tag{9.59a}$$

where

$$Q_j = e^{-(u_j-u_{j-1})}. \tag{9.59b}$$
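In code, the force (9.59) can be evaluated in one pass over the particles. The following sketch is ours; for simplicity it uses fixed walls u[0] = u[N+1] = 0, whereas Problem 9.30 asks for periodic boundary conditions, which require only changing the neighbor indexing:

// Sketch (ours): forces from the Morse interaction (9.59) for a chain
// with fixed walls; Q[j] = exp(-(u[j]-u[j-1])).
public static double[] morseForces(double[] u) {
   int N = u.length-2;                     // u[0] and u[N+1] are the walls
   double[] Q = new double[N+2];
   for(int j = 1; j<=N+1; j++) {
      Q[j] = Math.exp(-(u[j]-u[j-1]));
   }
   double[] F = new double[N+2];
   for(int j = 1; j<=N; j++) {
      F[j] = Q[j+1]*(1-Q[j+1])-Q[j]*(1-Q[j]); // F_j = -dV/du_j from (9.59a)
   }
   return F;
}

For small displacements Q_j ≈ 1 − (u_j − u_{j−1}), so F_j reduces to the linear chain force (9.1) with k = 1, which is a useful check of a working program.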
In linear systems it is possible to set up a pulse of any shape and maintain the shape of the pulse indefinitely. In a nonlinear system, there also exist solutions that maintain their shape, but we will find in Problem 9.30 that not all pulse shapes do so. The pulses that maintain their shape are called solitons.

Problem 9.30. Solitons

(a) Modify the program developed in Problem 9.2 so that the force on particle j is given by (9.59). Use periodic boundary conditions. Choose N ≥ 60 and an initial pulse of the form u(x,t) = 0.5 e^{−(x−10)²}. You should find that the initial pulse splits into two pulses plus some noise. Describe the motion of the pulses (solitons). Do they maintain their shape, or is this shape modified as they move? Describe the motion of the particles far from the pulse. Are they stationary?

(b) Save the displacements of the particles when the peak of one of the solitons is located near the center of your display. Is it possible to fit the shape of the soliton to a Gaussian? Continue the simulation, and after one of the solitons is relatively isolated, set u(j) = 0 for all j far from this soliton. Does the soliton maintain its shape?

(c) Repeat part (b) with a pulse given by u(x,0) = 0 everywhere except for u(20,0) = u(21,0) = 1. Do the resulting solitons have the same shape as in part (b)?

(d) Begin with the same Gaussian pulse as in part (a) and run until the two solitons are well separated. Then change at random the values of u(j) for particles in the larger soliton by about 5% and continue the simulation. Is the soliton destroyed? Increase the perturbation until the soliton is no longer discernible.

(e) Begin with a single Gaussian pulse as in part (a). The two resultant solitons will eventually "collide." Do the solitons maintain their shape after the collision? The principle of superposition implies that the displacement of the particles is given by the sum of the displacements due to each pulse. Does the principle of superposition hold for solitons?

(f) Compute the speeds, amplitudes, and widths of the solitons produced from a single Gaussian pulse. Take the amplitude of a soliton to be the largest value of its displacement and the half-width to correspond to the value of x at which the displacement is half its maximum value. Repeat these calculations for solitons of different amplitudes by choosing the initial amplitude of the Gaussian pulse to be 0.1, 0.3, 0.5, 0.7, and 0.9. Plot the soliton speed and width versus the corresponding soliton amplitude.

(g) Change the boundary conditions to free boundary conditions and describe the behavior of the soliton as it reaches a boundary. Compare this behavior with that of a pulse in a system of linear oscillators.

(h) Begin with an initial sinusoidal disturbance that would be a normal mode for a linear system. Does the sinusoidal mode maintain its shape? Compare the behavior of the nonlinear and linear systems.

9.8 Interference

Figure 9.3: The computed energy density in the vicinity of two point sources.

Interference is one of the most fundamental characteristics of all wave phenomena. The term interference is used when there are a small number of sources, and the term diffraction when the number of sources is large and can be treated as a continuum. Because it is relatively easy to observe interference and diffraction phenomena with light, we discuss these phenomena in this context. Consider the field from one or more point sources lying in a plane.
The electric field at position r associated with the light emitted from a monochromatic point source at r₁ is a spherical wave radiating from that point. This wave can be thought of as the real part of a complex exponential:

$$E(\mathbf{r},t) = \frac{A}{|\mathbf{r}-\mathbf{r}_1|}\,e^{i(q|\mathbf{r}-\mathbf{r}_1|-\omega t)} \tag{9.60}$$

where |r − r₁| is the distance between the source and the point of observation and q is the wavenumber 2π/λ. The superposition principle implies that the total electric field at r from N point sources at r_n is

$$E(\mathbf{r},t) = e^{-i\omega t}\sum_{n=1}^{N}\frac{A_n}{|\mathbf{r}-\mathbf{r}_n|}\,e^{iq|\mathbf{r}-\mathbf{r}_n|} = e^{-i\omega t}\,E(\mathbf{r}). \tag{9.61}$$

The time evolution can be expressed as an oscillatory complex exponential e^{−iωt} that multiplies a complex space part E(r). The spatial part E(r) is a phasor, which contains both the maximum value of the electric field and the time within a cycle when the physical field reaches its maximum value. As the system evolves, the complex electric field E(r,t) oscillates between purely real and purely imaginary values. Both the energy density (the energy per unit volume) and the light intensity (the energy passing through a unit area) are proportional to the square of the magnitude of the phasor. Because light fields oscillate at ≈ 6 × 10^14 Hz, typical optics experiments observe the time average (rms value) of E and do not observe the phase angle.

Huygens's principle states that each point on a wavefront (a surface of constant phase) can be treated as the source of a new spherical wave or Huygens's wavelet. The wavefront at some later time is constructed by summing these wavelets. The HuygensApp program implements Huygens's principle by assuming superposition from an arbitrary number of point sources and displaying a two-dimensional animation of (9.61) as shown in Figure 9.3 and described in Appendix 9C. Sources are represented by circles and are added to the frame when a custom button invokes the createSource method.

public void createSource() {
   InteractiveShape ishape = InteractiveShape.createCircle(0, 0, 0.5);
   frame.addDrawable(ishape);
   initPhasors();
   frame.repaint();
}

Users can create as many sources as they wish. The program later retrieves a list of sources from the frame using the latter's getDrawables method. The program uses n × n arrays to store the real and imaginary values. The code fragment from the initPhasors method shown in the following starts the process by obtaining a list of point sources in the frame. We then use an Iterator to access each source as we sum the vector components at each grid point.

ArrayList list = frame.getDrawables(); // gets list of point sources
Iterator it = list.iterator();         // creates an iterator for the list
// these two statements are combined in the final code

List and Iterator are interfaces implemented by the objects returned by frame.getDrawables and list.iterator, respectively. As the name implies, an iterator is a convenient way to access a list without explicitly counting its elements. The iterator's next method retrieves elements from the list, and the hasNext method returns true if the end of the list has not been reached. The initPhasors method in HuygensApp computes the phasor at each grid point by summing the contributions of all the sources. Note how the distance from the source to the observation point is computed by converting the grid's index values to world coordinates.
Iterator it = frame.getDrawables().iterator(); // source iterator
while(it.hasNext()) {
   InteractiveShape source = (InteractiveShape) it.next();
   double xs = source.getX(), ys = source.getY(); // world coordinates for source
   for(int ix = 0; ix<n; ix++) {
      double dx = frame.indexToX(ix)-xs; // grid point x separation
      for(int iy = 0; iy<n; iy++) {
         double dy = frame.indexToY(iy)-ys; // grid point y separation
         double r = Math.sqrt(dx*dx+dy*dy);
         realArray[ix][iy] += (r==0) ? 0 : Math.cos(PI2*r)/r;
         imagArray[ix][iy] += (r==0) ? 0 : Math.sin(PI2*r)/r;
      }
   }
}

To calculate the real and imaginary components of the phasor, the distance from the source to the grid point is determined in terms of the wavelength λ, and the time is determined in terms of the period T. For example, for green light one unit of distance is ≈ 5 × 10⁻⁷ m and one unit of time is ≈ 1.6 × 10⁻¹⁵ s. The simulation is performed by multiplying the phasors by e^{−iωt} in the doStep method. Multiplying each phasor by e^{−iωt} mixes the phasor's real and imaginary components. We then obtain the physical field from (9.61) by taking the real part:

$$E(\mathbf{r},t) = \mathrm{Re}\,\bigl[e^{-i\omega t}E(\mathbf{r})\bigr] = \mathrm{Re}\,[E]\cos\omega t + \mathrm{Im}\,[E]\sin\omega t. \tag{9.62}$$

Listing 9.10 shows the entire HuygensApp class. A custom button is used to create sources at the origin. Because the source is an InteractiveShape, it can be repositioned using the mouse. The program also implements the InteractiveMouseHandler interface to recalculate the phasors when the source is moved. (See Section 5.7 for a discussion of interactive handlers.)

Listing 9.10: The HuygensApp class simulates the energy density from one or more point sources.

package org.opensourcephysics.sip.ch09;
import java.util.*;
import java.awt.event.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.frames.*;

public class HuygensApp extends AbstractSimulation implements InteractiveMouseHandler {
   static final double PI2 = Math.PI*2;
   Scalar2DFrame frame = new Scalar2DFrame("x", "y", "Intensity from point sources");
   double time = 0;
   double[][] realPhasor, imagPhasor, amplitude;
   int n; // grid points on a side
   double a; // side length

   public HuygensApp() {
      frame.convertToInterpolatedPlot(); // interpolated plot looks best
      frame.setPaletteType(ColorMapper.RED);
      frame.setInteractiveMouseHandler(this);
   }

   public void initialize() {
      n = control.getInt("grid size");
      a = control.getDouble("length");
      frame.setPreferredMinMax(-a/2, a/2, -a/2, a/2);
      realPhasor = new double[n][n];
      imagPhasor = new double[n][n];
      amplitude = new double[n][n];
      frame.setAll(amplitude);
      initPhasors();
   }

   void initPhasors() {
      for(int ix = 0; ix<n; ix++) {
…

…

      if(radical>0) {
         double phase = z*Math.sqrt(radical);
         double real = Math.cos(phase);
         double imag = Math.sin(phase);
         double temp = cdata[offset+2*iy];
         cdata[offset+2*iy] = real*cdata[offset+2*iy]-imag*cdata[offset+2*iy+1];
         cdata[offset+2*iy+1] = real*cdata[offset+2*iy+1]+imag*temp;
      } else { // evanescent waves decay exponentially
         double decay = Math.exp(-z*Math.sqrt(-radical));
         cdata[offset+2*iy] *= decay;
         cdata[offset+2*iy+1] *= decay;
      }
   }
}
fft2d.inverse(cdata);
double max = 0;
for(int i = 0; i<…

…

if(N%2==0) { // N should be even
   pow++;
   N /= 2;
} else {
   throw new IllegalArgumentException(
      "Number of points in this FFT implementation must be even.");
}
…
int N2 = N/2;
int jj = N2;
// rearrange input according to bit reversal
for(int i = 1; i<…

…

         if(r2>0) {
            eField[0][ix][iy] += charge.q*dx/r3;
            eField[1][ix][iy] += charge.q*dy/r3;
         }
      }
   }
}
frame.setAll(eField);
}

public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
   panel.handleMouseAction(panel, evt); // panel moves the charge
   if(panel.getMouseAction()==InteractivePanel.MOUSE_DRAGGED) {
      calculateField(); // remove this line if user interface is sluggish
      panel.repaint();
   }
}

public static void main(String[] args) {
   CalculationControl.createApp(new ElectricFieldApp());
}
}

To make the program interactive, the ElectricFieldApp class implements the InteractiveMouseHandler to process mouse events when a charge is dragged. (See Section 5.7 for a discussion of interactive panels and interactive mouse handlers.) The class registers its interest in handling these events using the setInteractiveMouseHandler method. The handler passes the event to the panel to move the charge and then recalculates the field. Note that the Charge class in Listing 10.2 inherits from the InteractiveCircle class.²

²Dragging may become sluggish if too many computations are performed within the mouse action method.

Listing 10.2: The Charge class extends the InteractiveCircle class and adds the charge property.

package org.opensourcephysics.sip.ch10;
import java.awt.Color;
import org.opensourcephysics.display.InteractiveCircle;

public class Charge extends InteractiveCircle {
   double q = 0;

   public double getQ() {
      return q;
   }

   public Charge(double x, double y, double q) {
      super(x, y);
      this.q = q;
      if(q>0) {
         color = Color.red;
      } else {
         color = Color.blue;
      }
   }
}

Problem 10.1. Motion of a charged particle in an electric field

(a) Test ElectricFieldApp by adding one charge at a time at various locations. Do the electric field patterns look reasonable? For example, does the electric field point away from positive charges and toward negative charges? How well is the magnitude of the electric field represented?

(b) Modify ElectricFieldApp so that it uses AbstractSimulation to compute the motion of a test particle of mass m and charge q in the presence of the electric field created by a fixed distribution of point charges. That is, create a drawable test charge that implements the ODE interface and add it to the vector field frame. Use the same approach that was used for the trajectory problems in Chapter 5. The acceleration of the charge is given by qE/m, where E is the electric field due to the fixed point charges. Use a higher-order algorithm to advance the position and velocity of the particle. (Ignore the effects of radiation due to accelerating charges.)

(c) Assume that E is due to a charge q(1) = 1.5 fixed at the origin. Simulate the motion of a charged particle of mass m = 0.1 and charge q = 1 initially at x = 1, y = 0. Consider the following initial conditions for its velocity: v_x = 0, v_y = 0; v_x = 1, v_y = 0; v_x = 0, v_y = 1; and v_x = −1, v_y = 0. Is the trajectory of the particle tangent to the field vectors? Explain.
(d) Assume that the electric field is due to two fixed point charges: q(1) = 1 at x(1) = 2, y(1) = 0 and q(2) = −1 at x(2) = −2, y(2) = 0. Place a charged particle of unit mass and unit positive charge at x = 0.05, y = 0. What do you expect the motion of this charge to be? Do the simulation and determine the qualitative nature of the motion.

(e)* Consider the motion of a charged particle in the vicinity of the electric dipole defined in part (d). Choose the initial position to be five times the separation of the charges in the dipole. Do you find any bound orbits? Do you find any closed orbits, or do all orbits show some precession?

10.3 Electric Field Lines

Another way of visualizing the electric field is to draw electric field lines. The properties of these lines are as follows:

1. An electric field line is a directed line whose tangent at every position is parallel to the electric field at that position.

2. The lines are smooth and continuous except at singularities such as point charges. (It makes no sense to talk about the electric field at a point charge.)

3. The density of lines at any point in space is proportional to the magnitude of the field at that point. This property implies that the total number of electric field lines from a point charge is proportional to the magnitude of that charge. The value of the proportionality constant is chosen to provide the clearest pictorial representation of the field. The drawing of field lines is art plus science.

The FieldLineApp program draws electric field lines in two dimensions. The program makes extensive use of the FieldLine class, which implements the following algorithm:

1. Begin at a point (x,y) and compute the components E_x and E_y of the electric field vector E using (10.1).

2. Draw a small line segment of size Δs = |Δs| tangent to E at that point. The components of the line segment are given by
$$\Delta x = \Delta s\,\frac{E_x}{|\mathbf{E}|} \quad\text{and}\quad \Delta y = \Delta s\,\frac{E_y}{|\mathbf{E}|}. \tag{10.4}$$

3. Iterate the process beginning at the new point (x+Δx, y+Δy). Continue until the field line approaches a point charge singularity or escapes toward infinity.

This field line algorithm is equivalent to solving the following differential equations:

$$\frac{dx}{ds} = \frac{E_x}{|\mathbf{E}|} \tag{10.5a}$$

$$\frac{dy}{ds} = \frac{E_y}{|\mathbf{E}|}. \tag{10.5b}$$

Because a field line extends in both directions from the algorithm's starting point, the computation must be repeated in the (−E_x/|E|, −E_y/|E|) direction to obtain a complete visualization of the field line. Note that this algorithm draws a correct field line but does not draw a collection of field lines with a density proportional to the field intensity.

To draw the field lines, a computation starts when a user double clicks in the panel and ends when the field line approaches a point charge or when the magnitude of the field becomes too small. Although we can easily describe these stopping conditions, we do not know how long the computation will take, and we might want to compute multiple field lines simultaneously. An elegant way to do this computation is to use threads. As we discussed in Section 2.6, Java programs can have multiple threads to separate and organize related tasks. A thread is an independent task within a single program that shares the program's data with other threads.³ In the following example, we create a thread to compute the solution of the differential equation for an electric field line.

³The Open Source Physics User's Guide describes simulation threads in more detail.
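Before turning to that example, the following toy class (ours, not part of the Open Source Physics library) illustrates the bare thread life cycle: a Runnable object is created, a Thread is started on it, and the thread dies when the run method returns.

// Toy sketch (ours): the bare Runnable life cycle.
public class CountingTask implements Runnable {
   public void run() { // executed by the new thread after start is invoked
      for(int i = 0; i<5; i++) {
         System.out.println("step "+i);
         try {
            Thread.sleep(20); // pause; lets other threads run
         } catch(InterruptedException ex) {}
      }
   } // run returns here and the thread dies

   public static void main(String[] args) {
      Thread thread = new Thread(new CountingTask());
      thread.start(); // invokes run in a new thread of execution
   }
}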
It is natural to use threads in this context because the drawing of a field line involves starting the field line, drawing each piece of the field line, and then stopping the calculation when some stopping condition is met. The computation begins when the FieldLine object is created and ends when the stopping condition is satisfied.

A thread executes statements within an object, such as FieldLine, that implements the Runnable interface. This interface consists of a single method, the run method, and the thread executes the code within this method. The run method is not invoked directly but is invoked automatically by the thread after the thread is started. When the run method exits, the thread that invoked the run method stops executing and is said to die. After a thread dies, it cannot be restarted. Another thread must be created if we wish to invoke the run method a second time.

We build a FieldLine class that runs in its own thread and that adds the necessary drawing and differential equation capabilities using the Drawable and ODE interfaces, respectively. This class is shown in Listing 10.3.

Listing 10.3: The FieldLine class computes an electric field line using a Thread.

package org.opensourcephysics.sip.ch10;
import java.awt.Graphics;
import java.util.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.numerics.*;

public class FieldLine implements Drawable, ODE, Runnable {
   DrawingFrame frame;
   double[] state = new double[2]; // x and y position along the field line
   ODESolver odeSolver = new RK45MultiStep(this);
   ArrayList chargeList; // list of charged particles
   Trail trail;
   double stepSize;
   volatile boolean done = false;

   public FieldLine(DrawingFrame frame, double x0, double y0, double stepSize) {
      this.stepSize = stepSize;
      this.frame = frame;
      odeSolver.setStepSize(stepSize);
      state[0] = x0;
      state[1] = y0;
      chargeList = frame.getDrawables(Charge.class);
      trail = new Trail();
      trail.addPoint(x0, y0);
      Thread thread = new Thread(this);
      thread.start();
   }

   public double[] getState() {
      return state;
   }

   public void getRate(double[] state, double[] rate) {
      double ex = 0;
      double ey = 0;
      for(Iterator it = chargeList.iterator(); it.hasNext(); ) {
         Charge charge = (Charge) it.next();
         double dx = (charge.getX()-state[0]);
         double dy = (charge.getY()-state[1]);
         double r2 = dx*dx+dy*dy;
         double r = Math.sqrt(r2);
         if((r<2*stepSize)||(r>100)) { // done if too close or too far
            done = true;
         }
         ex += (r==0) ? 0 : charge.q*dx/r2/r;
         ey += (r==0) ? 0 : charge.q*dy/r2/r;
      }
      double mag = Math.sqrt(ex*ex+ey*ey);
      rate[0] = (mag==0) ? 0 : ex/mag;
      rate[1] = (mag==0) ? 0 : ey/mag;
   }

   public void run() {
      int counter = 0;
      while((counter<1000)&&!done) {
         odeSolver.step();
         trail.addPoint(state[0], state[1]);
         if(counter%50==0) { // repaint every 50th step
            frame.repaint();
            try {
               Thread.sleep(20); // give the event queue a chance
            } catch(InterruptedException ex) {}
         }
         counter++;
         Thread.yield();
      }
      frame.repaint();
   }

   public void draw(DrawingPanel panel, Graphics g) {
      trail.draw(panel, g);
   }
}
The FieldLine constructor saves a reference to the list of charges to calculate the electric field using (10.1). The loop in the run method solves the differential equation and stores the solution in a drawable trail. The loop is exited when the field line is close to a charge or when the magnitude of the field becomes too small. Because there are situations where the field line will never stop, this loop is executed no more than 1000 times.

The FieldLineApp program instantiates a field line when the user double clicks within the panel. Adding a charge or moving a charge removes all field lines from the panel. Study how the handleMouseAction method allows the user to drag charges and to initiate the drawing of field lines. You are asked to modify this program in Problem 10.2.

Listing 10.4: The FieldLineApp program computes an electric field line when the user clicks within the panel.

package org.opensourcephysics.sip.ch10;
import java.awt.event.MouseEvent;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.DisplayFrame;

public class FieldLineApp extends AbstractCalculation implements InteractiveMouseHandler {
   DisplayFrame frame = new DisplayFrame("x", "y", "Field lines");

   public FieldLineApp() {
      frame.setInteractiveMouseHandler(this);
      frame.setPreferredMinMax(-10, 10, -10, 10);
   }

   public void calculate() {
      frame.removeObjectsOfClass(FieldLine.class); // remove old field lines
      double x = control.getDouble("x");
      double y = control.getDouble("y");
      double q = control.getDouble("q");
      Charge charge = new Charge(x, y, q);
      frame.addDrawable(charge);
   }

   public void reset() {
      control.println("Calculate creates a new charge and clears the field lines.");
      control.println("You can drag charges.");
      control.println("Double click in display to compute a field line.");
      frame.clearDrawables(); // remove charges and field lines
      control.setValue("x", 0);
      control.setValue("y", 0);
      control.setValue("q", 1);
   }

   public void handleMouseAction(InteractivePanel panel, MouseEvent evt) {
      panel.handleMouseAction(panel, evt); // panel handles dragging
      switch(panel.getMouseAction()) {
      case InteractivePanel.MOUSE_DRAGGED :
         if(panel.getInteractive()==null) {
            return;
         }
         frame.removeObjectsOfClass(FieldLine.class); // field is invalid
         frame.repaint(); // repaint to keep the screen up to date
         break;
      case InteractivePanel.MOUSE_CLICKED :
         if(evt.getClickCount()>1) { // check for double click
            double x = panel.getMouseX(), y = panel.getMouseY();
            FieldLine fieldLine = new FieldLine(frame, x, y, +0.1);
            panel.addDrawable(fieldLine);
            fieldLine = new FieldLine(frame, x, y, -0.1);
            panel.addDrawable(fieldLine);
         }
         break;
      }
   }

   public static void main(String[] args) {
      CalculationControl.createApp(new FieldLineApp());
   }
}

Problem 10.2. Verification of field line program

(a) Draw field lines for a few simple sets of one, two, and three charges. Choose sets of charges for which all have the same sign and sets for which they are different. Verify that the field lines never connect charges of the same sign. Why do field lines never cross? Are the units of charge and distance relevant?

(b) Compare FieldLineApp and ElectricFieldApp.
Which representation conveys more information? Consider how each program provides (or does not provide) information about the electric field magnitude and direction. Discuss some of the difficulties with making an accurate field line diagram.

(c) FieldLine uses a constant value for Δs. Modify the algorithm so that the calculation continues when a field line moves off the screen, but speed up the algorithm by increasing the value of Δs.

(d) Removing a field line from the drawing panel in the reset method does not stop the thread. Improve the performance of the program by modifying FieldLineApp so that a field line's done variable is set to true when the field line is removed from the drawing panel.

Problem 10.3. Electric field lines from point charges

(a) Modify FieldLineApp so that a charge starts ten field lines per unit of charge whenever a new charge is added to the panel or when a charge is moved. Start these field lines close to each charge in such a way that they propagate away from the charge. Should you start these field lines on both positive and negative charges? Explain your answer.

(b) Draw the field lines for an electric dipole.

(c) Draw the field lines for the electric quadrupole with q(1) = 1, x(1) = 1, y(1) = 1, q(2) = −1, x(2) = −1, y(2) = 1, q(3) = 1, x(3) = −1, y(3) = −1, q(4) = −1, x(4) = 1, and y(4) = −1.

(d) A continuous charge distribution can be approximated by a large number of closely spaced point charges. Draw the electric field lines due to a row of ten equally spaced unit charges located between −2.5 and +2.5 on the x-axis. How does the electric field distribution compare to the distribution due to a single point charge?

(e) Repeat part (d) with two rows of equally spaced positive charges on the lines y = 0 and y = 1, respectively. Then consider one row of positive charges and one row of negative charges.

Problem 10.4. Field lines due to an infinite line of charge

(a) The FieldLineApp program plots field lines in two dimensions. Sometimes this restriction can lead to spurious results (see Freeman). Consider four identical charges placed at the corners of a square. Use the program to plot the field lines. What, if anything, is wrong with the results? What should happen to the field lines near the center of the square?

(b) The two-dimensional analog of a point charge is an infinite line (thin cylinder) of charge perpendicular to the plane. The electric field due to an infinite line of charge is proportional to the linear charge density and inversely proportional to the distance (instead of the distance squared) from the line of charge to a point in the plane. Modify the FieldLine class to compute the field lines from line charges with E(r) = 1/r. Use your modified class to draw the field lines due to four identical line charges located at the corners of a square and compare the field lines with your results in part (a).

(c) Use your modified program from part (b) to draw the field lines for the two-dimensional analogs of the distributions considered in Problem 10.3. Compare the results for two and three dimensions and discuss any qualitative differences.

(d) Can your program be used to demonstrate Gauss's law using point charges? What about line charges?

10.4 Electric Potential

It often is easier to analyze the behavior of a system using energy rather than force concepts. We define the electric potential V(r) by the relation

$$V(\mathbf{r}_2) - V(\mathbf{r}_1) = -\int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} \tag{10.6}$$

or

$$\mathbf{E}(\mathbf{r}) = -\nabla V(\mathbf{r}). \tag{10.7}$$
10.4 Electric Potential

It often is easier to analyze the behavior of a system using energy rather than force concepts. We define the electric potential V(r) by the relation

V(\mathbf r_2) - V(\mathbf r_1) = -\int_{\mathbf r_1}^{\mathbf r_2} \mathbf E \cdot d\mathbf r    (10.6)

or

\mathbf E(\mathbf r) = -\nabla V(\mathbf r).    (10.7)

Only differences in the potential between two points have physical significance. The gradient operator ∇ is given in Cartesian coordinates by

\nabla = \frac{\partial}{\partial x}\,\hat{\mathbf x} + \frac{\partial}{\partial y}\,\hat{\mathbf y} + \frac{\partial}{\partial z}\,\hat{\mathbf z}    (10.8)

where the vectors x̂, ŷ, and ẑ are unit vectors along the x-, y-, and z-axes, respectively. If V depends only on the magnitude of r, then (10.7) becomes E(r) = −dV(r)/dr. Recall that V(r) for a point charge q relative to a zero potential at infinity is given by

V(r) = \frac{q}{r}    (Gaussian units).    (10.9)

A surface on which the electric potential has the same value everywhere is called an equipotential surface (a curve in two dimensions). Because E points in the direction in which the electric potential decreases most rapidly, the electric field lines are orthogonal to the equipotential surfaces at every point.

The Open Source Physics frames package contains the Scalar2DFrame class, which provides graphical representations of scalar fields (see Appendix 9B). Problem 10.5 uses a scalar field plot to show the electric potential. The following code fragment shows how to calculate the electric potential at the grid points.

List chargeList = frame.getDrawables(Charge.class);
Iterator it = chargeList.iterator();
while(it.hasNext()) {
  Charge charge = (Charge) it.next();
  double xs = charge.getX(), ys = charge.getY();
  for(int ix = 0; ix<n; ix++) {
    double x = frame.indexToX(ix);
    double dx = (xs-x); // charge-gridpoint separation
    for(int iy = 0; iy<n; iy++) {
      double y = frame.indexToY(iy);
      double dy = (ys-y); // charge-gridpoint separation
      double r2 = dx*dx+dy*dy;
      double r = Math.sqrt(r2);
      if(r>0) {
        eField[ix][iy] += charge.q/r;
      }
    }
  }
}
frame.setAll(eField);

Problem 10.5. Equipotential contours

(a) Write a program based on ElectricFieldApp that draws equipotential lines for the charge distributions considered in Problem 10.3.

(b) Explain why equipotential surfaces (lines in two dimensions) never cross.

We can use the orthogonality between the electric field lines and the equipotential lines to modify FieldLineApp so that it draws the latter. Because the components of a line segment ∆s parallel to the electric field line are given by ∆x = ∆s(Ex/E) and ∆y = ∆s(Ey/E), the components of a line segment perpendicular to E, and hence parallel to the equipotential line, are given by ∆x = −∆s(Ey/E) and ∆y = ∆s(Ex/E). It does not matter whether the minus sign is assigned to the x or the y component; the only difference is the direction in which the equipotential lines are drawn.

Problem 10.6. Equipotential lines

(a) Write a program based on FieldLineApp and FieldLine that draws some of the equipotential lines for the charge distributions considered in Problem 10.3. Use a mouse click to determine the starting position of an equipotential line. The equipotential calculation should stop when the line returns close to its starting point or after an unreasonably large number of steps. The program should also kill the thread when the user moves a charge, hits the Reset button, or when the application terminates.

(b) What would a higher density of equipotential lines mean if we drew the lines such that each adjacent line differed from its neighbor by a fixed potential difference?

(c) Explain why equipotential surfaces never cross.
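For part (a) of Problem 10.6, the perpendicular step described above takes only a few lines of code. The following sketch uses our own names: evalField is a hypothetical helper that returns {Ex, Ey} at (x, y) by summing the point charge fields, and trail is an Open Source Physics Trail:

// One step along an equipotential: move perpendicular to E.
double[] e = evalField(x, y);                      // assumed helper: returns {Ex, Ey}
double magnitude = Math.sqrt(e[0]*e[0]+e[1]*e[1]);
if(magnitude>0) {
  x += -ds*e[1]/magnitude; // dx = -ds Ey/E
  y += ds*e[0]/magnitude;  // dy = +ds Ex/E
  trail.addPoint(x, y);    // accumulate the curve for drawing
}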
Problem 10.7. The electric potential due to a finite sheet of charge

Consider a uniformly charged nonconducting plate of total charge Q and linear dimension L centered at (0, 0, 0) in the x-y plane. In the limit L → ∞ with the charge density σ = Q/L² held constant, we know that the electric field is normal to the sheet and that its magnitude is 2πσ (Gaussian units). What is the electric field due to a finite sheet of charge? A simple method is to divide the plate into a grid of p square regions on a side such that each region is sufficiently small to be approximated by a point charge of magnitude q = Q/p². Because the potential is a scalar, it is easier to compute the total potential rather than the total electric field of the N = p² point charges. Use the relation (10.9) for the potential of a point charge and write a program to compute V(z), and hence Ez = −∂V(z)/∂z, for points along the z-axis perpendicular to the sheet. Take L = 1, Q = 1, and p = 10 for your initial calculations. Increase p until your results for V(z) do not change significantly. Plot V(z) and Ez as functions of z and compare their z-dependence to their infinite sheet counterparts.

∗Problem 10.8. Electrostatic shielding

We know that the (static) electric field is zero inside a conductor, that all excess charge resides on the surface of the conductor, and that the surface charge density is greatest at the points of greatest curvature. Although these properties are plausible, it is instructive to do a simulation to see how they follow from Coulomb's law. For simplicity, consider the conductor to be two-dimensional, so that the potential energy is proportional to ln r rather than 1/r (see Problem 10.4). It also is convenient to choose the surface of the conductor to be an ellipse.

(a) If we are interested only in the final distribution of the charges and not in the dynamics of the system, we can use a Monte Carlo method. Our goal is to find the minimum energy configuration beginning with N charges randomly placed within a conducting ellipse. One method is to choose a charge i at random and make a trial change in its position. The trial position should be no more than δ from the old position and still within the ellipse. Choose δ ≈ b/10, where b is the semiminor axis of the ellipse. Compute the change in the total potential energy, given (in arbitrary units) by

\Delta U = -\sum_{j} \bigl[\ln r_{ij}^{\,\rm new} - \ln r_{ij}^{\,\rm old}\bigr].    (10.10)

The sum is over all charges in the system not including i. If ∆U > 0, reject the trial move; otherwise, accept it. Repeat this procedure many times until very few trial moves are accepted. Write a program to implement this Monte Carlo algorithm, as sketched after this problem. Run the simulation for N ≥ 20 charges inside a circle and then repeat the simulation for an ellipse. How are the charges distributed in the (approximately) minimum energy configuration? Which parts of the ellipse have a higher charge density?

(b) Repeat part (a) for a two-dimensional conductor, but assume that the potential energy U ∼ 1/r. Do the charges move to the surface?

(c) Is it sufficient that the interaction be repulsive for the results of parts (a) and (b) to hold?

(d) Repeat part (a) with the added condition that there is a fixed positive charge of magnitude N/2 located outside the ellipse. How does this fixed charge affect the charge distribution? Are the excess free charges still at the surface? Try different positions for the fixed charge.

(e) Repeat parts (a) and (b) for N = 50 charges located within an ellipsoid in three dimensions.
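A minimal sketch of one trial move for part (a) follows; the array names x[] and y[], the axes a and b, and the method name trialMove are our own, not the text's:

// One Monte Carlo trial move for Problem 10.8(a).
// x[], y[] hold the charge positions; a and b are the semimajor and
// semiminor axes of the ellipse; delta is the maximum displacement.
void trialMove(double[] x, double[] y, double a, double b, double delta) {
  int i = (int)(Math.random()*x.length);          // choose a charge at random
  double xTrial = x[i]+delta*(2*Math.random()-1); // trial displacement up to
  double yTrial = y[i]+delta*(2*Math.random()-1); // delta in each direction
  if((xTrial*xTrial)/(a*a)+(yTrial*yTrial)/(b*b)>1) {
    return;                                       // trial position outside the ellipse
  }
  double dU = 0;                                  // energy change from (10.10)
  for(int j = 0; j<x.length; j++) {
    if(j!=i) {
      double rOld = Math.hypot(x[i]-x[j], y[i]-y[j]);
      double rNew = Math.hypot(xTrial-x[j], yTrial-y[j]);
      dU -= Math.log(rNew)-Math.log(rOld);
    }
  }
  if(dU<=0) {                                     // accept only downhill moves
    x[i] = xTrial;
    y[i] = yTrial;
  }
}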
10.5 Numerical Solutions of Boundary Value Problems

In Section 10.1 we found the electric fields and potentials due to a fixed distribution of charges. Suppose that we do not know the positions of the charges and instead know only the potential on a set of boundaries surrounding a charge-free region. This information is sufficient to determine the potential V(r) at any point within the charge-free region. The direct method of solving for V(x, y, z) is based on Laplace's equation, which can be expressed in Cartesian coordinates as

\nabla^2 V(x,y,z) \equiv \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0.    (10.11)

The problem is to find the function V(x, y, z) that satisfies (10.11) and the specified boundary conditions. This type of problem is an example of a boundary value problem. Because analytic methods for regions of arbitrary shape do not exist, the only general approach is to use numerical methods. Laplace's equation is not a new law of physics; it can be derived directly from (10.7) and the relation ∇ · E = 0, or indirectly from Coulomb's law, in regions of space where there is no charge.

For simplicity, we consider only two-dimensional boundary value problems for V(x, y). We use a finite difference method and divide space into a discrete grid of sites located at the coordinates (x, y). In Problem 10.9(b) we show that in the absence of a charge at (x, y), the discrete form of Laplace's equation satisfies the relation

V(x,y) \approx \frac{1}{4}\bigl[V(x+\Delta x,y) + V(x-\Delta x,y) + V(x,y+\Delta y) + V(x,y-\Delta y)\bigr]    (two dimensions)    (10.12)

where V(x, y) is the value of the potential at the site (x, y). Equation (10.12) says that V(x, y) is the average of the potential at its four nearest neighbor sites. This remarkable property of V(x, y) can be derived by approximating the partial derivatives in (10.11) by finite differences (see Problem 10.9(b)). In Problem 10.9(a) we verify (10.12) by calculating the potential due to a point charge at a point in space we select and at its four nearest neighbors. As the form of (10.12) implies, the average of the potential at the four neighboring sites should equal the potential at the center site. We assume the form (10.9) for the potential V(r) due to a point charge, a form that satisfies Laplace's equation for r ≠ 0.

Problem 10.9. Verification of the difference equation for the potential

(a) Modify PotentialFieldApp to compare the computed potential at a point to the average of the potential at its four nearest neighbor sites. Choose reasonable values for the spacings ∆x and ∆y, and consider a point that is not too close to the source charge. Do similar measurements for other points. Does the relative agreement with (10.12) depend on the distance of the point from the source charge? Choose smaller values of ∆x and ∆y and determine whether your results agree better with (10.12). Does it matter whether ∆x and ∆y have the same value?

(b) Derive the finite difference equation (10.12) for V(x, y) using the second-order Taylor expansions:

V(x+\Delta x, y) = V(x,y) + \Delta x\,\frac{\partial V(x,y)}{\partial x} + \frac{1}{2}(\Delta x)^2\,\frac{\partial^2 V(x,y)}{\partial x^2} + \cdots    (10.13)

V(x, y+\Delta y) = V(x,y) + \Delta y\,\frac{\partial V(x,y)}{\partial y} + \frac{1}{2}(\Delta y)^2\,\frac{\partial^2 V(x,y)}{\partial y^2} + \cdots.    (10.14)

The effect of including higher derivatives is discussed by MacDonald (see references).
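The derivation in part (b) amounts to adding the expansions for ±∆x and ±∆y so that the first derivative terms cancel; as a sketch:

V(x+\Delta x,y) + V(x-\Delta x,y) \approx 2V(x,y) + (\Delta x)^2\,\frac{\partial^2 V}{\partial x^2}

V(x,y+\Delta y) + V(x,y-\Delta y) \approx 2V(x,y) + (\Delta y)^2\,\frac{\partial^2 V}{\partial y^2}.

Adding these two relations with ∆x = ∆y and using Laplace's equation ∂²V/∂x² + ∂²V/∂y² = 0 to eliminate the second derivative terms leaves 4V(x, y) on one side and the sum of the four neighboring potentials on the other, which is (10.12).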
Now that we have found that (10.12), a finite difference form of Laplace's equation, is consistent with Coulomb's law, we adopt (10.12) as the basis for computing the potential for systems for which we cannot calculate the potential directly. In particular, we consider problems where the potential is specified on a closed surface that divides space into interior and exterior regions in which the potential is determined independently. For simplicity, we consider only two-dimensional geometries. The approach, known as the relaxation method, is based on the following algorithm:

1. Divide the region of interest into a rectangular grid of sites spanning the region. The region is enclosed by a surface (a curve in two dimensions) with specified values of the potential along it.

2. Assign to each boundary site the potential of the boundary point nearest the site.

3. Assign all interior sites an arbitrary potential (preferably a reasonable guess).

4. Compute new values of the potential V for each interior site. Each new value is obtained by averaging the previous values of the potential at the four nearest neighbor sites.

5. Repeat step (4) using the values of V obtained in the previous iteration. This iterative process is continued until the potential at each interior site is computed to the desired accuracy.

The program shown in Listing 10.5 implements this algorithm using a grid of voltages and a boolean grid to flag the presence of a conductor.

Listing 10.5: The LaplaceApp program solves Laplace's equation using the relaxation method.

package org.opensourcephysics.sip.ch10;
import java.awt.event.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.frames.*;

public class LaplaceApp extends AbstractSimulation implements InteractiveMouseHandler {
  Scalar2DFrame frame = new Scalar2DFrame("x", "y", "Electric potential");
  boolean[][] isConductor;
  double[][] potential; // electric potential
  double maximumError;
  int gridSize;         // number of sites on a side of the grid

  public LaplaceApp() {
    frame.setInteractiveMouseHandler(this);
  }

  public void initialize() {
    maximumError = control.getDouble("maximum error");
    gridSize = control.getInt("size");
    initArrays();
    frame.setVisible(true);
    frame.showDataTable(true); // show the data table
  }

  public void initArrays() {
    isConductor = new boolean[gridSize][gridSize];
    potential = new double[gridSize][gridSize];
    frame.setPaletteType(ColorMapper.DUALSHADE);
    // isConductor array is false by default
    // voltage in potential array is 0 by default

Describe the field lines for t > 0.5. Does the particle accelerate at any time? Is there any radiation?

Problem 10.22. Frequency dependence of an oscillating charge

(a) The radiated power at any point in space is proportional to E². Plot |E| versus time at a fixed observation point (for example, X = 10, Y = Z = 0) and calculate the frequency dependence of the amplitude of |E| due to a charge oscillating at the frequency ω. It is shown in standard textbooks that the power associated with radiation from an oscillating dipole is proportional to ω⁴. How does the ω-dependence that you measure compare to that for dipole radiation? Repeat for a much larger value of R and explain any differences.
(b) Repeat part (a) for a charge moving in a circle. Are there any qualitative differences?

10.8 *Maxwell's Equations

In Section 10.7 we found that accelerating charges produce electric and magnetic fields that depend on position and time. We now investigate the direct relation between changes in E and B given by the differential form of Maxwell's equations:

\frac{\partial \mathbf B}{\partial t} = -c\,\nabla \times \mathbf E    (10.46)

\frac{\partial \mathbf E}{\partial t} = c\,\nabla \times \mathbf B - 4\pi \mathbf j    (10.47)

where j is the electric current density. We can regard (10.46) and (10.47) as the basis of electrodynamics. In addition to (10.46) and (10.47), we need the relation between j and the charge density ρ that expresses the conservation of charge:

\frac{\partial \rho}{\partial t} = -\nabla \cdot \mathbf j.    (10.48)

A complete description of electrodynamics requires (10.46), (10.47), and (10.48), together with the initial values of all currents and fields.

For completeness, we obtain the Maxwell's equations that involve ∇ · B and ∇ · E by taking the divergence of (10.46) and (10.47), substituting (10.48) for ∇ · j, and then integrating over time. If the initial fields are zero, we obtain (using the relation ∇ · (∇ × a) = 0 for any vector a)

\nabla \cdot \mathbf E = 4\pi\rho    (10.49)

\nabla \cdot \mathbf B = 0.    (10.50)

If we introduce the electric and magnetic potentials, it is possible to convert the first-order equations (10.46) and (10.47) into second-order differential equations. However, the familiar first-order equations are better suited for numerical analysis.

To solve (10.46) and (10.47) numerically, we need to interpret the curl and divergence of a vector. As its name implies, the curl of a vector measures how much the vector twists around a point. A coordinate-free definition of the curl of an arbitrary vector W is

(\nabla \times \mathbf W)\cdot \hat{\mathbf S} = \lim_{S\to 0}\frac{1}{S}\oint_C \mathbf W \cdot d\mathbf l    (10.51)

where S is the area of any surface bordered by the closed curve C, and Ŝ is a unit vector normal to the surface S. Equation (10.51) gives the component of ∇ × W in the direction of Ŝ and suggests a way of computing the curl numerically.

We divide space into cubes of linear dimension ∆l. The rectangular components of W can be defined either on the edges or on the faces of the cubes. We compute the curl using both definitions. We first consider a vector B that is defined on the edges of the cubes, so that the curl of B is defined on the faces. (We use the notation B because we will find that it is convenient to define the magnetic field in this way.) Associated with each cube is one edge vector and one face vector. We label a cube by the coordinates of its lower left front corner; the three components of B associated with this cube are shown in Figure 10.6a. The other edges of the cube are associated with B vectors defined at neighboring cubes. The discrete version of (10.51) for the component of ∇ × B defined on the front face of the cube (i, j, k) is

(\nabla \times \mathbf B)\cdot \hat{\mathbf S} = \frac{1}{(\Delta l)^2}\sum_{i=1}^{4} B_i\,\Delta l_i    (10.52)

where S = (∆l)², and the Bi and ∆li are shown in Figures 10.6b and 10.6c, respectively. Note that two of the Bi are associated with neighboring cubes.

The components of a vector can also be defined on the faces of the cubes. We call this vector E because it will be convenient to define the electric field in this way. In Figure 10.7(a) we show the components of E associated with the cube (i, j, k). Because E is normal to a cube face, the components of ∇ × E lie on the edges. The form of the discrete version of ∇ × E is similar to (10.52) with Bi replaced by Ei, where the Ei and ∆li are shown in Figures 10.7(b) and 10.7(c), respectively.
The z-component of ∇ × E lies along the left edge of the front face.

A coordinate-free definition of the divergence of the vector field W is

\nabla \cdot \mathbf W = \lim_{V\to 0}\frac{1}{V}\oint_S \mathbf W \cdot d\mathbf S    (10.53)

where V is the volume enclosed by the closed surface S. The divergence measures the average flow of the vector through a closed surface. An example of the discrete version of (10.53) is given in (10.54).

Figure 10.6: Calculation of the curl of B defined on the edges of a cube. (a) The edge vector B associated with cube (i, j, k). (b) The components Bi along the edges of the front face of the cube: B1 = Bx(i, j, k), B2 = Bz(i+1, j, k), B3 = −Bx(i, j, k+1), and B4 = −Bz(i, j, k). (c) The vector components ∆li on the edges of the front face. (The y-component of ∇ × B defined on the face points in the negative y direction.)

We now discuss where to define the quantities ρ, j, E, and B on the grid. It is natural to define the charge density ρ at the center of a cube. From the continuity equation (10.48), we see that this definition leads us to define j at the faces of the cube. Hence, each face of a cube has a number associated with it corresponding to the current density flowing parallel to the outward normal to that face. Given the definition of j on the grid, we see from (10.47) that the electric field E and j should be defined at the same places, and hence we define the electric field on the faces of the cubes. Because E is defined on the faces, it is natural to define the magnetic field B on the edges of the cubes. Our definitions of the vectors j, E, and B on the grid are now complete.

We label the faces of cube c by the symbol fc. If we use the simplest finite difference method with a discrete time step ∆t and discrete spatial interval ∆x = ∆y = ∆z ≡ ∆l, we can write the continuity equation as

\rho\bigl(c,\, t+\tfrac{1}{2}\Delta t\bigr) - \rho\bigl(c,\, t-\tfrac{1}{2}\Delta t\bigr) = -\frac{\Delta t}{\Delta l}\sum_{f_c=1}^{6} j(f_c, t).    (10.54)

The factor of 1/∆l comes from the area of a face, (∆l)², used in the surface integral in (10.53), divided by the volume (∆l)³ of a cube. In the same spirit, the discretization of (10.47) can be written as

E\bigl(f,\, t+\tfrac{1}{2}\Delta t\bigr) - E\bigl(f,\, t-\tfrac{1}{2}\Delta t\bigr) = \Delta t\,\bigl[(\nabla \times B)(f,t) - 4\pi j(f, t)\bigr].    (10.55)

Figure 10.7: Calculation of the curl of the vector E defined on the faces of a cube. (a) The face vector E associated with the cube (i, j, k). The components associated with the left, front, and bottom faces are Ex(i, j, k), Ey(i, j, k), and Ez(i, j, k), respectively. (b) The components Ei on the faces that share the front left edge of the cube (i, j, k): E1 = Ex(i, j−1, k), E2 = Ey(i, j, k), E3 = −Ex(i, j, k), and E4 = −Ey(i−1, j, k). The cubes associated with E1 and E4 are also shown. (c) The vector components ∆li on the faces that share the left front edge of the cube. (The z-component of the curl of E defined on the left edge points in the positive z direction.)

Note that E in (10.55) and ρ in (10.54) are defined at different times than j. As usual, we choose units such that c = 1. We next need to define a square around which we can discretize the curl. If E is defined on the faces, it is natural to use the square that borders each face. As we have discussed, this choice implies that we should define the magnetic field on the edges of the cubes.
We write (10.55) explicitly as

E\bigl(f,\, t+\tfrac{1}{2}\Delta t\bigr) - E\bigl(f,\, t-\tfrac{1}{2}\Delta t\bigr) = \Delta t\,\Bigl[\frac{1}{\Delta l}\sum_{e_f=1}^{4} B(e_f, t) - 4\pi j(f, t)\Bigr]    (10.56)

where the sum is over ef, the four edges of the face f (see Figure 10.7b). Note that B is defined at the same time as j. In a similar way we can write the discrete form of (10.46) as

B(e,\, t+\Delta t) - B(e,\, t) = -\frac{\Delta t}{\Delta l}\sum_{f_e=1}^{4} E\bigl(f_e,\, t+\tfrac{1}{2}\Delta t\bigr)    (10.57)

where the sum is over fe, the four faces that share the edge e (see Figure 10.7b).

We now have a well-defined algorithm for computing the spatial dependence of the electric and magnetic fields, the charge density, and the current density as functions of time. This algorithm was developed by Yee, an electrical engineer, in 1966, and independently by Visscher, a physicist, in 1988, who also showed that all of the integral relations and other theorems satisfied by the continuum fields are also satisfied by the discrete fields.

Usually, the most difficult part of this algorithm is specifying the initial conditions, because we cannot simply place a charge somewhere. The reason is that the initial fields appropriate for this charge would not be present. Indeed, our rules for updating the fields and the charge densities reflect the fact that the electric and magnetic fields do not appear instantaneously at all positions in space when a charge appears, but instead evolve from the initial appearance of a charge. Of course, charges do not appear out of nowhere, but appear by dissociating from neutral objects. Conceptually, the simplest initial condition corresponds to two charges of opposite sign moving apart from each other. This condition corresponds to an initial current on one face. From this current a charge density, and thus an electric field, appears using (10.54) and (10.56), respectively, and a magnetic field appears using (10.57).

Because we cannot compute the fields on an infinite lattice, we need to specify the boundary conditions. The easiest method is to use fixed boundary conditions such that the fields vanish at the edges of the lattice. If the lattice is sufficiently large, fixed boundary conditions are a reasonable approximation. However, fixed boundary conditions usually lead to nonphysical reflections off the edges, and a variety of other approaches have been used, including boundary conditions equivalent to a conducting medium that gradually absorbs the fields. In some cases physically motivated boundary conditions can be employed. For example, in simulations of microwave cavity resonators (see Problem 10.24), the appropriate boundary conditions are that the tangential component of E and the normal component of B vanish at the boundary.

As we have noted, E and ρ are defined at different times than B and j. This half-step approach leads to well-behaved equations that are stable over a range of parameters. An analysis of the stability requirement for the Yee–Visscher algorithm shows that the time step ∆t must be small enough compared to the spatial grid interval ∆l that

c\,\Delta t \le \frac{\Delta l}{\sqrt{3}}    (stability requirement).    (10.58)
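To make the update structure of (10.56) and (10.57) concrete, here is a minimal sketch of a two-dimensional (transverse magnetic) slice of the leapfrog step with c = 1. The array names Ez, Bx, By, and Jz are our own and not those of the Maxwell class listed below; boundary cells are simply left untouched:

// One leapfrog step on a square grid: E is advanced using the discrete
// curl of B minus the current, as in (10.56); B is then advanced using
// the discrete curl of the updated E, as in (10.57).
void step(double[][] Ez, double[][] Bx, double[][] By, double[][] Jz,
          double dt, double dl) {
  int n = Ez.length;
  for(int i = 1; i<n-1; i++) {
    for(int j = 1; j<n-1; j++) {
      Ez[i][j] += dt*((By[i][j]-By[i-1][j]-Bx[i][j]+Bx[i][j-1])/dl
                      -4*Math.PI*Jz[i][j]);
    }
  }
  for(int i = 0; i<n-1; i++) {
    for(int j = 0; j<n-1; j++) {
      Bx[i][j] -= dt*(Ez[i][j+1]-Ez[i][j])/dl; // dBx/dt = -(curl E)_x
      By[i][j] += dt*(Ez[i+1][j]-Ez[i][j])/dl; // dBy/dt = -(curl E)_y
    }
  }
}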
The Maxwell class implements the Yee–Visscher finite difference algorithm for solving Maxwell's equations. The field and current data are stored in the multidimensional arrays E, B, and J. The first index determines the vector component; the last three indices represent the three spatial coordinates. The current method models a positive current flowing for one time unit. This current flow produces both electric and magnetic fields. Because charge is conserved, the current flow produces an electrostatic dipole: negative charge remains at the source, and positive charge is deposited at the destination. Note that the doStep method invokes a damping method that reduces the fields at points near the boundaries, thereby absorbing the emitted radiation and reducing the reflected electromagnetic waves. Your understanding of the Yee–Visscher algorithm for finding solutions of Maxwell's equations will be enhanced by carefully reading the MaxwellApp program and the Maxwell class.

Listing 10.8: The Maxwell class implements the Yee–Visscher finite difference approximation to Maxwell's equations.

package org.opensourcephysics.sip.ch10;

public class Maxwell {
  // static variables determine units and time scale
  static final double pi4 = 4*Math.PI;
  static final double dt = 0.03;
  static final double dl = 0.1;
  static final double escale = dl/(4*Math.PI*dt);
  static final double bscale = escale*dl/dt;
  static final double jscale = 1;
  double dampingCoef = 0.1; // damping coefficient near boundaries
  int size;
  double t; // time
  double[][][][] E, B, J;

  public Maxwell(int size) {
    this.size = size;
    // 3D arrays for the electric field, magnetic field, and current;
    // the last three indices indicate the location, and the first index
    // indicates the x, y, or z component
    E = new double[3][size][size][size];
    B = new double[3][size][size][size];
    J = new double[3][size][size][size];
  }

  public void doStep() {
    current(t); // update the current
    computeE(); // step electric field
    computeB(); // step magnetic field
    damping();  // damp transients
    t += dt;
  }

  void current(double t) {
    final int mid = size/2;
    double delta = 1.0;
    for(int i = -3; i<5; i++) {
      J[0][mid+i][mid][mid] = (t<delta) ? jscale : 0; // current flows for one time unit
    }
  }
}

Function f;
try {
  f = new ParsedFunction(fstring);
} catch(ParserException ex) {
  control.println(ex.getMessage());
  return;
}
plotFrame.clearData();
double[] range = Util.getDomain(f, a, b, 100);
plotFrame.setPreferredMinMax(a-(b-a)/4, b+(b-a)/4, range[0], range[1]);
FunctionDrawer func = new FunctionDrawer(f);
func.color = java.awt.Color.RED;
plotFrame.addDrawable(func);
double x = a;

  if(site>=0&&lattice.getAtIndex(site)==-1) {
    colorCluster(site);                  // color cluster to which site belongs
    clusterNumber = (clusterNumber+1)%7; // cycle through 7 cluster colors
    lattice.repaint();                   // display lattice with colored cluster
  }
}

// Occupies all sites with probability p
public void calculate() {
  L = control.getInt("Lattice size");
  lattice.resizeLattice(L, L); // resize lattice
  // same seed will generate same set of random numbers
  random.setSeed(control.getInt("Random seed"));
  double p = control.getDouble("Site occupation probability");
getDouble ( "Site occupation probability" ) ; / / occupy l a t t i c e s i t e s with p r o b a b i l i t y p for ( int i = 0; i 0) { / / get next s i t e to t e s t and remove i t from l i s t int s i t e = sitesToTest [−−numSitesToTest ] ; for ( int j = 0; j <4; j ++) { / / v i s i t four p o s s i b l e neighbors int neighborSite = getNeighbor ( site , j ) ; / / t e s t i f n e i g h b o r S i t e i s occupied , and not yet added to c l u s t e r i f ( neighborSite >=0&&l a t t i c e . getAtIndex ( neighborSite )==−1) { / / c o l o r n e i g h b o r S i t e according to clusterNumber l a t t i c e . setAtIndex ( neighborSite , clusterNumber ) ; / / add n e i g h b o r S i t e to s i t e s T o T e s t [ ] sitesToTest [ numSitesToTest++] = neighborSite ; } } } } public void reset ( ) { control . setValue ( "Lattice size" , 32); control . setValue ( "Site occupation probability" , 0.5927); control . setValue ( "Random seed" , 100); calculate ( ) ; } public s t a t i c void main ( String args [ ] ) { CalculationControl control = CalculationControl . createApp (new PercolationApp ( ) ) ; } } The percolation threshold pc is defined as the site occupation probability p at which a spanning cluster first appears in an infinite lattice. However, for a finite lattice, there is a nonzero probability of a spanning cluster connecting one side of the lattice to the opposite side for any value of p > 0. For small p, this probability is order pL (see Figure 12.4), and the probability of spanning goes to zero as L becomes large. Hence, for small p and sufficiently large L, only finite clusters exist. For a finite lattice, the definition of spanning is arbitrary. For example, we can define a spanning cluster as one that (i) spans the lattice either horizontally or vertically; (ii) spans the lattice in a fixed direction, for example, vertically; or (iii) spans the lattice both horizontally and CHAPTER 12. PERCOLATION 451 Figure 12.4: An example of a spanning cluster with a probability proportional to pL on a L = 8 lattice. The probability of a spanning cluster with more sites will be proportional to a higher power of p. vertically. These spanning rules are based on open (nonperiodic) boundary conditions, which we will use because the resulting clusters are easier to visualize and determine. The criterion for defining pc(L) for a finite lattice is also somewhat arbitrary. One possibility is to define pc(L) as the mean value of p at which a spanning cluster first appears. Another possibility is to define pc(L) as the value of p for which half of the configurations generated at random span the lattice. These criteria will lead to the same extrapolated value for pc in the limit L → ∞. In Problem 12.1 we will find an estimated value for pc(L) that is accurate to about 10%. A more sophisticated analysis discussed in Project 12.13 allows us to extrapolate our results for pc(L) to L → ∞. In Project 12.17 we will discuss the use of periodic boundary conditions to define the clusters. Problem 12.1. Site percolation on the square lattice (a) Use PercolationApp to generate random site percolation configurations on a square lattice. Estimate pc(L) by finding the mean value of p at which a spanning cluster first occurs. For a given seed, the calculate method assigns a random number to each site and determines the occupancy of each site by comparing the sites’s random number to p. Choose one of the spanning rules and begin with a value of p for which a spanning cluster is unlikely to be present. 
Then systematically increase p until you find a spanning cluster. Then choose a new seed and, hence, a new set of random numbers. Repeat this procedure for at least ten configurations and find the average value of pc(L). (Each configuration corresponds to a different set of random numbers.)

(b) Repeat part (a) for larger values of L. Is pc(L) better defined for larger L; that is, are the values of pc(L) spread over a smaller range? How quickly can you visually determine the existence of a spanning cluster? Describe your visual algorithm for deciding whether a spanning cluster exists.

(c) Choose L ≥ 1024 and generate a configuration of sites at p = pc. For this value of L, you will not be able to distinguish the individual sites. Click on the lattice until you generate some large clusters. Describe their visual appearance. For example, are they compact or ramified?

The value of pc depends on the symmetry of the lattice and on its dimension. In addition to the square lattice, the most common two-dimensional lattice is the triangular lattice. As discussed in Chapter 8, the essential difference between the square and triangular lattices is the number of nearest neighbors.

∗Problem 12.2. Site percolation on the triangular lattice

Modify PercolationApp to simulate random site percolation on a triangular lattice. Assume that a spanning path connects the top and bottom sides of the lattice (see Figure 12.5). Do you expect pc for the triangular lattice to be smaller or larger than the value of pc for the square lattice? Estimate pc(L) for increasing values of L. Are your results for pc consistent with your expectations? As we discuss in the following, the exact value of pc for the triangular lattice is pc = 1/2.

In bond percolation each lattice site is occupied, but only a fraction of the sites have connections or occupied bonds between them and their nearest neighbor sites (see Figure 12.6). Each bond either is occupied with probability p or is not occupied with probability 1 − p. A cluster is a group of sites connected by occupied bonds. The wire mesh described in Section 12.1 is an example of bond percolation if we imagine cutting the bonds between the nodes rather than removing the nodes themselves. An application of bond percolation to the description of gelation is discussed in Problem 12.3.

For bond percolation on the square lattice, the exact value of pc can be obtained by introducing the dual lattice. The nodes of the dual lattice are the centers of the squares between the nodes of the original lattice (see Figure 12.7). The occupied bonds of the dual lattice are those that do not cross an occupied bond of the original lattice. Because every occupied bond on the dual lattice crosses exactly one unoccupied bond of the original lattice, the probability p̃ of an occupied bond on the dual lattice is 1 − p, where p is the probability of an occupied bond on the original lattice. If we assume that the dual lattice percolates if and only if the original lattice does not, and vice versa, then at threshold pc = 1 − pc, or pc = 1/2. This assumption holds for bond percolation on a square lattice because if a cluster in the original lattice spans in both directions, then, because the occupied dual lattice bonds can cross only unoccupied bonds of the original lattice, the dual lattice clusters are blocked from spanning. An example is shown in Figure 12.7.
This argument does not apply to cubic lattices in three dimensions, but it can be used for site percolation on a triangular lattice to yield pc = 1/2.

Figure 12.7: Occupied bonds on a bond percolation lattice are shown by heavy dark lines. The dual lattice consists of the open circles, and the dashed lines are the occupied bonds of the dual lattice. The original lattice contains a cluster that spans both vertically and horizontally, which prevents the dual lattice from having a spanning cluster.

∗Problem 12.3. Bond percolation on a square lattice

Suppose that all the lattice sites of a square lattice are occupied by monomers, each with functionality four; that is, each monomer can form a maximum of four bonds. This model is equivalent to bond percolation on a square lattice. Assume that the presence or absence of a bond between a given pair of monomers is random and is characterized by the probability p. For small p, the system consists only of finite polymers (groups of monomers), and the system is in the sol phase. At a threshold value pc, a single polymer spans the lattice, and we say that for p ≥ pc the system is in the gel phase. How does a bowl of jello, an example of a gel, differ from a bowl of broth? Write a program to simulate bond percolation on a square lattice and determine the bond percolation threshold. Are your results consistent with the exact result pc = 1/2?

We can also consider continuum percolation models. For example, we can place disks at random into a two-dimensional box. Two disks are in the same cluster if they touch or overlap. A typical continuum (off-lattice) percolation configuration is depicted in Figure 12.8. One quantity of interest is φ, the fraction of the area (volume in three dimensions) of the system that is covered by disks. In the limit of an infinite box, it can be shown that

\phi = 1 - e^{-\rho\pi r^2}    (12.1)

where ρ is the number of disks per unit area and r is the radius of a disk (see Xia and Thorpe). Equation (12.1) is not accurate for small boxes, because disks located near the edge of the box then constitute a significant fraction of the total number of disks.

Figure 12.5: Example of a spanning site cluster on an L = 4 triangular lattice. The filled circles represent the occupied sites.

Figure 12.6: Two examples of bond clusters. The occupied bonds are shown as bold lines.

Problem 12.4. Continuum percolation

(a) Suppose that disks of unit diameter are placed at random on the sites of a square lattice with unit lattice spacing. Define φ as the area fraction covered by the disks. Convince yourself that φc = πpc/4.

(b) Modify PercolationApp to simulate continuum percolation. Instead of placing the disks on regular lattice sites, place their centers at random in a square box of area L². The relevant parameter is now the density ρ, the number of disks per unit area, instead of the probability p. We can no longer use the LatticeFrame class; instead, two arrays are needed to store the x and y locations of the disks. When the mouse is clicked on a disk, your program will need to determine which disk is at the location of the mouse and then check all the other disks to see if they overlap or touch the chosen disk. This check is continued recursively for all overlapping disks. It also is useful to have an array that keeps track of the clusterNumber of each disk. Only disks that have not yet been assigned a cluster number need to be checked for overlaps.

(c) Estimate the value of the density ρc at which a spanning cluster first appears. Given this value of ρc, use a Monte Carlo method to estimate the corresponding area fraction φc (see Section 11.2 and the sketch following this problem). Choose points at random in the box and compute the fraction of points that lie within any disk. Explain why φc is larger for continuum percolation than for site percolation. Compare your direct Monte Carlo estimate of φc with the indirect value of φc obtained from (12.1) using the value of ρc. Explain any discrepancy.

(d)∗ Consider the simple model of the cookie problem discussed in Section 12.1. Write a program that places disks at random into a square box and chooses their diameters randomly between 0 and 1. Estimate the value of ρc at which a spanning cluster first appears and compare it to your estimate found in part (c). Is your value of φc larger or smaller than the one found in part (c)?

(e)∗ Another variation of the cookie problem is to place disks of unit diameter at random in a box with the constraint that the disks do not overlap. Continue to add disks until the fraction of successful attempts becomes less than 1%; that is, until one hundred successive attempts at adding a disk fail. Does a spanning cluster exist? If not, increase the diameters of all the disks at a constant rate (in analogy to the baking of the cookies) until a spanning cluster is attained. How does φc for this model compare with the value of φc found in part (d)?
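The Monte Carlo estimate of the area fraction needed in Problem 12.4(c) can be done with a short method. The following sketch uses our own names (x[] and y[] for the disk centers, estimateAreaFraction for the method) and assumes disks of unit diameter:

// Estimate the covered area fraction phi by hit-or-miss sampling:
// choose random points in the box and count those inside any disk.
double estimateAreaFraction(double[] x, double[] y, double boxLength, int trials) {
  java.util.Random random = new java.util.Random();
  int hits = 0;
  for(int t = 0; t<trials; t++) {
    double px = boxLength*random.nextDouble();
    double py = boxLength*random.nextDouble();
    for(int i = 0; i<x.length; i++) {
      double dx = px-x[i], dy = py-y[i];
      if(dx*dx+dy*dy<0.25) { // inside a disk of radius 1/2
        hits++;
        break;               // count each point at most once
      }
    }
  }
  return hits/(double) trials;
}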
A continuum model that is applicable to random porous media is known as the Swiss cheese model. In this model the relevant quantity (the cheese) is the space between the disks. For the Swiss cheese model in two dimensions, the cheese area fraction at the percolation threshold, ψc, is given by ψc = 1 − φc, where φc is the disk area fraction at the percolation threshold of the disks. Does such a relation hold in three dimensions (see Project 12.14)?

Figure 12.8: A model of continuum (off-lattice) percolation realized by placing disks of unit diameter at random into a square box of linear dimension L. If we concentrate on the voids between the disks rather than on the disks themselves, this model of continuum percolation is known as the Swiss cheese model.

So far, we have emphasized the existence of the percolation threshold pc and the appearance of a spanning cluster or path for p ≥ pc. Another quantity that characterizes percolation is P∞(p), the probability that an occupied site belongs to the spanning cluster:

P_\infty(p) = \frac{\text{number of sites in the spanning cluster}}{\text{total number of occupied sites}}.    (12.2)

As an example, P∞(p = 0.59) = 140/154 for the single configuration shown in Figure 12.3b. An accurate estimate of P∞ involves an average over many configurations for a given value of p. For an infinite lattice, P∞(p) = 0 for p < pc and P∞(p) = 1 for p = 1. Between pc and 1, P∞(p) increases monotonically.

More information can be obtained from the cluster size distribution ns(p), defined as

n_s(p) = \frac{\text{average number of clusters of size } s}{\text{total number of lattice sites}}.    (12.3)

For p ≥ pc, the spanning cluster is excluded from ns. (For historical reasons, the size of a cluster refers to the number of sites in the cluster rather than to its spatial extent.) As an example, we see from Figure 12.3a that ns(1) = 20/256, ns(2) = 4/256, ns(3) = 5/256, and ns(7) = 1/256 for p = 0.2, and ns is zero otherwise.
Because N Σs s ns is the total number of occupied sites (N is the total number of lattice sites) and N s ns is the number of occupied sites in clusters of size s, the quantity

w_s = \frac{s\,n_s}{\sum_s s\,n_s}    (12.4)

is the probability that an occupied site chosen at random is part of an s-site cluster. The mean cluster size S is defined as

S(p) = \sum_s s\,w_s = \frac{\sum_s s^2 n_s}{\sum_s s\,n_s}.    (12.5)

The sums in (12.5) are over the finite clusters only. As an example, the weights corresponding to the clusters in Figure 12.3a are ws(1) = 20/50, ws(2) = 8/50, ws(3) = 15/50, and ws(7) = 7/50, and hence S = 130/50.

Problem 12.5. Qualitative behavior of ns(p), S(p), and P∞(p)

(a) Use PercolationApp to visually determine the cluster size distribution ns(p) for a square lattice with L = 16 and p = 0.4, p = pc, and p = 0.8. Take pc = 0.5927. Consider at least ten configurations for each value of p and average ns(p) over the configurations. For each value of p, plot ns as a function of s and describe the observed s-dependence. Does ns decrease more rapidly with increasing s for p = pc or for p ≠ pc? Plot ln ns versus s and versus ln s. Does either of these plots suggest the form of the s-dependence of ns? Is there a qualitative change near pc? You probably will not be able to obtain definitive answers to these questions at this point, but we will discuss a more quantitative approach later. Better results for ns can be found if periodic boundary conditions are used (see Project 12.17).

(b) Use the same configurations considered in part (a) to compute the mean cluster size S as a function of p. Remember that for p > pc, the spanning cluster is excluded.

(c) Similarly, compute P∞(p) for various values of p ≥ pc, plot P∞(p) as a function of p, and discuss its qualitative behavior.

(d) Verify that Σs s ns(p) = p for p < pc and explain this relation. How is this relation modified for p ≥ pc?

It is useful to associate a characteristic linear dimension or connectedness length ξ(p) with the clusters. One way to do so is to define the radius of gyration Rs of a single cluster of s particles as

R_s^2 = \frac{1}{s}\sum_{i=1}^{s}(\mathbf r_i - \overline{\mathbf r}\,)^2    (12.6)

where

\overline{\mathbf r} = \frac{1}{s}\sum_{i=1}^{s}\mathbf r_i    (12.7)

and ri is the position of the ith site in the cluster. The quantity r̄ is the familiar center of mass of the cluster. From (12.6), we see that Rs is the root mean square radius of the cluster measured from its center of mass.

The connectedness length ξ can be defined as an average over the radii of gyration of all the finite clusters. To find the appropriate average for ξ, consider a site in a cluster of s sites. The site is connected to s − 1 other sites, and the mean square distance to these sites is of the order of Rs². The probability that a site belongs to a cluster of size s is ws = s ns. These considerations suggest that a reasonable definition of ξ is

\xi^2 = \frac{\sum_s (s-1)\,w_s\,\overline{R_s^2}}{\sum_s (s-1)\,w_s}    (12.8)

where R̄s² is the average of Rs² over all clusters of s sites. To simplify the expression for ξ, we write s instead of s − 1 and let ws = s ns:

\xi^2 = \frac{\sum_s s^2\,n_s\,\overline{R_s^2}}{\sum_s s^2\,n_s}.    (12.9)

As before, the sums in (12.9) are over nonspanning clusters only.

Problem 12.6. Simple calculation of the connectedness length

To obtain a feel for how to compute the connectedness length ξ, calculate it for the configuration shown in Figure 12.3a for p = 0.2.
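The ingredients of (12.6)–(12.9) are easy to compute once the sites of each cluster are known. A minimal sketch for the radius of gyration of one cluster follows; the arrays xs[] and ys[], holding the coordinates of the cluster's sites, are our own assumption about how the cluster data are stored:

// Radius of gyration squared, (12.6), for a single cluster whose site
// coordinates are stored in xs[] and ys[].
double radiusOfGyrationSquared(double[] xs, double[] ys) {
  int s = xs.length;
  double xbar = 0, ybar = 0;
  for(int i = 0; i<s; i++) { // center of mass, (12.7)
    xbar += xs[i]/s;
    ybar += ys[i]/s;
  }
  double r2 = 0;
  for(int i = 0; i<s; i++) { // mean square distance from the center of mass
    r2 += ((xs[i]-xbar)*(xs[i]-xbar)+(ys[i]-ybar)*(ys[i]-ybar))/s;
  }
  return r2;
}

The connectedness length then follows from (12.9) by accumulating s² ns R̄s² and s² ns over the nonspanning clusters and taking the ratio of the two sums.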
12.3 Finding Clusters

So far we have visually determined the clusters for a given configuration at a particular value of p. We now discuss an algorithm due to Newman and Ziff for finding clusters at many values of p. This algorithm is based on one that is well known in computer science in the context of the union-find problem. In the Newman–Ziff algorithm we begin with an empty lattice and keep track of the clusters as we randomly occupy new lattice sites. As each site is occupied, we determine whether it becomes a new cluster or whether it is a neighbor of an existing cluster (or clusters). Because p = n/L², where n is the number of occupied sites, p increases by 1/L² each time we occupy a new site. The algorithm can be summarized as follows (see the class Clusters in Listing 12.2).

1. Precompute the occupation order. One way to occupy the sites is to choose sites at random until we find an unoccupied site. However, this procedure becomes inefficient when most of the sites are already occupied. Instead, we store the order in which the sites are to be occupied in order[] and generate this order by randomly permuting the integers from 0 to N − 1 in the method setOccupationOrder. For example, order[0] = 2 means that we occupy site 2 first.

2. Add sites in the predetermined order. When a new site is added, we check all its neighbors to determine whether the new site is an isolated cluster (all neighbors empty) or whether it joins one or more existing clusters.

3. Determine the clusters. The clusters are organized in a tree-like structure, with one site of each cluster designated as the root. All sites in a given cluster, other than the root, point to another site in the same cluster, so that the root can be reached by recursively following the pointers. The "pointers"¹ are stored in the parent array. To join two clusters, we add a pointer from the root of the smaller cluster to the root of the larger one.

In the following example we use order = {2, 6, 8, 4, 5, ...} to illustrate the method.

(i) Because order[0] = 2, we first occupy site 2 and set parent[2] = −1. The negative sign distinguishes site 2 as a root. The size of the cluster is stored as −parent[root]. In this case −parent[2] = 1, and because no other sites are occupied, there is no possibility of merging clusters.

(ii) For our example, order[1] = 6, and we initially set parent[6] = −1. We then consider the neighbors of site 6. Sites 5 (left), 7 (right), and 10 (up) are unoccupied (parent[5] = parent[7] = parent[10] = EMPTY), but site 2 (down) is occupied. Hence, we need to merge the two clusters, and we set parent[6] = 2 and parent[2] = −2. That is, the value of parent[6] points to site 2, and the value of parent[2] indicates that site 2 is the root of a cluster of size 2.

(iii) The next sites to be occupied are 8 and 4, as shown in Figure 12.9. These two sites form a cluster of size 2 as before. We have parent[8] = −1 and then parent[4] = 8 and parent[8] = −2. We see that the value of each element of the parent array has three functions: a nonroot occupied site contains the index of the site's parent in the cluster tree; a root site is negative and equals the negative of the number of sites in the cluster; and an unoccupied site has the value EMPTY.

(iv) We next add site 5 and set parent[5] = −1. From Figure 12.9 we see that we have to merge two clusters. We (arbitrarily) check the left neighbor of site 5 first, and hence we first merge the cluster of size 1 at site 5 with the cluster of size 2 rooted at site 8; we set parent[5] = 8 and parent[8] = −3. We next check the right neighbor of site 5 and find that we need to merge two clusters again, with root sites at 8 and 2.
Because the cluster at site 8 is bigger, we set parent[2] = 8 and parent[8] = −5.

¹We use the term "pointer" as it is used by Newman and Ziff, that is, a link to an array index. A true pointer stores a memory address and does not exist in Java.

Figure 12.9: Illustration of the Newman–Ziff algorithm. The order array is given by {2, 6, 8, 4, 5, ...}; the number below a site denotes the order in which that site was occupied. When site 5 is occupied, we have to merge two clusters as explained in the text.

Listing 12.2: Implementation of the Newman–Ziff algorithm for identifying clusters.

package org.opensourcephysics.sip.ch12;

public class Clusters {
  static private final int EMPTY = Integer.MIN_VALUE;
  public int L;                // linear dimension of lattice
  public int N;                // N = L*L
  public int numSitesOccupied; // number of occupied lattice sites
  public int[] numClusters;    // number of clusters of size s, n_s
  // secondClusterMoment stores sum{s^2 n_s}, where the sum is over
  // all clusters (not counting the spanning cluster).
  // The first cluster moment, sum{s n_s}, equals numSitesOccupied.
  // The mean cluster size S is defined as
  // S = secondClusterMoment/numSitesOccupied
  private int secondClusterMoment;
  // spanningClusterSize is the number of sites in a spanning cluster,
  // or 0 if it doesn't exist. Assume at most one spanning cluster.
  private int spanningClusterSize;
  // order[n] gives the index of the nth occupied site; it contains all the
  // site indices, but in random order. For example, order[0] = 3 means
  // we will occupy site 3 first. An alternative to using the order array
  // is to choose sites at random until we find an unoccupied site.
  private int[] order;
  // The parent[] array serves three purposes: it stores the cluster size
  // when the site is a root. Otherwise, it stores the index of
  // the site's parent or is EMPTY. The root is found from an
  // occupied site by recursively following the parent array.
  // The recursion terminates when we encounter a negative value in the
  // parent array, which indicates that we have found the unique cluster root.
  // if (parent[s] >= 0)        parent[s] is the parent site index
  // if (0 > parent[s] > EMPTY) s is a root of size -parent[s]
  // if (parent[s] == EMPTY)    site s is empty (unoccupied)
  private int[] parent;
  // A spanning cluster touches both the left and right boundaries of the
  // lattice. As clusters are merged, we maintain this information at the
  // roots in the following arrays. For example, if the root of a
  // cluster is at site 7 and this cluster touches the left side,
  // then touchesLeft[7] == true.
  private boolean[] touchesLeft, touchesRght;
  public Clusters(int L) {
    this.L = L;
    N = L*L;
    numClusters = new int[N+1];
    order = new int[N];
    parent = new int[N];
    touchesLeft = new boolean[N];
    touchesRght = new boolean[N];
  }

  public void newLattice() {
    setOccupationOrder(); // choose order in which sites are occupied
    // initially all sites are empty, and there are no clusters
    numSitesOccupied = secondClusterMoment = spanningClusterSize = 0;

    if(correctedFirstMoment>0) {
      return correctedSecondMoment/correctedFirstMoment;
    } else {
      return 0;
    }
  }

  // given a site index s, returns the site index representing the root
  // of the cluster to which s belongs
  private int findRoot(int s) {
    if(parent[s]<0) {
      return s; // root site (with size -parent[s])
    } else {
      // first link parent[s] to the cluster's root to improve performance
      // (path compression); then return this value
      parent[s] = findRoot(parent[s]);
    }
    return parent[s];
  }

  // returns the jth neighbor of site s; j can be 0 (left), 1 (right),
  // 2 (down), or 3 (above). If no neighbor exists because of the
  // boundary, the value EMPTY is returned. Change this method for
  // periodic boundary conditions.
  private int getNeighbor(int s, int j) {
    switch(j) {
    case 0 :
      return (s%L==0) ? EMPTY : s-1;   // left
    case 1 :
      return (s%L==L-1) ? EMPTY : s+1; // right
    case 2 :
      return (s/L==0) ? EMPTY : s-L;   // down
    case 3 :
      return (s/L==L-1) ? EMPTY : s+L; // above
    default :
      return EMPTY;
    }
  }

  // fills the order[] array with a random permutation of the site indices.
  // First order[] is set to the identity permutation. Then for
  // values of i in {1 ... N-1}, the values of order[i] and order[r]
  // are swapped, where r is a random index in {i+1 ... N}
  private void setOccupationOrder() {

    // update the cluster count and second cluster moment to account
    // for the loss of two smaller clusters and the gain of one bigger cluster
    numClusters[-parent[r1]]--;
    numClusters[-parent[r2]]--;
    numClusters[-parent[r1]-parent[r2]]++;
    secondClusterMoment += sqr(parent[r1]+parent[r2])-sqr(parent[r1])-sqr(parent[r2]);
    // cluster at r1 now includes the sites of the old cluster at r2
    parent[r1] += parent[r2];
    // make r1 the new parent of r2
    parent[r2] = r1;
    // if r2 touched the left or right boundary, then so does the merged cluster r1
    touchesLeft[r1] |= touchesLeft[r2];
    touchesRght[r1] |= touchesRght[r2];
    // if the cluster at r1 spans the lattice, remember its size
    if(touchesLeft[r1]&&touchesRght[r1]) {
      spanningClusterSize = -parent[r1];
    }
    return r1; // return the new root site r1
  }
}

Listing 12.3: The ClustersApp program.

package org.opensourcephysics.sip.ch12;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class ClustersApp extends AbstractSimulation {
  Scalar2DFrame grid = new Scalar2DFrame("Newman-Ziff cluster algorithm");
  PlotFrame plot1 = new PlotFrame("p", "Mean Cluster Size", "Mean cluster size");
  PlotFrame plot2 = new PlotFrame("p", "P_∞", "P_∞");
  PlotFrame plot3 = new PlotFrame("p", "P_span", "P_span");
  PlotFrame plot4 = new PlotFrame("s", "", "Cluster size distribution");
  Clusters lattice;
  double pDisplay;
  double[] meanClusterSize;
  double[] P_infinity;
  double[] P_span;           // probability of a spanning cluster
  double[] numClustersAccum; // number of clusters of size s
  int numberOfTrials;

  public void initialize() {
    int L = control.getInt("Lattice size L");
    grid.resizeGrid(L, L);
    lattice = new Clusters(L);
    pDisplay = control.getDouble("Display lattice at this value of p");
    grid.setMessage("p = "+pDisplay);
    plot4.setMessage("p = "+pDisplay);
    plot4.setLogScale(true, true);
    meanClusterSize = new double[L*L];
    P_infinity = new double[L*L];
    P_span = new double[L*L];
    numClustersAccum = new double[L*L+1];
    numberOfTrials = 0;
  }

  public void doStep() {
    control.clearMessages();
    control.println("Trial "+numberOfTrials);
    // add sites to a new lattice and accumulate the results
    lattice.newLattice();

      if(numClustersAccum[i+1]>0) {
        plot4.append(0, i+1, numClustersAccum[i+1]/numberOfTrials);
      }
    }
  }

  private void displayLattice() {
    double[] display = new double[lattice.N];

For T > Tc, the system is a paramagnet. In Chapter 15 we will use Monte Carlo methods to investigate the behavior of a magnetic system near the magnetic critical point.

Figure 12.10: The qualitative p-dependence of the connectedness length ξ(p) for a square lattice with L = 128. The results were averaged over approximately 2000–6000 configurations for each value of p. Note that ξ is finite for a finite lattice.

In the following, we will find that the properties of the geometrical phase transition in percolation are qualitatively similar to the properties of the critical point in thermodynamic transitions. We will see that in the vicinity of a critical point, the qualitative behavior of the system is governed by the occurrence of long-range correlations.

We have found that the essential physics near the percolation threshold is associated with the existence of large clusters. For example, for p ≠ pc, we found in Problem 12.7 that ns decays rapidly with s. However, for p = pc the s-dependence of ns is qualitatively different, and ns decreases much more slowly. This different behavior of ns at p = pc is due to the presence of clusters of all length scales, for example, the "infinite" spanning cluster and finite clusters of all sizes.

In Figure 12.10 we show the mean connectedness length ξ(p) for a lattice with L = 128. We see that ξ is finite, an increasing function of p for p < pc, and a decreasing function of p for p > pc. Moreover, we know that ξ(p = pc) is approximately equal to L and hence diverges as L → ∞. These qualitative considerations lead us to conjecture that in the limit L → ∞, ξ(p) grows rapidly in the critical region |p − pc| ≪ 1. We can describe the quantitative behavior of ξ(p) for p near pc by introducing the critical exponent ν defined by the relation

\xi(p) \sim |p - p_c|^{-\nu}.    (12.10)

Of course, there is no a priori reason why the divergence of ξ(p) should be characterized by a simple power law. Note that the exponent ν is assumed to be the same above and below pc. How do the other quantities that we have considered behave in the critical region in the limit L → ∞?
According to the definition (12.2) of P∞, P∞ = 0 for p < pc, and P∞ is an increasing function of p for p > pc. We conjecture that in the critical region, the increase of P∞ with increasing p is characterized by the exponent β defined by the relation

    P∞(p) ∼ (p − pc)^β.    (12.11)

Note that P∞ is assumed to approach zero continuously as p approaches pc from above; that is, the percolation transition is an example of a continuous phase transition. In the language of critical phenomena, P∞ is an example of an order parameter: it is nonzero in the ordered phase, p > pc, and zero in the disordered phase, p < pc. We will see that at p = pc, the spanning cluster is fractal and approaches zero density as the size of the system becomes larger.

    Quantity                        Functional form           Exponent   d = 2     d = 3
    Percolation
      order parameter               P∞ ∼ (p − pc)^β           β          5/36      0.41
      mean size of finite clusters  S(p) ∼ |p − pc|^{−γ}      γ          43/18     1.80
      connectedness length          ξ(p) ∼ |p − pc|^{−ν}      ν          4/3       0.88
      cluster numbers               ns ∼ s^{−τ} (p = pc)      τ          187/91    2.19
    Ising model
      order parameter               M(T) ∼ (Tc − T)^β         β          1/8       0.32
      susceptibility                χ(T) ∼ |T − Tc|^{−γ}      γ          7/4       1.24
      correlation length            ξ(T) ∼ |T − Tc|^{−ν}      ν          1         0.63

Table 12.1: Several of the critical exponents for the percolation and magnetism phase transitions in d = 2 and d = 3 dimensions. Ratios of integers correspond to known exact results. The critical exponents for the Ising model are discussed in Chapter 15.

The mean number of sites in the finite clusters S(p) also diverges in the critical region. Its critical behavior is written as

    S(p) ∼ |p − pc|^{−γ},    (12.12)

which defines the critical exponent γ. The common critical exponents for percolation are summarized in Table 12.1, together with the analogous critical exponents of a magnetic critical point.

Because we can simulate only finite lattices, a direct fit of the measured quantities ξ, P∞, and S(p) to their assumed critical behavior for an infinite lattice would not yield good estimates for the corresponding exponents ν, β, and γ (see Problem 12.8). The problem is that if p is close to pc, the connectedness length of the largest cluster becomes comparable to L, and the nature of the clusters is affected by the finite size of the system. In contrast, for p far from pc, ξ(p) is small in comparison to L, and the measured values of ξ, and hence the values of other physical quantities, are not appreciably affected by the finite size of the lattice. Hence, for p ≪ pc and p ≫ pc, the properties of the system are indistinguishable from the corresponding properties of a truly macroscopic system (L → ∞). However, if p is close to pc, ξ(p) is comparable to L, and the nature of the system differs from that of an infinite system. In particular, a finite lattice cannot exhibit a true phase transition characterized by divergent physical quantities. Instead, ξ reaches a finite maximum at p = pc(L).

The effects of the finite system size can be made more quantitative by the following argument. Consider, for example, the critical behavior (12.11) of P∞. If ξ ≫ 1 but is much less than L, the power law behavior given by (12.11) is expected to hold. However, if ξ is comparable to L, ξ cannot change appreciably, and (12.11) is no longer applicable. This qualitative change in the behavior of P∞ and other physical quantities occurs for

    ξ(p) ∼ L ∼ |p − pc|^{−ν}.    (12.13)

We invert (12.13) and write

    |p − pc| ∼ L^{−1/ν}.    (12.14)

The difference |p − pc| in (12.14) is the "distance" from the percolation threshold at which finite size effects occur.
Hence, if ξ and L are approximately the same size, we can replace (12.11) by the relation

    P∞(p = pc) ∼ L^{−β/ν}  (L → ∞).    (12.15)

The relation (12.15) between P∞ and L at p = pc is consistent with the fact that a phase transition is defined only for infinite systems. One implication of (12.15) is that we can use it to determine the ratio β/ν. This method of analysis is known as finite size scaling. Suppose that we generate percolation configurations at p = pc for different values of L and analyze P∞ as a function of L. If our values of L are sufficiently large, we can use the asymptotic relation (12.15) to estimate the ratio β/ν. A similar analysis can be used for S(p) and other quantities of interest. We use this method in Problem 12.8.

Problem 12.8. Finite size scaling analysis of critical exponents

(a) Compute P∞ at p = pc for at least 100 configurations for L = 10, 20, 40, and 80. Include in your average only those configurations that have a spanning cluster. Best results are obtained using the value of pc for the infinite square lattice, pc ≈ 0.5927. Plot ln P∞ versus ln L and estimate the ratio β/ν.

(b) Use finite size scaling to determine the dependence of the mean cluster size S on L at p = pc. Average S over the same configurations considered in part (a). Remember that S is the mean number of sites in the nonspanning clusters.

(c) Find the size (number of particles) M of the spanning cluster at p = pc as a function of L. Use the same configurations as in part (a). Determine an exponent from the slope of a plot of ln M versus ln L. This exponent is called the fractal dimension and is discussed in Chapter 13.

Finite size scaling is particularly useful at the percolation threshold in comparison to thermal critical points where, as we will learn in Chapter 15, critical slowing down occurs. Critical slowing down makes it very time consuming to sample statistically independent configurations. No such slowing down occurs at the percolation threshold, because we can easily create new configurations at any value of p by simply using a new set of random numbers.

We found in Section 12.2 that the numerical value of the percolation threshold pc depends on the symmetry and dimension of the lattice; for example, pc ≈ 0.5927 for the square lattice and pc = 1/2 for the triangular lattice. A remarkable feature of the power law dependencies summarized in Table 12.1 is that the values of the critical exponents do not depend on the symmetry of the lattice and are independent of the existence of the lattice itself; for example, they are identical for site percolation, bond percolation, and continuum percolation. Moreover, it is not necessary to distinguish between the exponents for site and bond percolation. In the vocabulary of critical phenomena, we say that site, bond, and continuum percolation all belong to the same universality class and that their critical exponents are identical for the same spatial dimension.

Another important idea in critical phenomena is the existence of relations between the critical exponents. An example of such a scaling law is

    2β + γ = νd,    (12.16)

where d is the spatial dimension of the lattice. As a check, the exact d = 2 exponents in Table 12.1 give 2β + γ = 10/36 + 86/36 = 8/3, which indeed equals νd = (4/3) × 2. The scaling law (12.16) indicates that the universality class depends on the spatial dimension. A more detailed discussion of finite size scaling and the scaling laws can be found in Chapter 15 and in the references.
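The slope estimate in Problem 12.8a is a simple least squares fit. The following sketch (ours, not from the text) shows the computation; the pInfinity values are hypothetical placeholders for the measured averages at each L:

    // Sketch of the finite size scaling fit in Problem 12.8a: the slope of
    // ln P_infinity versus ln L estimates -beta/nu.
    public class FiniteSizeScaling {
        public static void main(String[] args) {
            int[] L = {10, 20, 40, 80};
            double[] pInfinity = {0.66, 0.62, 0.58, 0.54}; // hypothetical data
            int n = L.length;
            double sx = 0, sy = 0, sxx = 0, sxy = 0;
            for(int i = 0; i < n; i++) {
                double x = Math.log(L[i]);
                double y = Math.log(pInfinity[i]);
                sx += x; sy += y; sxx += x*x; sxy += x*y;
            }
            double slope = (n*sxy - sx*sy)/(n*sxx - sx*sx); // least squares slope
            System.out.println("beta/nu estimate = " + (-slope));
        }
    }

With real data the magnitude of the slope should approach the exact ratio β/ν = (5/36)/(4/3) ≈ 0.104 as L increases.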
12.5 The Renormalization Group

In Section 12.4 we studied the properties of various quantities on different length scales to determine the values of the critical exponents. The idea of examining physical quantities near the critical point on different length scales can be extended beyond finite size scaling and is the basis of the renormalization group method, one of the more important methods developed in theoretical physics in the past several decades.² Although the method originated in the theory of elementary particles and was first applied to thermodynamic critical points, it is simpler to understand the method in the context of the percolation transition. We will find that the renormalization group method yields the critical exponents directly and, in combination with Monte Carlo methods, is more powerful than Monte Carlo methods alone.

The basic idea of the renormalization group method is the following. Imagine a percolation configuration generated at p = p0. What would happen if we averaged the configuration over groups of sites to obtain a configuration of occupied and empty cells? For example, the cells could be groups of four sites such that each cell is occupied or empty according to a mapping rule from the sites to the cell. If the original group of 2 × 2 sites spans, the cell would be occupied; otherwise, the cell would be empty. What value of p = p1 would describe the new configuration of cells? If p0 < pc, we would expect p1 < p0. To understand why, consider a value of p0 near p = 0, where almost all the clusters are of size one. Clearly, the occupied sites would be mapped into empty cells, and there would be a lower percentage of occupied cells than before. For p0 > pc we would find p1 > p0, because the rare isolated unoccupied sites would be grouped into occupied cells. At p = pc we might expect that this blocking procedure would lead to configurations that look as if they were generated at the same value of p, because of the existence of clusters of all length scales.

Given the new configuration of cells at probability p1, we can group the cells according to the same mapping rule, leading to a new p = p2. The sequence p0, p1, p2, ... is called a renormalization group flow. We expect that for p0 < pc the flow will move to the trivial fixed point p = 0, and for p0 > pc the flow will move to the other trivial fixed point p = 1. At p = pc there is a nontrivial fixed point. We will see that by analyzing the renormalization group flow, we can determine the location of the critical point and the critical exponent ν.

We now consider a way of using a computer to change the configurations in a way that is similar to the procedure we have just described. Consider a square lattice that is partitioned into cells or blocks that cover the lattice (see Figure 12.11). Note that we have defined the cells so that the new lattice of cells has the same symmetry as the original lattice. However, the replacement of sites by cells changes the length scale: all distances are now smaller by a factor of b, where b is the linear dimension of the cell. Hence, the effect of a "renormalization" is to replace each group of sites by a single renormalized site and to rescale the connectedness length for the renormalized lattice by a factor of b. Because we want to preserve the main features of the original lattice and hence its connectedness (and its symmetry), we assume that a renormalized site is occupied if the original group of sites spans the cell.
For simplicity, we adopt the vertical spanning criterion. The effect of performing a renormalization transformation on typical percolation configurations for p above and below pc is illustrated in Figures 12.12 and 12.13, respectively. In both cases the effect of the successive transformations is to move the system away from pc. We see that for p = 0.7, the effect of the transformations is to drive the system toward p = 1. For p = 0.5, the trend is to drive the system toward p = 0. Because we began with a finite lattice, we cannot continue the renormalization transformation indefinitely.

²Kenneth Wilson was honored with the Nobel Prize in Physics in 1982 for his contributions to the development of the renormalization group method.

Figure 12.11: An example of a b = 4 cell used on the square lattice. The cell contains b² sites, which are rescaled to a single supersite or cell after a renormalization group transformation.

Figure 12.12: A percolation configuration generated at p = 0.7 on an L = 16 lattice. The original configuration has been renormalized three times (L′ = 8, 4, and 2) by transforming cells of four sites into one new supersite. What would be the effect of an additional transformation?

The class RGApp implements a visual interpretation of the renormalization group. This class creates four windows, with the original lattice in the first window and three renormalized lattices in the other three windows.

Listing 12.4: The visual renormalization group.

package org.opensourcephysics.sip.ch12;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
import java.awt.Color;

public class RGApp extends AbstractCalculation {
    LatticeFrame originalLattice = new LatticeFrame("Original Lattice");
    LatticeFrame block1 = new LatticeFrame("First Blocked Lattice");
    LatticeFrame block2 = new LatticeFrame("Second Blocked Lattice");
    LatticeFrame block3 = new LatticeFrame("Third Blocked Lattice");

    public RGApp() {
        setLatticeColors(originalLattice);
        setLatticeColors(block1);
        setLatticeColors(block2);
        setLatticeColors(block3);
    }

    public void calculate() {
        int L = control.getInt("L");
        double p = control.getDouble("p");
        newLattice(L, p, originalLattice);
        block(originalLattice, block1, L/2); // block original lattice
        block(block1, block2, L/4);          // next blocking
        block(block2, block3, L/8);          // final blocking
        originalLattice.setVisible(true);
        block1.setVisible(true);
        block2.setVisible(true);
        block3.setVisible(true);
    }

    public void reset() {
        control.setValue("L", 64);
        control.setValue("p", 0.6);
    }

    // occupies each site of a new L by L lattice with probability p
    public void newLattice(int L, double p, LatticeFrame lattice) {
        lattice.resizeLattice(L, L);
        for(int i = 0; i < L; i++) {
            // ...
        }
    }
    // ... (the block method and the remainder of the class)
}

An exact enumeration of the spanning cell configurations becomes impractical for b > 7, and we must use Monte Carlo methods if we wish to proceed further. Two Monte Carlo approaches are discussed in Project 12.13. The combination of Monte Carlo and renormalization group methods provides a powerful tool for obtaining information on phase transitions and other properties of materials.

As summarized in Table 12.1, the various critical exponents for percolation in two dimensions are known exactly. For example, the exponent ν, corresponding to the divergence of the connectedness length, is ν = 4/3. It is interesting that the theory for this result is based on algebraic reasoning (too abstract to be summarized here), even though percolation is a geometrical phenomenon. The most accurate estimate of pc for the square lattice is pc = 0.59274621(13). We note that although there has been much work on percolation, only numerical estimates for pc are known for most lattices.
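Before turning to the projects, it may help to see the fixed point analysis worked out for the smallest cell. For a b = 2 cell with the vertical spanning criterion, enumerating the spanning configurations gives R(p) = p⁴ + 4p³q + 2p²q² = 2p² − p⁴ (compare the L = 2 values of S(n) quoted in Project 12.13). The sketch below (ours, not the text's) solves the fixed point condition R(p*) = p* by Newton's method and evaluates ν = ln b / ln λ, where λ = dR/dp at p*:

    // Sketch: cell-to-site renormalization group estimate for a b = 2 cell
    // with vertical spanning, R(p) = 2p^2 - p^4 (our enumeration).
    public class RGFixedPoint {
        static double R(double p)  { return 2*p*p - p*p*p*p; }
        static double dR(double p) { return 4*p - 4*p*p*p; }

        public static void main(String[] args) {
            double p = 0.6; // initial guess near the nontrivial fixed point
            for(int i = 0; i < 20; i++) {       // Newton's method for R(p) - p = 0
                p -= (R(p)-p)/(dR(p)-1.0);
            }
            double lambda = dR(p);              // lambda = dR/dp at p*
            double nu = Math.log(2)/Math.log(lambda); // nu = ln b / ln lambda, b = 2
            System.out.println("p* = "+p+"  nu = "+nu); // p* ~ 0.618, nu ~ 1.63
        }
    }

The exact fixed point is p* = (√5 − 1)/2 ≈ 0.618, to be compared with pc ≈ 0.5927, and the estimate ν ≈ 1.63 should be compared with the exact ν = 4/3; cell-to-cell transformations such as the one in Project 12.12 reduce this error.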
12.6 Projects

Most of the following projects require larger systems and more computer resources than the problems we have considered so far, but most are not much more difficult conceptually. More ideas for projects can be obtained from the references.

Project 12.12. Cell-to-cell renormalization group method

In Section 12.5 we discussed the cell-to-site renormalization group transformation for a system of cells of linear dimension b. An alternative transformation is to go from cells of linear dimension b1 to cells of linear dimension b2. For this cell-to-cell transformation, the rescaling length b1/b2 can be made close to unity. Many errors in a cell-to-cell renormalization group transformation cancel, resulting in a transformation that is more accurate in the limit in which the change in length scale is infinitesimal. We can use the fact that the connectedness lengths of the two systems are related by ξ(p2) = (b1/b2)^{−1} ξ(p1) to derive the relation

    ν = ln(b1/b2) / ln(λ1/λ2),    (12.28)

where λi = dR(p*, bi)/dp is evaluated at the solution of the fixed point equation R(b2, p*) = R(b1, p*). Note that (12.28) reduces to (12.26) for b2 = 1. Use the results you found in Problem 12.10d for one of the spanning criteria to estimate ν from a b1 = 3 to b2 = 2 transformation.

Project 12.13. Estimates for two-dimensional percolation

One way to estimate RL(p), the total probability of all the spanning clusters on a lattice of linear dimension L, can be understood by writing RL(p) in the form

    RL(p) = Σ_{n=1}^{N} S(n) PN(n, p),    (12.29)

where

    PN(n, p) = C(N, n) p^n q^{N−n}    (12.30)

and N = L². The binomial coefficient C(N, n) = N!/[(N − n)! n!] represents the number of possible configurations of n occupied sites and N − n empty sites; PN(n, p) is the probability that n sites out of N are occupied with probability p. The quantity S(n) is the probability that a random configuration of n occupied sites spans the lattice. A comparison of (12.18) and (12.29) shows that for L = 2 and the vertical spanning criterion, S(1) = 0, S(2) = 2/6, S(3) = 1, and S(4) = 1. What are the values of S(n) for L = 3?

We can estimate the probability S(n) by Monte Carlo methods. One way to sample S(n) is to add a particle at random to an unoccupied site and check if a spanning cluster exists. If a spanning cluster does not exist, add another particle at random to a previously unoccupied site. If a spanning cluster exists after s particles are added, then let S(n) = S(n) + 1 for all n ≥ s, and generate a new configuration. After a reasonable number of configurations, the results for S(n) can be normalized. Of course, this procedure can be made more efficient by checking for a spanning cluster only after the total number of particles added is near s ∼ pcN.

(a) Write a Monte Carlo program to sample S(n). Store the locations of the unoccupied sites in a separate array. To check your program, first sample S(n) for L = 2 and L = 3 and compare your results to the exact results for S(n). Then consider larger values of L and determine S(n) for L = 4, 5, 8, 16, and 32.
Because the number of sites in the lattice can become very large, the direct evaluation of the binomial coefficients using factorials is not possible. One way to proceed is to approximate the probability of a configuration of n occupied sites by a Gaussian:

    PN(n, p) = C(N, n) p^n q^{N−n} ≈ (2πNpq)^{−1/2} e^{−(n−pN)²/2Npq}.    (12.31)

(b) As pointed out by Newman and Ziff, the Gaussian approximation for PN(n, p) is not sufficiently accurate for high precision studies. Instead, they used the following method. The binomial distribution is a maximum for given N and p when n = nmax = pN. Set this value to 1 for the moment. Then compute PN(n, p) iteratively for all other n using

    PN(n, p) = PN(n − 1, p) [(N − n + 1)/n] [p/(1 − p)]      (n > nmax),
    PN(n, p) = PN(n + 1, p) [(n + 1)/(N − n)] [(1 − p)/p]    (n < nmax).    (12.32)

Then calculate the normalization coefficient C = Σn PN(n, p) and divide all the PN(n, p) by C to normalize the probability distribution.

(c) Compute ν from the cell-to-cell transformation discussed in Project 12.12 for b1 = 5 and b2 = 4.

(d) The article by Ziff and Newman discusses the convergence of various estimates of the percolation threshold in two dimensions. Some examples of these estimates include:

(i) The cell-to-site renormalization group fixed point:

    RL(p) = p,    (12.33)

where p* is the solution to (12.33).

(ii) The average value of p at which spanning first occurs:

    ⟨p⟩ = ∫₀¹ p [dRL(p)/dp] dp = 1 − ∫₀¹ RL(p) dp,    (12.34)

where we have integrated by parts to obtain the second integral.

(iii) The estimate pmax, which is the value of p at which dRL/dp reaches a maximum:

    d²RL(p)/dp² = 0.    (12.35)

(iv) The cell-to-cell renormalization group fixed point:

    RL(p) = RL−1(p)    (12.36a)

or

    RL(p) = RL/2(p).    (12.36b)

(v) The value of p for which RL(p) = R∞(pc). For a square lattice, R∞(pc) = 1/2.

Verify that the various estimates of the percolation threshold converge to the infinite lattice value pc either as

    pest(L) − pc ≈ cL^{−1/ν}    (12.37a)

or

    pest(L) − pc ≈ cL^{−1−1/ν},    (12.37b)

where the constant c is a fit parameter that depends on the criterion, and ν = 4/3 for percolation in two dimensions. Determine which estimates converge more quickly.
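A sketch of the iteration (12.32) in Java (our version; the class and method names are our own) shows how PN(n, p) can be tabulated without evaluating any factorials:

    // Sketch of the Newman-Ziff iterative evaluation (12.32) of the
    // binomial distribution P_N(n, p).
    public class BinomialTable {
        public static double[] distribution(int N, double p) {
            double[] P = new double[N+1];
            int nmax = (int) (p*N);             // distribution peaks near n = pN
            P[nmax] = 1.0;                      // set the peak value to 1 for the moment
            for(int n = nmax+1; n <= N; n++) {  // iterate upward from the peak
                P[n] = P[n-1]*(N-n+1)/(double) n*p/(1.0-p);
            }
            for(int n = nmax-1; n >= 0; n--) {  // iterate downward from the peak
                P[n] = P[n+1]*(n+1)/(double) (N-n)*(1.0-p)/p;
            }
            double C = 0;                       // normalize so the sum equals one
            for(double v : P) C += v;
            for(int n = 0; n <= N; n++) P[n] /= C;
            return P;
        }

        public static void main(String[] args) {
            double[] P = distribution(10000, 0.5927);
            System.out.println("P at the peak = "+P[5927]);
        }
    }

Because every entry is computed relative to the peak value, the intermediate numbers never overflow, and entries far out in the tails simply underflow harmlessly to zero.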
Project 12.14. Percolation in three dimensions

(a) The value of pc for site percolation on the simple cubic lattice is approximately 0.3112. Do a simulation to verify this value. Compute φc, the volume fraction occupied at pc, if a sphere with a diameter equal to the lattice spacing is placed at each occupied site.

(b) Consider continuum percolation in three dimensions, where spheres of unit diameter are placed at random in a cubical box of linear dimension L. Two spheres that overlap are in the same cluster. The volume fraction occupied by the spheres is given by

    φ = 1 − e^{−4πρr³/3},    (12.38)

where ρ is the number density of the spheres and r is their radius. Write a program to simulate continuum percolation in three dimensions and find the percolation threshold ρc. Use the Monte Carlo procedure discussed in Problem 12.4 to estimate φc and compare its value with the value determined from (12.38). How does φc for continuum percolation compare with the value of φc found for site percolation in part (a)? Which do you expect to be larger and why?

(c) In the Swiss cheese model in three dimensions, we are concerned with the percolation of the space between the spheres. This model is appropriate for porous rock, with the spheres representing solid material and the space between the spheres representing the pores. Because we need to compute the connectivity properties of the space between the spheres, we superimpose on the system a regular grid with lattice spacing equal to 0.1r, where r is the radius of the spheres. If a point on the grid is not within any sphere, it is "occupied." The use of the grid allows us to determine the connectivity between different regions of the pore space. Use a cluster labeling algorithm to label the clusters and determine φ̃c, the volume fraction occupied by the pores at threshold. You might be surprised to find that φ̃c is relatively small. If time permits, use a finer grid and repeat the calculation to improve the accuracy of your results.

(d)∗ Use finite-size scaling to estimate the critical percolation exponents for the three models presented in parts (a)–(c). Are they the same within the accuracy of your calculation?

Project 12.15. Fluctuations of the stock market

Although the fluctuations of the stock market are believed to be Gaussian for long time intervals, they are not Gaussian for short time intervals. The model of Cont and Bouchaud assumes that percolation clusters act as groups of traders who influence each other. The sites are occupied with probability p as usual. Each occupied site is a trader, and clusters are groups of traders (agents) who buy and sell together an amount proportional to the number of traders s in the cluster. At each time step, each cluster is independently active with probability 2pa and inactive with probability 1 − 2pa. If a cluster is active, it buys with probability pb and sells with probability ps = 1 − pb. In the simplest version of the model, the change in the price of a stock is proportional to the difference between supply and demand; that is,

    R = Σ_buy s ns − Σ_sell s ns,    (12.39)

where the sums are over the buying and selling clusters, respectively, and the constant of proportionality is taken to be one. If the probability pa is small, at most one cluster trades at a time, and the distribution P(R) of relative price changes or "returns" scales as ns(p). In contrast, for large pa, the relative price variation is the sum of contributions from many clusters (not counting the spanning cluster), and the central limit theorem implies that P(R) converges to a Gaussian for large systems (except at p = pc). Confirm these statements and find the shape of P(R) for p = pc and pa = 0.25. Variations of the Cont–Bouchaud model can be found in the references. The application of the methods of statistical physics and simulations to economics and finance is now an active area of research and is commonly known as econophysics.

Project 12.16. The connectedness length

(a) Modify class Clusters so that the connectedness length ξ defined in (12.9) is computed. One way to do so is to introduce four additional arrays, xAccum, yAccum, xSquaredAccum, and ySquaredAccum, with the data stored at the indices corresponding to the root sites. We visit each occupied site in the lattice and determine its root site. For example, if the site at x, y is occupied and its root is root, we set xAccum[root] += x, xSquaredAccum[root] += x*x, yAccum[root] += y, and ySquaredAccum[root] += y*y. Then R²s for an individual cluster is given by

    R²s = xSquaredAccum[root]/s + ySquaredAccum[root]/s − (xAccum[root]/s)² − (yAccum[root]/s)²,    (12.40)

where s is the number of sites in the cluster, which is given by -parent[root]. A sketch of this accumulation follows the project.

(b) What is the qualitative behavior of ξ(p) as a function of p for different size lattices? Is ξ(p) a monotonically increasing or decreasing function of p for p < pc and p > pc? Remember that ξ does not include the spanning cluster.
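The sketch referred to in part (a) follows. It assumes the fields and methods of class Clusters (parent[], findRoot(), sqr(), the site indexing s = x + yL, and a negative sentinel value EMPTY for unoccupied sites); the method clusterRadius2 itself is our addition:

    // Sketch for Project 12.16a: accumulate coordinates by root site and
    // form the squared radius of gyration of each cluster via (12.40).
    public double[] clusterRadius2() {
        double[] xAccum = new double[N], yAccum = new double[N];
        double[] xSquaredAccum = new double[N], ySquaredAccum = new double[N];
        for(int s0 = 0; s0 < N; s0++) {
            if(parent[s0] == EMPTY) continue;   // skip unoccupied sites
            int root = findRoot(s0);
            double x = s0%L, y = s0/L;
            xAccum[root] += x;  xSquaredAccum[root] += x*x;
            yAccum[root] += y;  ySquaredAccum[root] += y*y;
        }
        double[] R2 = new double[N];            // R2[root] for each root site
        for(int r = 0; r < N; r++) {
            if(parent[r] < 0 && parent[r] != EMPTY) { // r is a root
                double s = -parent[r];          // number of sites in the cluster
                R2[r] = xSquaredAccum[r]/s + ySquaredAccum[r]/s
                        - sqr(xAccum[r]/s) - sqr(yAccum[r]/s);
            }
        }
        return R2;
    }

The connectedness length ξ then follows by combining the R²s values of the nonspanning clusters according to (12.9).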
Project 12.17. Spanning clusters and periodic boundary conditions

For simplicity, we have used open boundary conditions, partly for historical reasons and partly because a spanning cluster is easier to visualize for open boundary conditions. An alternative is to use periodic boundary conditions and define a spanning cluster as one that wraps all the way around the lattice (see Figure 12.16).

Figure 12.16: (a) Example of a cluster that wraps vertically. (b) Example of a cluster that wraps vertically and horizontally. (c) Example of a single cluster that does not wrap. Periodic boundary conditions are used in each case.

A method for detecting cluster wrapping has been proposed by Machta et al. In addition to the parent array introduced on page 457, we define two integer arrays that give the net displacement in the x and y directions from each site to its parent site. When we traverse a site's cluster tree, we sum these displacements to find the total displacement to the root site. When an added site neighbors two (or more) sites that belong to the same cluster, we compare the total displacements to the root site for these two sites. If these displacements differ by an amount that does not equal the minimum displacement between the two sites, then cluster wrapping has occurred (see Figure 12.17).

Figure 12.17: Example of cluster wrapping for periodic boundary conditions. When site 4 is occupied, it is a neighbor of sites 9 and 24, which belong to a single cluster with displacements to the root of (∆x = 2, ∆y = 2) and (∆x = −3, ∆y = −1), respectively. If the difference between these displacements is not equal to the minimum displacement between them (∆xmin = 0, ∆ymin = 2), then wrapping has occurred, as is the case here.

Modify the Newman–Ziff algorithm so that periodic boundary conditions are used to define the clusters and the existence of a spanning cluster. Use your program to estimate pc and ns, and show that periodic boundary conditions give better results for the percolation threshold pc and the cluster size distribution ns for the same size lattice.

Project 12.18. Conductivity in a random resistor network

(a) An important critical exponent for percolation is the conductivity exponent t defined by

    σ ∼ (p − pc)^t,    (12.41)

where σ is the conductance (or inverse resistance) per unit length in two dimensions. Consider bond percolation on a square lattice, where each occupied bond between two neighboring sites is a resistor of unit resistance. Unoccupied bonds have infinite resistance. Because the total current into any node must equal zero by Kirchhoff's law, the voltage at any site (node) is equal to the average of the voltages of all nearest neighbor sites connected by resistors (occupied bonds). Because this relation for the voltage is the same as the algorithm for solving Laplace's equation on a lattice, the voltage at each site can be computed using the relaxation method discussed in Chapter 10 (see the sketch after this project). To compute the conductivity for a given L × L resistor network, we fix the voltage V = 0 at sites for which x = 0 and fix V = 1 at sites for which x = L + 1. In the y direction we use periodic boundary conditions. We then compute the voltage at all sites using the relaxation method. The current through each resistor connected to a site at x = 0 is I = ∆V/R = (V − 0)/1 = V. The conductivity is the sum of the currents through all the resistors connected to x = 0, divided by L. In a similar way, the conductivity can be computed from the resistors attached to the x = L + 1 boundary. Write a program to implement the relaxation method for the conductivity of a random resistor network on a square lattice. An indirect, but easier, way of computing the conductivity is considered in Problem 13.8.
(b) The bond percolation threshold on a square lattice is pc = 0.5. Use your program to compute the conductivity for an L = 30 square lattice. Average over at least ten spanning configurations for p = 0.51, 0.52, and 0.53. Note that you can eliminate all bonds that are not part of the spanning cluster, as well as all occupied bonds connected to only one other occupied bond. Why? If possible, consider more values of p. Estimate the critical exponent t defined in (12.41).

(c) Fix p at p = pc = 1/2 and use finite size scaling to estimate the conductivity exponent t.

(d)∗ Use larger lattices and the multigrid method (see Project 10.26) to improve your results. If you have sufficient computing resources, compute t for a simple cubic lattice, for which pc ≈ 0.249. (In general, t is not the same for lattice and continuum percolation.)
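The following sketch (our own arrangement; the array names are our choices, not the text's) implements one sweep of the relaxation step described in part (a) of Project 12.18. The voltage of each interior node is replaced by the average over the neighbors to which it is connected by occupied bonds; the columns x = 0 and x = L + 1 are held at fixed voltage, and the y direction is periodic:

    // One relaxation sweep for the random resistor network. V[x][y] is the
    // node voltage; hBond[x][y] is the bond between (x, y) and (x+1, y);
    // vBond[x][y] is the bond between (x, y) and (x, y+1 mod L).
    void relaxOnce(double[][] V, boolean[][] hBond, boolean[][] vBond, int L) {
        for(int x = 1; x <= L; x++) {      // x = 0 and x = L+1 are held fixed
            for(int y = 0; y < L; y++) {
                double sum = 0;
                int n = 0;
                if(hBond[x-1][y]) { sum += V[x-1][y]; n++; } // left neighbor
                if(hBond[x][y])   { sum += V[x+1][y]; n++; } // right neighbor
                int yDown = (y-1+L)%L, yUp = (y+1)%L;        // periodic in y
                if(vBond[x][yDown]) { sum += V[x][yDown]; n++; }
                if(vBond[x][y])     { sum += V[x][yUp];   n++; }
                if(n > 0) V[x][y] = sum/n; // average over connected neighbors
            }
        }
    }

Sweeps are repeated until the voltages stop changing within some tolerance; the conductivity then follows by summing V[1][y] over the occupied bonds attached to the x = 0 column and dividing by L.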
References and Suggestions for Further Reading

Joan Adler, "Series expansions," Computers in Physics 8, 287–295 (1994). The critical exponents and the value of pc can also be determined by doing exact enumeration.

I. Balberg, "Recent developments in continuum percolation," Phil. Mag. 56, 991–1003 (1987). An earlier paper on continuum percolation is by Edward T. Gawlinski and H. Eugene Stanley, "Continuum percolation in two dimensions: Monte Carlo tests of scaling and universality for non-interacting discs," J. Phys. A: Math. Gen. 14, L291–L299 (1981). These workers divide the system into cells and use the Poisson distribution to place the appropriate number of disks in each cell.

Jean-Philippe Bouchaud and Marc Potters, Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management, 2nd ed. (Cambridge University Press, 2003); Rosario N. Mantegna and H. Eugene Stanley, An Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge University Press, 2000); Johannes Voit, The Statistical Mechanics of Financial Markets, 2nd ed. (Springer, 2004). These texts introduce the general field of econophysics.

Armin Bunde and Shlomo Havlin, editors, Fractals and Disordered Systems, revised edition (Springer-Verlag, 1996). Chapter 2 by the editors is on percolation.

R. Cont and J.-P. Bouchaud, "Herd behavior and aggregate fluctuations in financial markets," Macroeconomic Dynamics 4, 170–196 (2000).

P. M. C. de Oliveira, R. A. Nobrega, and D. Stauffer, "Are the tails of percolation thresholds Gaussians?," J. Phys. A 37, 3743–3748 (2004). The authors compute the probability that there is a spanning cluster at p = pc.

C. Domb, E. Stoll, and T. Schneider, "Percolation clusters," Contemp. Phys. 21, 577–592 (1980). This review paper discusses the nature of the percolation transition using illustrations from a film of a Monte Carlo simulation of a percolation process.

J. W. Essam, "Percolation theory," Reports on Progress in Physics 53, 833–912 (1980). A mathematically oriented review paper.

Jens Feder, Fractals (Plenum Press, 1988). See Chapter 7 on percolation. We discuss the fractal properties of the spanning cluster at the percolation threshold in Chapter 13.

J. P. Fitzpatrick, R. B. Malt, and F. Spaepen, "Percolation theory of the conductivity of random close-packed mixtures of hard spheres," Phys. Lett. A 47, 207–208 (1974). The authors describe a demonstration experiment done in a first year physics course.

J. Hoshen and R. Kopelman, "Percolation and cluster distribution. I. Cluster multiple labeling technique and critical concentration algorithm," Phys. Rev. B 14, 3438–3445 (1976). The original paper on an efficient cluster labeling algorithm. The Hoshen–Kopelman algorithm is well suited for very large lattices in two dimensions, but, in general, the Newman–Ziff algorithm is easier to use.

Chin-Kun Hu, Chi-Ning Chen, and F. Y. Wu, "Histogram Monte Carlo position-space renormalization group: Applications to site percolation," J. Stat. Phys. 82, 1199–1206 (1996). The authors use a histogram Monte Carlo method that is similar to the method discussed in Project 12.13. A similar Monte Carlo method was used by M. Ahsan Khan, Harvey Gould, and J. Chalupa, "Monte Carlo renormalization group study of bootstrap percolation," J. Phys. C 18, L223–L228 (1985).

J. Machta, Y. S. Choi, A. Lucke, T. Schweizer, and L. M. Chayes, "Invaded cluster algorithm for Potts models," Phys. Rev. E 54, 1332–1345 (1996). The authors discuss the definition of a spanning cluster for periodic boundary conditions.

P. H. L. Martins and J. A. Plascak, "Percolation on two- and three-dimensional lattices," Phys. Rev. E 67, 046119-1–6 (2003). The authors use the Newman–Ziff algorithm to compute various quantities.

Ramit Mehr, Tal Grossman, N. Kristianpoller, and Yuval Gefen, "Simple percolation experiment in two dimensions," Am. J. Phys. 54, 271–273 (1986). The authors discuss a simple experiment on a sheet of conducting silver paper. This type of experiment is much easier to do than the insulator–conductor transition discussed in Section 12.1. In the latter case, the results are difficult to interpret because the current depends on the contact area between two spheres and thus on the applied pressure.

M. E. J. Newman and R. M. Ziff, "Fast Monte Carlo algorithm for site or bond percolation," Phys. Rev. E 64, 016706-1–16 (2001). Our discussion of the Newman–Ziff algorithm in Section 12.3 closely follows this well-written paper.

Peter J. Reynolds, H. Eugene Stanley, and W. Klein, "Large-cell Monte Carlo renormalization group for percolation," Phys. Rev. B 21, 1223 (1980). Another especially well written research paper. Our discussion of the renormalization group in Section 12.5 is based upon this paper.

Muhammad Sahimi, Applications of Percolation Theory (Taylor & Francis, 1994). The emphasis is on modeling various phenomena in disordered media.

Lev N. Shchur, "Incipient spanning clusters in square and cubic percolation," in Studies in Condensed Matter Physics, Vol. 85, edited by D. P. Landau, S. P. Lewis, and H. B. Schuettler (Springer-Verlag, 2000). Not many years ago, it was commonly believed that only one spanning cluster could exist at the percolation threshold. In this paper the probability of the simultaneous occurrence of at least k spanning clusters was studied by extensive Monte Carlo simulations and found to be in agreement with theoretical predictions.

Dietrich Stauffer, "Percolation models of financial market dynamics," Advances in Complex Systems 4, 19–27 (2001).

D. Stauffer, "Percolation clusters as teaching aid for Monte Carlo simulation and critical exponents," Am. J. Phys. 45, 1001–1002 (1977);
D. Stauffer, "Scaling theory of percolation clusters," Physics Reports 54, 1–74 (1979).

Dietrich Stauffer and Amnon Aharony, Introduction to Percolation Theory, 2nd ed. (Taylor & Francis, 1994). A delightful book by two of the leading workers in the field. An efficient Fortran implementation of the Hoshen–Kopelman algorithm is given in Appendix A.3.

B. P. Watson and P. L. Leath, "Conductivity in the two-dimensional site percolation problem," Phys. Rev. B 9, 4893–4896 (1974). A research paper on the conductivity of chicken wire.

John C. Wierman and Dora Passen Naor, "Criteria for evaluation of universal formulas for percolation thresholds," Phys. Rev. E 71, 036143-1–7 (2005). Wierman and Naor evaluate several universal formulas that predict approximate values of pc for various lattices.

Percolation was first conceived by chemistry Nobel laureate P. J. Flory as a model for polymer gelation, for example, the sol–gel transition of jello [P. J. Flory, "Molecular size distribution in three dimensional polymers," J. Am. Chem. Soc. 63, 3083–3100 (1941)]. Further important work in this context was done by another chemist, Walter H. Stockmayer. The term "percolation" was coined by the mathematicians Broadbent and Hammersley in 1957, who considered percolation on a lattice for the first time. See S. R. Broadbent and J. M. Hammersley, "Percolation processes I. Crystals and mazes," Proceedings of the Cambridge Philosophical Society 53, 629–641 (1957).

Kenneth G. Wilson, "Problems in physics with many scales of length," Sci. Am. 241 (8), 158–179 (1979). An accessible article on the renormalization group method and its applications in particle and condensed matter physics. See also K. G. Wilson, "The renormalization group and critical phenomena," Rev. Mod. Phys. 55, 583–600 (1983). The latter article is the text of Wilson's lecture on the occasion of the presentation of the 1982 Nobel Prize in Physics. In this lecture he claims that he "... found it very helpful to demand that a correctly formulated field theory be soluble by computer, the same way an ordinary differential equation can be solved on a computer ...".

W. Xia and M. F. Thorpe, "Percolation properties of random ellipses," Phys. Rev. A 38, 2650–2656 (1988). The authors consider continuum percolation and show that the area fraction remaining after punching out holes at random is given by φ = e^{−Aρ}, where A is the area of a hole and ρ is the number density of the holes. This relation does not depend on the shape of the holes.

Richard Zallen, The Physics of Amorphous Solids (Wiley–Interscience, 1983). Chapter 4 discusses many of the applications of percolation concepts to realistic systems.

R. M. Ziff and M. E. J. Newman, "Convergence of threshold estimates for two-dimensional percolation," Phys. Rev. E 66, 016129-1–10 (2002).

Chapter 13
Fractals and Kinetic Growth Models

We introduce the concept of fractal dimension and discuss several processes that generate fractal objects.

13.1 The Fractal Dimension

One of the more interesting geometrical properties of objects is their shape. As an example, we show in Figure 13.1 a spanning cluster generated at the percolation threshold. Although the visual description of such a cluster is subjective, the cluster can be described as ramified, airy, tenuous, and stringy rather than compact or space-filling. In the 1970s a new fractal geometry was developed by Mandelbrot and others to describe the characteristics of ramified objects.
One quantitative measure of the structure of these objects is their fractal dimension D. To define D, we first review some simple ideas of dimension in ordinary Euclidean geometry. Consider a circular or spherical object of mass M and radius R. If the radius of the object is increased from R to 2R, the mass of the object is increased by a factor of 2² if the object is circular, or by 2³ if the object is spherical. We can express this relation between mass and the radius, or a characteristic length, as

    M(R) ∼ R^D  (mass dimension),    (13.1)

where D is the dimension of the object. Equation (13.1) implies that if the linear dimensions of an object are increased by a factor of b while preserving its shape, then the mass of the object is increased by b^D. This mass–length scaling relation is closely related to our intuitive understanding of spatial dimension. If the dimension of the object D and the dimension d of the Euclidean space in which the object is embedded are identical, then the mass density ρ = M/R^d scales as

    ρ(R) ∝ M(R)/R^d ∼ R⁰;    (13.2)

that is, its density is constant. An example of a two-dimensional object is shown in Figure 13.2. An object whose mass–length relation satisfies (13.1) with D = d is said to be compact.

Equation (13.1) can be generalized to define the fractal dimension. We denote objects as fractals if they satisfy (13.1) with a value of D different from the spatial dimension d. If an object satisfies (13.1) with D < d, its density is not the same for all R but scales as

    ρ(R) ∝ M/R^d ∼ R^{D−d}.    (13.3)

Figure 13.1: Example of a spanning percolation cluster generated at p = 0.5927 on an L = 124 square lattice. The other occupied sites are not shown.

Because D < d, a fractal object becomes less dense at larger length scales. The scale dependence of the density is a quantitative measure of the ramified or stringy nature of fractal objects. In addition, another characteristic of fractal objects is that they have holes of all sizes. This property follows from (13.3), because if we replace R by bR, where b is some constant, we obtain the same power law dependence for ρ(R). Thus it does not matter what scale of length is used, and hence all hole sizes must be present.

Another important characteristic of fractal objects is that they look the same over a range of length scales. This property of self-similarity or scale invariance means that if we take part of a fractal object and magnify it by the same magnification factor in all directions, the magnified picture is similar to the original. This property follows from the scaling argument given for ρ(R).

The percolation cluster shown in Figure 13.1 is an example of a random or statistical fractal, because the mass–length relation (13.1) is satisfied only on the average, that is, only if the quantity M(R) is averaged over many different origins in a given cluster and over many clusters. In physical systems, the relation (13.1) does not extend over all length scales but is bounded by both upper and lower cutoff lengths. For example, a lower cutoff length is provided by the lattice spacing or the mean distance between the constituents of the object. In computer simulations, the maximum length is usually the finite system size. The presence of these cutoffs complicates the determination of the fractal dimension. In Problem 13.1 we compute the fractal dimension of percolation clusters using straightforward Monte Carlo methods. Remember that data extending over several decades are required to obtain convincing evidence for a power law relationship between M and R and to determine accurate estimates for the fractal dimension. Hence, conclusions based on the limited simulations posed in the problems need to be interpreted with caution.
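The mass–length relation (13.1) translates directly into a measurement: count the occupied sites within a distance r of a chosen origin for a sequence of radii, and fit log M(r) against log r. A sketch of the counting step follows (ours; the occupied[][] array and the square lattice representation are assumptions):

    // Counts the occupied sites within a distance r of the origin (x0, y0);
    // repeating for a sequence of radii and fitting log M(r) to log r
    // estimates the fractal dimension D.
    public static int mass(boolean[][] occupied, int x0, int y0, int r) {
        int L = occupied.length; // assume a square L by L lattice
        int m = 0;
        for(int x = Math.max(0, x0-r); x <= Math.min(L-1, x0+r); x++) {
            for(int y = Math.max(0, y0-r); y <= Math.min(L-1, y0+r); y++) {
                int dx = x-x0, dy = y-y0;
                if(occupied[x][y] && dx*dx+dy*dy <= r*r) m++;
            }
        }
        return m;
    }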
Figure 13.2: The number of dots per unit area in each circle is uniform. How does the total number of dots (mass) vary with the radius of the circle?

Problem 13.1. The fractal dimension of percolation clusters

(a) Generate a site percolation configuration on a square lattice with L ≥ 61 at p = pc ≈ 0.5927. Why might it be necessary to generate several configurations before a spanning cluster is obtained? Does the spanning cluster have many dangling ends?

(b) Choose a point on the spanning cluster and count the number of points M(b) in the spanning cluster within a square of area b² centered about that point. Then double b and count the number of points within the larger box. Can you repeat this procedure indefinitely? Repeat this procedure until you can estimate the b-dependence of the number of points. Use the b-dependence of M(b) to estimate D according to the definition M(b) ∼ b^D; that is, estimate D from a log-log plot of M(b) versus b. Choose another point in the cluster and repeat this procedure. Are your results similar? A better estimate for D can be found by averaging M(b) over several origins in each spanning cluster and averaging over many spanning clusters.

(c) If you have not already done Problem 12.8a, compute D by determining the mean size (mass) M of the spanning cluster at p = pc as a function of the linear dimension L of the lattice. Consider L = 11, 21, 41, and 61, and estimate D from a log-log plot of M versus L.

∗Problem 13.2. Renormalization group calculation of the fractal dimension

Compute ⟨M²⟩, the average of the square of the number of occupied sites in the spanning cluster at p = pc, and the quantity ⟨M′²⟩, the average of the square of the number of occupied sites in the spanning cluster on the renormalized lattice of linear dimension L′ = L/b. Because ⟨M²⟩ ∼ L^{2D} and ⟨M′²⟩ ∼ (L/b)^{2D}, we can obtain D from the relation b^{2D} = ⟨M²⟩/⟨M′²⟩. Choose the length rescaling factor to be b = 2 and adopt the same blocking procedure as was used in Section 12.5. An average over ten spanning clusters for L = 16 and p = 0.5927 is sufficient for qualitative results.

In Problems 13.1 and 13.2 we were interested only in the properties of the spanning clusters. For this purpose, our algorithm for generating percolation configurations by randomly occupying each site is inefficient, because it generates many clusters. A more efficient way of generating single percolation clusters is due independently to Hammersley, Leath, and Alexandrowicz.

Figure 13.3: An example of the growth of a percolation cluster. Sites are occupied with probability p. Occupied sites are represented by a shaded square, growth or perimeter sites are labeled by g, and tested unoccupied sites are labeled by x. Because the seed site is occupied but not tested, we have represented it differently than the other occupied sites. The growth sites are chosen at random.

This algorithm, commonly known as the Leath or the single cluster growth algorithm, is equivalent to the following steps (see Figure 13.3):
1. Occupy a single seed site on the lattice. The nearest neighbors (four on the square lattice) of the seed represent the perimeter sites.

2. For each perimeter site, generate a uniform random number r in the unit interval. If r ≤ p, the site is occupied and added to the cluster; otherwise, the site is not occupied. In order that sites be unoccupied with probability 1 − p, these sites are not tested again.

3. For each site that is occupied, determine if there are any new perimeter sites, that is, untested neighbors. Add the new perimeter sites to the perimeter list.

4. Continue steps 2 and 3 until there are no untested perimeter sites to test for occupancy.

Class SingleCluster implements this algorithm and computes the number of occupied sites within a radius r of the seed particle. The seed site is placed at the center of a square lattice. Two one-dimensional arrays, pxs and pys, store the x and y positions of the perimeter sites. The status of a site is stored in the byte array s, with s(x,y) = (byte) 1 for an occupied site, s(x,y) = (byte) 2 for a perimeter site, s(x,y) = (byte) -1 for a site that has already been tested and not occupied, and s(x,y) = (byte) 0 for an untested and unvisited site. To avoid checking for the boundaries of the lattice, we add extra rows and columns at the boundaries and set these sites equal to (byte) -1. We use a byte array because the array s will be sent to the LatticeFrame class, which uses byte arrays.

Listing 13.1: Class SingleCluster generates and analyzes a single percolation cluster.

package org.opensourcephysics.sip.ch13.cluster;

public class SingleCluster {
    public byte site[][];
    public int[] xs, ys, pxs, pys;
    public int L;
    public double p;     // site occupation probability
    int occupiedNumber;
    int perimeterNumber;
    // displacement x to nearest neighbors
    int nx[] = {1, -1, 0, 0};
    // displacement y to nearest neighbors
    int ny[] = {0, 0, 1, -1};
    // mass of ring, index is distance from center of mass
    double mass[];

    public void initialize() {
        site = new byte[L+2][L+2]; // gives status of each site
        xs = new int[L*L];         // location of occupied sites
        ys = new int[L*L];
        pxs = new int[L*L];        // location of perimeter sites
        pys = new int[L*L];
        for(int i = 0; i < L+2; i++) {
            // ... mark the boundary rows and columns as already tested
        }
        // ... occupy the seed site and create its perimeter sites
    }

    // tests a randomly chosen perimeter site for occupancy
    public void step() {
        if(perimeterNumber > 0) {
            int perimeter = (int) (Math.random()*perimeterNumber);
            int x = pxs[perimeter];
            int y = pys[perimeter];
            perimeterNumber--;
            pxs[perimeter] = pxs[perimeterNumber];
            pys[perimeter] = pys[perimeterNumber];
            if(Math.random() < p) {
                // ... occupy the site and add its untested neighbors
                // to the perimeter list
            } else {
                // ... mark the site as tested and unoccupied
            }
        }
    }
    // ... (methods that accumulate the mass within a distance r of the seed)
}

(d) ... Is a cluster generated at p > pc a fractal?

(e) The fractal dimension of percolation clusters is not an independent exponent but satisfies the scaling relation

    D = d − β/ν,    (13.5)

where β and ν are defined in Table 12.1. The relation (13.5) can be understood by the following finite-size scaling argument. The number of sites in the spanning cluster on a lattice of linear dimension L is given by

    M(L) ∼ P∞(L) L^d,    (13.6)

where P∞ is the probability that an occupied site belongs to the spanning cluster, and L^d is the total number of sites in the lattice. In the limit of an infinite lattice and p near pc, we know that P∞(p) ∼ (p − pc)^β and ξ(p) ∼ (p − pc)^{−ν} independent of L. Hence, for L ∼ ξ, we have P∞(L) ∼ L^{−β/ν} [see (12.11)], and we can write

    M(L) ∼ L^{−β/ν} L^d ∼ L^D.    (13.7)

The relation (13.5) follows.
Use the exact values of β and ν from Table 12.1 to find the exact value of D for d = 2. Is your estimate for D consistent with this value?

(f)∗ Rewrite the SingleCluster class so that the lattice is stored as a one-dimensional array, as is done for class Clusters in Chapter 12.

(g)∗ Estimate the fractal dimension of percolation clusters on a simple cubic lattice. Take pc = 0.3117.

13.2 Regular Fractals

As we have seen, one characteristic of random fractal objects is that they look the same on a range of length scales. To gain a better understanding of the meaning of self-similarity, consider the following example of a regular fractal, a mathematical object that is self-similar on all length scales. Begin with a line one unit long (see Figure 13.5a). Remove the middle third of the line and replace it by two lines of length 1/3 each, so that the curve has a triangular bump in it and the total length of the curve is 4/3 (see Figure 13.5b). In the next stage, each of the segments of length 1/3 is divided into lines of length 1/9, and the procedure is repeated as shown in Figure 13.5c. What is the length of the curve shown in Figure 13.5c? The three stages shown in Figure 13.5 can be extended an infinite number of times. The resulting curve is infinitely long and contains an infinite number of infinitesimally small segments. Such a curve is known as the triadic Koch curve. A Java class that uses a recursive procedure (see Section 6.3) to draw this curve is given in Listing 13.3. Note that method iterate calls itself. Use class KochApp to generate the curves shown in Figure 13.5.

Figure 13.5: The first three stages (a)–(c) of the generation of the self-similar Koch curve. At each stage the displacement of the middle third of each segment is in the direction that increases the area under the curve. The curves were generated using class KochApp. The Koch curve is an example of a continuous curve for which there is no tangent defined at any of its points. The Koch curve is self-similar on each length scale.

Listing 13.3: Class for drawing the Koch curve.

package org.opensourcephysics.sip.ch13;
import java.awt.Graphics;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display.*;

public class KochApp extends AbstractCalculation implements Drawable {
    DisplayFrame frame = new DisplayFrame("Koch Curve");
    int n = 0;

    public KochApp() {
        frame.setPreferredMinMax(-100, 600, -100, 600);
        frame.setSquareAspect(true);
        frame.addDrawable(this);
    }

    public void calculate() {
        n = control.getInt("Number of iterations");
        frame.setVisible(true);
    }

    public void iterate(double x1, double y1, double x2, double y2, int n, DrawingPanel panel, Graphics g) {
        // draw Koch curve using recursion
        if(n > 0) {
            double dx = (x2-x1)/3;
            double dy = (y2-y1)/3;
            double xOneThird = x1+dx; // new end at 1/3 of line segment
            double yOneThird = y1+dy;
            double xTwoThird = x1+2*dx; // new end at 2/3 of line segment
            double yTwoThird = y1+2*dy;
            // rotates line segment (dx, dy) by 60 degrees and adds
            // to (xOneThird, yOneThird)
            double xMidPoint = 0.5*dx-0.866*dy+xOneThird;
            double yMidPoint = 0.5*dy+0.866*dx+yOneThird;
            // each line segment generates 4 new ones
            iterate(x1, y1, xOneThird, yOneThird, n-1, panel, g);
            iterate(xOneThird, yOneThird, xMidPoint, yMidPoint, n-1, panel, g);
            iterate(xMidPoint, yMidPoint, xTwoThird, yTwoThird, n-1, panel, g);
            iterate(xTwoThird, yTwoThird, x2, y2, n-1, panel, g);
        } else {
            int ix1 = panel.xToPix(x1);
            int iy1 = panel.yToPix(y1);
            int ix2 = panel.xToPix(x2);
            int iy2 = panel.yToPix(y2);
            g.drawLine(ix1, iy1, ix2, iy2);
        }
    }

    public void draw(DrawingPanel panel, Graphics g) {
        iterate(0, 0, 500, 0, n, panel, g);
    }

    public void reset() {
        control.setValue("Number of iterations", 3);
    }

    public static void main(String[] args) {
        CalculationControl.createApp(new KochApp());
    }
}
How can we determine the fractal dimension of the Koch and similar mathematical objects? There are several generalizations of the Euclidean dimension that lead naturally to a definition of the fractal dimension (see Section 13.5). Here we consider a definition based on counting boxes. Consider a one-dimensional curve of unit length that has been divided into N equal segments of length ℓ, so that N = 1/ℓ (see Figure 13.6). As ℓ decreases, N increases linearly, which is the expected result for a one-dimensional curve. Similarly, if we divide a two-dimensional square of unit area into N equal subsquares of length ℓ, we have N = 1/ℓ², the expected result for a two-dimensional object (see Figure 13.6). In general, we have N = 1/ℓ^D, where D is the fractal dimension of the object. If we take the logarithm of both sides of this relation, we can express the fractal dimension as

    D = log N / log(1/ℓ)  (box dimension).    (13.8)

Now let us apply this definition to the Koch curve. Each time the length of our measuring unit is reduced by a factor of 3, the number of segments is increased by a factor of 4. If we use the size of each segment as the size of our measuring unit, then at the nth iteration we have N = 4^n and ℓ = (1/3)^n, and the fractal dimension of the triadic Koch curve is given by

    D = log 4^n / log 3^n = (n log 4)/(n log 3) = log 4/log 3 ≈ 1.2619  (triadic Koch curve).    (13.9)

From (13.9) we see that the Koch curve has a fractal dimension between that of a line and a plane. Is this statement consistent with your visual interpretation of the degree to which the triadic Koch curve fills space?

Figure 13.6: Examples of a one-dimensional (d = 1) and a two-dimensional (d = 2) object.

Problem 13.4. The recursive generation of regular fractals

(a) Recursion is used in method iterate in KochApp and is one of the more difficult programming concepts. Explain the nature of recursion and the way it is implemented.

(b) Regular fractals can be generated from a pattern that is used in a self-replicating manner. Write a program to generate the quadric Koch curve shown in Figure 13.7a. What is its fractal dimension?

(c) What is the fractal dimension of the Sierpiński gasket shown in Figure 13.7b? Write a program that generates the next several iterations.

(d) What is the fractal dimension of the Sierpiński carpet shown in Figure 13.7c? How does the fractal dimension of the Sierpiński carpet compare to the fractal dimension of a percolation cluster? Are the two fractals visually similar?

13.3 Kinetic Growth Processes

Many systems in nature exhibit fractal geometry. Fractals have been used to describe the irregular shapes of such varied objects as coastlines, clouds, coral reefs, and the human lung. Why are fractal structures so common?
How do fractal structures form? In this section we discuss several growth models that generate structures showing a remarkable similarity to forms observed in nature. The first two models are already familiar to us and exemplify the flexibility and utility of kinetic growth models.

Epidemic model. In the context of the spread of disease, we usually want to know the conditions for an epidemic. A simple lattice model of the spread of a disease can be formulated as follows. Suppose that an occupied site corresponds to an infected person. Initially there is a single infected person, and the four nearest neighbor sites (on the square lattice) correspond to susceptible people. At the next time step, we visit the four susceptible sites and occupy (infect) each site with probability p. If a susceptible site is not occupied, we say that the site is immune, and we do not test it again. We then find the new susceptible sites and continue until either the disease is controlled or reaches the boundary of the lattice. Convince yourself that this growth model of a disease generates a cluster of infected sites that is identical to a percolation cluster at probability p. The only difference is that we have introduced a discrete time step into the model. Some of the properties of this model are explored in Problem 13.5.

Figure 13.7: (a) The first few iterations of the quadric Koch curve. (b) The first few iterations of the Sierpiński gasket. (c) The first few iterations of the Sierpiński carpet.

Problem 13.5. A simple epidemic model

(a) Explain why the simple epidemic model discussed in the text generates the same clusters as in the percolation model. What is the minimum value of p necessary for an epidemic to occur? Recall that in one time step, all susceptible sites are visited simultaneously and infected with probability p. Determine how n, the number of infected sites, depends on the time t (the number of time steps) for various values of p. A straightforward way to proceed is to modify class SingleCluster so that all susceptible sites are visited and occupied with probability p before new susceptible sites are found. In Chapter 14 we will learn that this model is an example of a cellular automaton.

(b) What are some ways that you could modify the model to make it more realistic? For example, the infected sites might recover after a certain time.

Eden model. An even simpler example of a growth model was proposed by Eden in 1958 to simulate the growth of tumors or a bacterial colony. Although we will find that the resulting mass distribution is not a fractal, the description of the Eden model illustrates the general nature of the fractal growth models we will discuss. Choose a seed site at the center of the lattice for simplicity. The unoccupied nearest neighbors of the occupied sites are the perimeter or growth sites. In the simplest version of the model, a growth site is chosen at random and occupied. The newly occupied site is removed from the list of growth sites, and the new growth sites are added to the list. This process is repeated many times until a large cluster of occupied sites is formed. The difference between this model and the simple epidemic model is that all tested sites are occupied; in other words, no growth sites ever become "immune." Some of the properties of Eden clusters are investigated in Problem 13.6.
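A single growth step of the Eden model is easy to express with the perimeter-list bookkeeping of class SingleCluster (swap the chosen site with the last list entry). The condensed version below is ours; the helper addNewPerimeterSites() is assumed rather than taken from the text:

    // One Eden growth step: every randomly chosen growth site is occupied.
    void edenStep() {
        if(perimeterNumber == 0) return;
        int k = (int) (Math.random()*perimeterNumber); // random growth site
        int x = pxs[k], y = pys[k];
        // remove it from the list by swapping in the last entry
        perimeterNumber--;
        pxs[k] = pxs[perimeterNumber];
        pys[k] = pys[perimeterNumber];
        site[x][y] = (byte) 1;         // occupy it (probability p = 1)
        addNewPerimeterSites(x, y);    // untested neighbors become growth sites
    }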
Figure 13.8: Plot of ln M versus ln r for a single Eden cluster generated on an L = 61 square lattice. A least squares fit from r = 2 to r = 32 yields a slope of approximately 2.01.

Problem 13.6. The Eden model

(a) Modify class SingleCluster so that clusters are generated on a square lattice according to the Eden model. A straightforward procedure is to occupy perimeter sites with probability p = 1. The simulation should be stopped when the cluster just reaches the edge of the lattice. What would happen if we were to occupy perimeter sites indefinitely? Follow the procedure of Problem 13.3 and determine the number of occupied sites M(r) within a distance r of the seed site. Assume that M(r) ∼ r^D for sufficiently large r and estimate D from the slope of a log-log plot of M versus r. A typical log-log plot is shown in Figure 13.8 for L = 61. Can you conclude from your data that Eden clusters are compact?

(b) Modify your program so that only the perimeter or growth sites are shown. Where are the majority of the perimeter sites relative to the center of the cluster? Grow as big a cluster as time permits.

Invasion percolation. A dynamical process known as invasion percolation has been used to model the shape of the oil-water interface that occurs when water is forced into a porous medium containing oil. The goal is to use the water to recover as much oil as possible. In this process a water cluster grows into the oil through the path of least resistance. Consider a lattice of size Lx × Ly, with the water (the invader) initially occupying the left edge (see Figure 13.9). The resistance to the invader is given by assigning to each lattice site a uniformly distributed random number between 0 and 1; these numbers are fixed throughout the invasion. Sites that are nearest neighbors of the invader sites are the perimeter sites. At each time step, the perimeter site with the lowest random number is occupied by the invader, and the oil (the defender) is displaced. The invading cluster grows until a path of occupied sites connects the left and right edges of the lattice. After this path forms, there is no need for the water to occupy any additional sites. To minimize boundary effects, periodic boundary conditions are used for the top and bottom edges, and all quantities are measured over only a central region far from the left and right edges of the lattice.

Figure 13.9: Example of a cluster formed by invasion percolation on a 5 × 3 lattice. The lattice at t = 0 shows the random numbers that have been assigned to the sites. The darkly shaded sites are occupied by the invader, which occupies the perimeter site (lightly shaded) with the smallest random number. The cluster continues to grow until a site in the right-most column is occupied.
Class Invasion implements the invasion percolation algorithm. The two-dimensional array element site[i][j] initially stores a random number for the site at (i,j). If the site at (i,j) is occupied, then site[i][j] is increased by 1. If the site at (i,j) is a perimeter site, then site[i][j] is increased by 2. In this way we know which sites are perimeter sites, and the value of the random number remains associated with the perimeter site. A new perimeter site is inserted into its proper ordered position in the lists perimeterListX and perimeterListY. The perimeter lists are ordered so that the site with the largest random number is at the beginning.

Two search methods are provided for determining the position of a new perimeter site in the perimeter lists. In a linear search we go through the list in order until the random number associated with the new perimeter site is between two random numbers in the list. In a binary search we divide the list in two and determine in which half the new random number belongs. Then we divide this half into half again, and so on, until the correct position is found. The linear and binary search methods are compared in Problem 13.7d. The binary search is the default method used in class Invasion.

The main quantities of interest are the fraction of sites occupied by the invader and the probability P(r)∆r that a site with a random number between r and r + ∆r is occupied. The properties of invasion percolation are explored in Problem 13.7.

Listing 13.4: Class for simulating invasion percolation.

package org.opensourcephysics.sip.ch13.invasion;
import java.awt.Color;
import org.opensourcephysics.frames.*;

public class Invasion {
  public int Lx, Ly;
  public double site[][];
  public int perimeterListX[], perimeterListY[];
  public int numberOfPerimeterSites;
  public boolean ok = true;
  public LatticeFrame lattice;

  public Invasion(LatticeFrame latticeFrame) {
    lattice = latticeFrame;
    lattice.setIndexedColor(0, Color.blue);
    lattice.setIndexedColor(1, Color.black);
  }

  public void initialize() {
    Lx = 2*Ly;
    site = new double[Lx][Ly];
    perimeterListX = new int[Lx*Ly];
    perimeterListY = new int[Lx*Ly];
    for(int y = 0; y<Ly; y++) {    // assign a fixed random number to every site
      for(int x = 0; x<Lx; x++) {
        site[x][y] = Math.random();
      }
    }
    numberOfPerimeterSites = 0;
    for(int y = 0; y<Ly; y++) {    // invader initially occupies the left edge
      site[0][y] += 1;             // occupied site
      lattice.setValue(0, y, 1);
      site[1][y] += 2;             // second column sites are perimeter sites
      numberOfPerimeterSites++;
      insert(1, y);                // inserts site in perimeter list in order
    }
    ok = true;
  }
  public void insert(int x, int y) {
    int insertionLocation = binarySearch(x, y);
    for(int i = numberOfPerimeterSites-1; i>insertionLocation; i--) {
      perimeterListX[i] = perimeterListX[i-1];
      perimeterListY[i] = perimeterListY[i-1];
    }
    perimeterListX[insertionLocation] = x;
    perimeterListY[insertionLocation] = y;
  }

  public int binarySearch(int x, int y) {
    int firstLocation = 0;
    int lastLocation = numberOfPerimeterSites-2;
    if(lastLocation<0) {
      lastLocation = 0;
    }
    int middleLocation = (firstLocation+lastLocation)/2;
    // determine which half of list new number is in
    while(lastLocation-firstLocation>1) {
      int middleX = perimeterListX[middleLocation];
      int middleY = perimeterListY[middleLocation];
      if(site[x][y]>site[middleX][middleY]) {
        lastLocation = middleLocation;
      } else {
        firstLocation = middleLocation;
      }
      middleLocation = (firstLocation+lastLocation)/2;
    }
    return lastLocation;
  }

  // goes in order looking for location to insert
  public int linearSearch(int x, int y) {
    if(numberOfPerimeterSites==1) {
      return 0;
    } else {
      for(int i = 0; i<numberOfPerimeterSites-1; i++) {
        if(site[x][y]>site[perimeterListX[i]][perimeterListY[i]]) {
          return i;
        }
      }
    }
    return numberOfPerimeterSites-1;
  }

  public void step() {
    if(ok) {
      int nx[] = {1, -1, 0, 0};
      int ny[] = {0, 0, 1, -1};
      int x = perimeterListX[numberOfPerimeterSites-1];
      int y = perimeterListY[numberOfPerimeterSites-1];
      if(x>Lx-3) { // if cluster gets near the end, stop simulation
        ok = false;
      }
      numberOfPerimeterSites--;
      site[x][y] -= 1;           // perimeter site becomes occupied
      lattice.setValue(x, y, 1);
      for(int i = 0; i<4; i++) { // finds new perimeter sites
        int perimeterX = x+nx[i];
        int perimeterY = (y+ny[i])%Ly;
        if(perimeterY==-1) {
          perimeterY = Ly-1;     // periodic boundary conditions at top and bottom
        }
        if(site[perimeterX][perimeterY]<1) { // new perimeter site
          site[perimeterX][perimeterY] += 2;
          numberOfPerimeterSites++;
          insert(perimeterX, perimeterY);
        }
      }
    }
  }

  public void computeDistribution(PlotFrame data) {
    int numberOfBins = 20;
    int numberOccupied = 0;
    double occupied[] = new double[numberOfBins];
    double number[] = new double[numberOfBins];
    double binSize = 1.0/numberOfBins;
    int minX = Lx/3;             // measure only over the central region
    int maxX = 2*minX;
    for(int x = minX; x<=maxX; x++) {
      for(int y = 0; y<Ly; y++) {
        int bin = (int) ((site[x][y]%1)/binSize); // bin the original random number
        number[bin]++;
        if((site[x][y]>1)&&(site[x][y]<2)) {      // site occupied by the invader
          numberOccupied++;
          occupied[bin]++;
        }
      }
    }
    data.setMessage("Number occupied = "+numberOccupied);
    for(int bin = 0; bin<numberOfBins; bin++) {
      data.append(0, (bin+0.5)*binSize, occupied[bin]/number[bin]);
    }
  }
}

Figure 13.10: The evolution of the probability distribution function Wt(i) for three successive time steps.

Problem 13.8. Random walks on percolation clusters

(a) Compute R²(t), the mean square displacement of a random walker (the ant), averaged over walks on spanning clusters for p > pc on a square lattice. Assume that R²(t) → 4Ds(p)t for p > pc and sufficiently long times. We have denoted the diffusion coefficient by Ds because we are considering random walks only on spanning clusters and are not considering walks on the finite clusters that also exist for p > pc. Generate a cluster at p = 0.7 using the single cluster growth algorithm considered in Problem 13.3. Choose the initial position of the ant to be the seed site and modify your program to observe the motion of the ant on the screen. Use L ≥ 101 and average over at least 100 walkers for t up to 500. Where does the ant spend much of its time? If R²(t) ∝ t, what is Ds(p)/D(p = 1)?
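A minimal sketch of the measurement in part (a) follows. It is not one of the text's classes; it assumes the cluster has already been generated and stored in a boolean array occupied[][] (for example, by the single cluster growth algorithm of Problem 13.3) with the seed at (x0, y0), and all names are illustrative.

import java.util.Random;

public class BlindAntSketch {
  // returns R^2 averaged over walkers after tmax steps on a given cluster
  public static double meanSquareDisplacement(boolean[][] occupied, int x0, int y0,
                                              int tmax, int walkers, Random rng) {
    int L = occupied.length;
    int[] dx = {1, -1, 0, 0}, dy = {0, 0, 1, -1};
    double r2sum = 0;
    for(int w = 0; w<walkers; w++) {
      int x = x0, y = y0;
      for(int t = 0; t<tmax; t++) {
        int i = rng.nextInt(4);              // blind ant: pick one of 4 directions
        int nx = x+dx[i], ny = y+dy[i];
        if(nx>=0 && nx<L && ny>=0 && ny<L && occupied[nx][ny]) {
          x = nx;                            // step only if the new site is occupied
          y = ny;
        }                                    // otherwise the ant stays put
      }
      r2sum += (double) (x-x0)*(x-x0)+(double) (y-y0)*(y-y0);
    }
    return r2sum/walkers;
  }
}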
(b) As in part (a), compute R²(t) for p = 1.0, 0.8, 0.7, 0.65, and 0.62 with L = 101. If time permits, average over several clusters. Make a log-log plot of R²(t) versus t. What is the qualitative t-dependence of R²(t) for relatively short times? Is R²(t) proportional to t for longer times? (Remember that the maximum value of R² is bounded by the finite size of the lattice.) If R²(t) ∝ t, estimate Ds(p). Plot Ds(p)/D(p = 1) as a function of p and discuss its qualitative dependence.

(c) Compute R²(t) for p = 0.4 and confirm that for p < pc, the clusters are finite, R²(t) is bounded, and diffusion is impossible.

(d) Because there is no diffusion for p < pc, we might expect that Ds vanishes as p → pc from above; that is, Ds(p) ∼ (p − pc)^µs for p ≳ pc. Extend your calculations of part (b) to larger L, more walkers (at least 1000), and more values of p near pc, and estimate the dynamical exponent µs.

(e) At p = pc, we might expect R²(t) to exhibit a different type of t-dependence, for example, R²(t) ∼ t^(2/z) for large t. Do you expect the exponent z to be greater or less than two? Do a simulation of R²(t) at p = pc and estimate z. Choose L ≥ 201 and average over several spanning clusters.

(f) The algorithm we have been using corresponds to a "blind" ant because the ant chooses from four outcomes even if some of these outcomes are not possible. In contrast, the "myopic" ant can look ahead and see the number q of nearest neighbor occupied sites. The ant then chooses one of the q possible outcomes and thus always takes a step. Redo the simulations in part (b). Does R²(t) reach its asymptotic linear dependence on t earlier or later compared to the blind ant?

(g)∗ The limitation of the approach we have taken so far is that we have to average R²(t) over different random walks on a given cluster and also average over different clusters. A more efficient way of treating random walks on a random lattice is to use an exact enumeration approach and to consider all possible walks on a given cluster. The idea of the exact enumeration method is that Wt+1(i), the probability that the ant is at site i at time t + 1, is determined solely by the probabilities of the ant being at the neighbors of site i at time t. Store the positions of the occupied sites in an array and introduce two arrays corresponding to Wt+1(i) and Wt(i) for all sites i in the cluster. Use the probabilities Wt(i) to obtain Wt+1(i) (see Figure 13.10). Spatial averages such as the mean square displacement can be calculated from the probability distribution function at different times. The details of the method and the results are discussed in Majid et al., who used walks of 5000 steps on clusters with ∼ 10³ sites and averaged their results over 1000 different clusters.

(h)∗ Another reason for the interest in diffusion in disordered media is that the diffusion coefficient is proportional to the electrical conductivity of the medium. One of Einstein's many contributions was to show that the mobility, the ratio of the mean velocity of the particles in a system to an applied force, is proportional to the self-diffusion coefficient in the absence of the applied force (see Reif). For a system of charged particles, the mean velocity of the particles is proportional to the electrical current, and the applied force is proportional to the voltage.
Hence, the mobility and the electrical conductivity are proportional, and the conductivity is proportional to the self-diffusion coefficient. The electrical conductivity σ vanishes near the percolation threshold as σ ∼ (p − pc)^µ with µ ≈ 1.30 (see Section 12.1). The difficulty of doing a direct Monte Carlo calculation of σ was considered in Project 12.18. We measured the self-diffusion coefficient Ds by always placing the ant on a spanning cluster rather than on any cluster. In contrast, the conductivity is measured for the entire system, including all finite clusters. Hence, the self-diffusion coefficient D that enters into the Einstein relation should be determined by placing the ant at random anywhere on the lattice, including sites that belong to the spanning cluster and sites that belong to the many finite clusters. Because only those ants that start on the spanning cluster can contribute to D, D is related to Ds by D = P∞Ds, where P∞ is the probability that the ant would land on a spanning cluster. Because P∞ scales as P∞ ∼ (p − pc)^β, we have that (p − pc)^µ ∼ (p − pc)^β (p − pc)^µs, or µ = µs + β. Use your result for µs found in part (d) and the exact result β = 5/36 (see Table 12.1) to estimate µ, and compare your result to the critical exponent µ for the dc electrical conductivity.

(i)∗ We can also derive the scaling relation z = 2 + µs/ν = 2 + (µ − β)/ν, where z is defined in part (e). Is it easier to determine µs or z accurately from a Monte Carlo simulation on a finite lattice? That is, if your real interest is estimating the best value of the critical exponent µ for the conductivity, should you determine the conductivity directly, or should you measure the self-diffusion coefficient at p = pc or at p > pc? What is your best estimate of the conductivity exponent µ?

Diffusion limited aggregation. Many objects in nature grow by the random addition of subunits. Examples include snowflakes, lightning, crack formation along a geological fault, and the growth of bacterial colonies. Although it might seem unlikely that such phenomena have much in common, the behavior observed in many models gives us clues that these and many other natural phenomena can be understood in terms of a few unifying principles. A popular model that is a good example of how random motion can give rise to beautiful self-similar clusters is known as diffusion limited aggregation or DLA. The first step is to occupy a site with a seed particle. Next, a particle is released at random from a point on the circumference of a large circle whose center coincides with the seed. The particle undergoes a random walk until it reaches a perimeter site of the seed and sticks. Then another random walker is released from the circumference of a large circle and walks until it reaches a perimeter site of one of the two particles in the cluster and sticks. This process is repeated many times (typically on the order of several thousand to several million) until a large cluster is formed. A typical DLA cluster is shown in Figure 13.11. Some of the properties of DLA clusters are explored in Problem 13.9.

Figure 13.11: A DLA cluster of 4284 particles on a square lattice with L = 300.

The following class provides a reasonably efficient simulation of DLA. Walkers begin just outside a circle of radius startRadius enclosing the existing cluster and centered at the seed site. If the walker moves away from the cluster, the step size for the random walker increases.
If the walker wanders too far away (farther than maxRadius), the walk is restarted.

Listing 13.5: Class for simulating diffusion limited aggregation.

package org.opensourcephysics.sip.ch13;
import java.awt.Color;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.LatticeFrame;

public class DLAApp extends AbstractSimulation {
  LatticeFrame latticeFrame = new LatticeFrame("DLA");
  byte s[][];                   // lattice on which cluster lives
  int xOccupied[], yOccupied[]; // location of occupied sites
  int L;                        // linear dimension of lattice
  int halfL;                    // L/2
  int ringSize;                 // ring size in which walkers can move
  int numberOfParticles;        // number of particles in cluster
  int startRadius;              // radius of cluster at which walkers are started
  int maxRadius;                // maximum radius walker can go before a new walk is started

  public void initialize() {
    latticeFrame.setMessage(null);
    numberOfParticles = 1;
    L = control.getInt("lattice size");
    startRadius = 3;
    halfL = L/2;
    ringSize = L/10;
    maxRadius = startRadius+ringSize;
    s = new byte[L][L];
    s[halfL][halfL] = Byte.MAX_VALUE; // seed particle at the center
    latticeFrame.setAll(s);
  }

  public void reset() {
    latticeFrame.setIndexedColor(0, Color.BLACK);
    control.setValue("lattice size", 300);
    setStepsPerDisplay(100);
    enableStepsPerDisplay(true);
    initialize();
  }

  public void stopRunning() {
    control.println("Number of particles = "+numberOfParticles);
    // add code to compute the mass distribution here
  }

  public void doStep() {
    int x = 0, y = 0;
    if(startRadius<halfL) {
      do { // launch walkers from the starting circle until one sticks
        double theta = 2*Math.PI*Math.random();
        x = halfL+(int) (startRadius*Math.cos(theta));
        y = halfL+(int) (startRadius*Math.sin(theta));
      } while(walk(x, y)); // walk returns true if the walk must be restarted
    }
    if(startRadius>=halfL) { // stop the simulation
      control.calculationDone("Done");
      latticeFrame.setMessage("Done");
    }
    latticeFrame.setMessage("n = "+numberOfParticles);
  }

  public boolean walk(int x, int y) {
    do {
      double rSquared = (x-halfL)*(x-halfL)+(y-halfL)*(y-halfL);
      int r = 1+(int) Math.sqrt(rSquared);
      if(r>maxRadius) {
        return true; // start new walker
      }
      if((r<maxRadius)&&(s[x+1][y]+s[x-1][y]+s[x][y+1]+s[x][y-1]>0)) {
        // walker has reached a perimeter site of the cluster and sticks
        numberOfParticles++;
        s[x][y] = 1;
        latticeFrame.setValue(x, y, Byte.MAX_VALUE);
        if(r>=startRadius) {
          startRadius = r+2; // enlarge starting circle as the cluster grows
        }
        maxRadius = startRadius+ringSize;
        return false; // walk is finished
      } else { // take a step
        // select direction randomly
        switch((int) (4*Math.random())) {
        case 0 :
          x++;
          break;
        case 1 :
          x--;
          break;
        case 2 :
          y++;
          break;
        case 3 :
          y--;
        }
      }
    } while(true);
  }

  public static void main(String[] args) {
    SimulationControl.createApp(new DLAApp());
  }
}

Problem 13.9. Diffusion limited aggregation

(a) DLAApp generates diffusion limited aggregation clusters on a square lattice. Each walker begins at a random site on a launching circle of radius r = Rmax + 2, where Rmax is the maximum distance of any particle in the cluster from the origin. To save computer time, we remove a walker that reaches a distance 2Rmax from the seed site and place a new walker at random on the circle of radius r. If the clusters appear to be fractals, make a visual estimate of the fractal dimension. Choose a lattice of linear dimension L ≥ 61. (Experts can make a visual estimate of D to within a few percent.)
Modify DLAApp by color coding the sites in the cluster according to their time of arrival; for example, color the first group of sites white, the next group blue, the next group red, and the last group green. (Your choice of the size of each group depends in part on the total size of your cluster.) Which parts of the cluster grow faster? Do any of the late arriving green particles reach the center?

(b) At t = 0, the four perimeter (growth) sites on the square lattice each have a probability pi = 1/4 of becoming part of the cluster. At t = 1, the cluster has mass two and six perimeter sites. Identify the perimeter sites and convince yourself that their growth probabilities are not the same. Do a Monte Carlo simulation and verify that two perimeter sites have growth probabilities p = 2/9 and the other four have p = 5/36. We discuss a more direct way of determining the growth probabilities in Problem 13.10.

(c) DLAApp generates clusters inefficiently because most of the CPU time is spent while the random walker is wandering far from the perimeter sites of the cluster. There are several ways of making your program more efficient. One way is to let the walker take bigger steps the farther it is from the cluster. For example, if the walker is a distance R > Rmax, a step of length greater than or equal to R − Rmax − 1 may be permitted if this distance is greater than one lattice unit. If the walker is very close to the cluster, the step length is one lattice unit. Make this modification to class DLAApp and estimate the fractal dimension of diffusion limited clusters generated on a square lattice by computing M(r), the number of sites in the cluster within a radius r centered at the seed site (a sketch of such a computation appears after this problem). Because very large clusters are needed to accurately estimate the fractal dimension, you will obtain only approximate results. Other possible modifications to the implementation of the algorithm are discussed in Project 13.17 and by Meakin (see references).

(d)∗ Each time we grow a DLA cluster (and other clusters in which a perimeter site is selected at random), we obtain a slightly different cluster if we use a different random number sequence. One way of reducing this "noise" is to use "noise reduction"; that is, a perimeter site is not occupied until it has been visited m times. Each time the random walker lands on a perimeter site, the number of visits for this site is increased by one until the number of visits equals m and the site is occupied. The idea is that noise reduction accelerates the approach to the asymptotic scaling behavior. Consider m = 2, 3, 4, and 5 and grow DLA clusters on the square lattice. Are there any qualitative differences between the clusters for different values of m?

(e)∗ In Chapter 12 we found that the exponents describing the percolation transition are independent of the symmetry of the lattice; for example, the exponents for the square and triangular lattices are the same. We might expect that the fractal dimension of DLA clusters would also show such universal behavior. However, the presence of a lattice introduces a small anisotropy that becomes apparent only when very large clusters with the order of 10^6 sites are grown. Modify your program so that DLA clusters are generated on a triangular lattice. Do the clusters have the same visual appearance as on the square lattice? Estimate the fractal dimension and compare your estimate to your result for the square lattice. The best estimates of D for the square and triangular lattices are D ≈ 1.5 and D ≈ 1.71, respectively. We are reminded of the difficulty of extrapolating the asymptotic behavior from finite clusters. We consider the growth of diffusion limited aggregation clusters in the continuum in Project 13.16.
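The mass distribution M(r) needed in part (c), and flagged by the comment in stopRunning, can be tabulated directly from the lattice. The following is a minimal sketch, not the text's code; it assumes the cluster is stored as in DLAApp, in a byte array s[x][y] with nonzero entries for occupied sites and the seed at (xc, yc).

public class MassDistributionSketch {
  // M(r): number of occupied sites within distance r of the seed at (xc, yc)
  public static int[] massDistribution(byte[][] s, int xc, int yc, int rmax) {
    int[] mass = new int[rmax+1];
    for(int x = 0; x<s.length; x++) {
      for(int y = 0; y<s[x].length; y++) {
        if(s[x][y]!=0) {
          int r = (int) Math.sqrt((double) (x-xc)*(x-xc)+(double) (y-yc)*(y-yc));
          if(r<=rmax) {
            mass[r]++;          // count sites in the shell between r and r+1
          }
        }
      }
    }
    for(int r = 1; r<=rmax; r++) {
      mass[r] += mass[r-1];     // cumulative count gives M(r)
    }
    return mass;                // slope of log M(r) versus log r estimates D
  }
}

A call to this method could be added to stopRunning, with the resulting M(r) values plotted on a log-log scale to extract D from the slope.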
∗Laplacian growth model. As we discussed in Section 10.6, we can formulate the solution of Laplace's equation in terms of a random walk. We now do the converse and formulate the DLA algorithm in terms of a solution to Laplace's equation. Consider the probability P(r) that a random walker reaches a site r starting from the external boundary. This probability satisfies the relation

P(r) = (1/4) Σ_a P(r + a),  (13.10)

where the sum in (13.10) is over the four nearest neighbor sites a (on a square lattice). If we set P = 1 on the boundary and P = 0 on the cluster, then (13.10) also applies to sites that are neighbors of the external boundary and the cluster. A comparison of the form of (13.10) with the form of (10.12) shows that the former is a discrete version of Laplace's equation ∇²P = 0. Hence, P(r) has the same behavior as the electrical potential between two electrodes connected to the outer boundary and the cluster, and the growth probability at a perimeter site of the cluster is proportional to the value of the potential at that site.

∗Problem 13.10. Laplacian growth models

(a) Solve the discrete Laplace equation (13.10) by hand for the growth probabilities of a DLA cluster of mass 1, 2, and 3. Set P = 1 on the boundary and P = 0 on the cluster. Compare your results to your results in Problem 13.9b for mass 1 and 2.

(b) You are probably familiar with the random nature of electrical discharge patterns that occur in atmospheric lightning. Although this phenomenon, known as dielectric breakdown, is complicated, we will see that a simple model leads to discharge patterns that are similar to those that are observed in nature. Because lightning occurs in an inhomogeneous medium with differences in the density, humidity, and conductivity of air, we will develop a model of electrical discharge in an inhomogeneous insulator. We know that when an electrical discharge occurs, the electrical potential φ satisfies Laplace's equation ∇²φ = 0. One version of the model (see Family et al.) is specified by the following steps:

(i) Consider a large boundary circle of radius R and place a charge source at the origin. Choose the potential φ = 0 at the origin (an occupied site) and φ = 1 for sites on the circumference of the circle. The radius R should be larger than the radius of the growing pattern.

(ii) Use the relaxation method (see Section 10.5) to compute the values of the potential φi for (empty) sites within the circle.

(iii) Assign a random number r to each empty site within the boundary circle. The random number ri at site i represents a breakdown coefficient and the random inhomogeneous nature of the insulator.

(iv) The growth sites are the nearest neighbor sites of the discharge pattern (the occupied sites). Form the product ri φi^a for each growth site i, where a is an adjustable parameter. Because the potential for the discharge pattern is zero, φi for growth site i can be interpreted as the magnitude of the potential gradient at site i.

(v) The perimeter site with the maximum value of the product r φ^a breaks down; that is, set φ for this site equal to zero.

(vi) Use the relaxation method to recompute the values of the potential at the remaining unoccupied sites and repeat steps (iv) and (v). (A minimal sketch of this growth loop follows the list.)
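The following sketch, under stated simplifying assumptions, illustrates steps (ii)-(v). It is not the text's program: it replaces the boundary circle of step (i) with the fixed edge of a square grid, uses a fixed number of Jacobi-style relaxation sweeps rather than iterating to convergence, and all parameter names are illustrative.

import java.util.Random;

public class BreakdownSketch {
  public static void main(String[] args) {
    int L = 41, sitesToGrow = 100, sweeps = 100;
    double a = 0.25;                                // tunable exponent
    double[][] phi = new double[L][L];              // electrical potential
    double[][] r = new double[L][L];                // quenched breakdown coefficients
    boolean[][] cluster = new boolean[L][L];
    Random rng = new Random();
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        r[i][j] = rng.nextDouble();                 // step (iii)
        phi[i][j] = 1.0;                            // phi = 1 on the outer boundary
      }
    }
    cluster[L/2][L/2] = true;                       // step (i): charge source at center
    phi[L/2][L/2] = 0;
    int[] dx = {1, -1, 0, 0}, dy = {0, 0, 1, -1};
    for(int n = 0; n<sitesToGrow; n++) {
      for(int sweep = 0; sweep<sweeps; sweep++) {   // step (ii): relaxation method
        for(int i = 1; i<L-1; i++) {
          for(int j = 1; j<L-1; j++) {
            if(!cluster[i][j]) {
              phi[i][j] = 0.25*(phi[i+1][j]+phi[i-1][j]+phi[i][j+1]+phi[i][j-1]);
            }
          }
        }
      }
      double best = -1;                             // steps (iv)-(v): maximize r*phi^a
      int bx = 0, by = 0;
      for(int i = 1; i<L-1; i++) {
        for(int j = 1; j<L-1; j++) {
          if(!cluster[i][j]) {
            boolean growthSite = false;
            for(int k = 0; k<4; k++) {
              growthSite = growthSite || cluster[i+dx[k]][j+dy[k]];
            }
            if(growthSite && r[i][j]*Math.pow(phi[i][j], a)>best) {
              best = r[i][j]*Math.pow(phi[i][j], a);
              bx = i;
              by = j;
            }
          }
        }
      }
      cluster[bx][by] = true;                       // breakdown: site joins the pattern
      phi[bx][by] = 0;
    }
    System.out.println("pattern contains "+(sitesToGrow+1)+" sites");
  }
}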
Choose a = 1/4 and analyze the structure of the discharge pattern. Does the pattern appear qualitatively similar to lightning? Does the pattern appear to have a fractal geometry? Estimate the fractal dimension by counting M(b), the average number of sites belonging to the discharge pattern that are within a b × b box. Consider other values of a, for example, a = 1/6 and a = 1/3, and show that the patterns have a fractal structure with a tunable fractal dimension that depends on the parameter a. Published results (Family et al.) are for patterns with 800 occupied sites.

(c) Another version of the dielectric breakdown model associates a growth probability pi = φi^a/Σj φj^a with each growth site i, where the sum is over all the growth sites. One of the growth sites is occupied with probability pi. That is, choose a growth site at random and generate a random number r between 0 and 1. If r ≤ pi, the growth site i is occupied. As before, the exponent a is a free parameter. Convince yourself that a = 1 corresponds to diffusion limited aggregation. (The boundary condition used in the latter corresponds to a zero potential at the growth sites.) To what type of cluster does a = 0 correspond? Consider a = 1/2, 1, and 2 and explore the dependence of the visual appearance of the clusters on a. Estimate the fractal dimension of the clusters.

(d) Consider a deterministic growth model for which all growth sites are tested for occupancy at each growth step. Adopt the same geometry and boundary conditions as in part (b) and use the relaxation method to solve Laplace's equation for φi. Then find the perimeter site with the largest value of φ and set φmax equal to this value. Only those perimeter sites for which the ratio φi/φmax is larger than a parameter p become part of the cluster; φi is set equal to unity for these sites. After each growth step, the new growth sites are determined and the relaxation method is used to recompute the values of φi at each unoccupied site. Choose p = 0.35 and determine the nature of the regular fractal pattern. What is the fractal dimension? Consider other values of p and determine the corresponding fractal dimension. These patterns have been termed Laplace fractal carpets (see Family et al.).

Surface growth models. The fractal objects we have discussed so far are self-similar; that is, if we look at a small piece of the object and magnify it isotropically to the size of the original, the original and the magnified object look similar (on the average). In the following, we introduce some simple models that generate a class of fractals that are self-similar only for scale changes in certain directions.

Suppose that we have a flat surface at time t = 0. How does the surface grow as a result of vapor deposition and sedimentation? For example, consider a surface that is initially a line of L occupied sites. Growth is in the vertical direction only (see Figure 13.12). As before, we simply choose a growth site at random and occupy it (the Eden model again). The average height of the surface is given by

h̄ = (1/Ns) Σ(i=1 to Ns) hi,  (13.11)

where hi is the distance of the ith surface site from the substrate, and the sum is over all Ns surface sites. (The precise definition of a surface site is discussed in Problem 13.11.) Each time a particle is deposited, the time t is increased by unity. Our main interest is how the width of the surface changes with t. We define the width of the surface by

w² = (1/Ns) Σ(i=1 to Ns) (hi − h̄)².  (13.12)

In general, the width w, which is a measure of the surface roughness, depends on L and t.
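A minimal sketch of the computation of (13.11) and (13.12) is given below. It assumes the surface is described by an array of column heights h[i]; for growth models with overhangs, such as the Eden model, h[i] would be taken as the height of the surface site in column i.

public class SurfaceWidthSketch {
  // mean height (13.11) and width (13.12) of a surface with column heights h[i]
  public static double[] heightAndWidth(int[] h) {
    double hbar = 0;
    for(int hi : h) {
      hbar += hi;
    }
    hbar /= h.length;                            // (13.11)
    double w2 = 0;
    for(int hi : h) {
      w2 += (hi-hbar)*(hi-hbar);
    }
    w2 /= h.length;                              // (13.12)
    return new double[] {hbar, Math.sqrt(w2)};   // {mean height, width w}
  }
}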
For short times we expect that

w(L,t) ∼ t^β.  (13.13)

The exponent β describes the growth of the correlations with time along the vertical direction. Figure 13.12 illustrates the evolution of the surface generated according to the Eden model. After a characteristic time, the length over which the fluctuations are correlated becomes comparable to L, and the width reaches a steady-state value that depends only on L. We write

w(L, t ≫ 1) ∼ L^α,  (13.14)

where α is known as the roughness exponent.

Figure 13.12: Example of surface growth according to the Eden model. The surface site in column i is the perimeter site with the maximum value of hi. In the figure the average height of the surface is 20.46 and the width is 2.33.

From (13.14) we see that in the steady state, the width of the surface in the direction perpendicular to the substrate grows as L^α. This scaling behavior of the width is characteristic of a self-affine fractal. Such a fractal is invariant (on the average) under anisotropic scale changes; that is, different scaling relations exist along different directions. For example, if we rescale the surface by a factor b in the horizontal direction, then the surface must be rescaled by a factor of b^α in the direction perpendicular to the surface to preserve the similarity between the original and rescaled surfaces. Note that on short length scales, that is, lengths shorter than the width of the interface, the surface is rough and its roughness can be characterized by the exponent α. (Imagine an ant walking on the surface.) For length scales much larger than the width of the surface, the surface appears to be flat and, in our example, it is a one-dimensional object. The properties of the surfaces generated by several growth models are explored in Problem 13.11.

Problem 13.11. Growing surfaces

(a) In this version of the Eden model a perimeter site is chosen at random and occupied. The growth rule is the same as the usual Eden model, but the growth is started from a line of length L rather than a single site. Hence, there can be "overhangs" as shown in Figure 13.12. Use periodic boundary conditions in the horizontal direction to determine the perimeter sites. The height hi corresponds to the height of column i. Consider L = 64. Describe the visual appearance of the surface as the surface grows. Is the surface well defined visually? Where are most of the perimeter sites?

(b) To estimate the exponents α and β, plot the width w(t) as a function of t for L = 32, 64, and 128 on the same graph. What type of plot is most appropriate? Does the width initially grow as a power law? If so, estimate the exponent β. Is there an L-dependent crossover time after which the width of the surface approaches its steady-state value? How can you estimate the exponent α? The best numerical estimates for β and α are consistent with the exact values β = 1/3 and α = 1/2.

(c)∗ The dependence of w(L,t) on t and L can be combined into the scaling form

w(L,t) ≈ L^α f(t/L^(α/β)),  (13.15)

where

f(x) ≈ A x^β for x ≪ 1 and f(x) ≈ constant for x ≫ 1,  (13.16)
where A is a constant. Verify the existence of the scaling form (13.15) by plotting the ratio w(L,t)/L^α versus t/L^(α/β) for the different values of L considered in part (b). If the scaling form holds, the results for w for the different values of L should fall on a universal curve. Use either the estimated values of α and β that you found in part (b) or the exact values.

(d) The Eden model is not really a surface growth model, because any perimeter site can become part of the cluster. In the simplest random deposition model, a column is chosen at random and a particle is deposited at the top of the column of already deposited particles. There is no horizontal correlation between neighboring columns. Do a simulation of this growth model and visually inspect the surface of the interface. Show that the heights of the columns follow a Poisson distribution [see (7.31)] and that h̄ ∼ t and w ∼ t^(1/2). This structure does not depend on L, and hence α = 0.

(e) In the ballistic deposition model, a column is chosen at random and a particle is assumed to fall vertically until it reaches the first perimeter site that is a nearest neighbor of a site that already is part of the surface. This condition allows for growth parallel to the substrate. Only one particle falls at a time. How do the rules for this growth model differ from those of the Eden model? How does the surface compare to that of the Eden model? Suppose that instead of the particle falling vertically, we let it do a random walk as in DLA. Would the resultant surface be the same?

Figure 13.13: Example of the growth of a surface according to the ballistic deposition model. Note that if column one is chosen, the next site that would be occupied (not shaded) would leave an unoccupied site below it.

13.4 Fractals and Chaos

In Chapter 6 we explored dynamical systems that exhibited chaos under certain conditions. We found that after an initial transient, the trajectory of such a dynamical system consists of a set of points in phase space called an attractor. For chaotic motion this attractor is often an object that can be described as a fractal. Such attractors are called strange attractors.

We first consider the familiar logistic map [see (6.1)], xn+1 = 4rxn(1 − xn). For most values of the control parameter r > r∞ = 0.892486417967..., the trajectories are chaotic. Are these trajectories fractals? To calculate the fractal dimension for dynamical systems, we use the box counting method introduced in Section 13.2 in which space is divided into d-dimensional boxes of length ℓ. Let N(ℓ) equal the number of boxes that contain a piece of the trajectory. The fractal dimension is defined by the relation

N(ℓ) ∼ lim(ℓ→0) ℓ^−D   (box dimension).  (13.17)

Equation (13.17) holds only when the number of boxes is much larger than N(ℓ) and the number of points on the trajectory is sufficiently large. If the trajectory moves through many dimensions, that is, the phase space is very large, box counting becomes too memory intensive because we need an array of size ∝ ℓ^−d. This array becomes very large for small ℓ and large d.

A more efficient approach is to compute the correlation dimension. In this approach we store in an array the positions of N points on the trajectory. We compute the number of points Ni(r), and the fraction of points fi(r) = Ni(r)/(N − 1), within a distance r of the point i. The correlation function C(r) is defined by

C(r) ≡ (1/N) Σi fi(r),  (13.18)

and the correlation dimension Dc is defined by

C(r) ∼ lim(r→0) r^Dc   (correlation dimension).  (13.19)

From (13.19) we see that the slope of a log-log plot of C(r) versus r yields an estimate of the correlation dimension. In practice, small values of r must be discarded because we cannot sample all of the points on the trajectory, and hence there is a cutoff value of r below which C(r) = 0. In the large r limit, C(r) saturates to unity if the trajectory is localized, as it is for chaotic trajectories. We expect that for intermediate values of r, there is a scaling regime where (13.19) holds.
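A minimal sketch of this procedure for scalar data follows; it is not the text's program. It generates a chaotic logistic-map trajectory, computes C(r) of (13.18) by counting pairs of points, and prints ln C versus ln r so that the slope in the scaling regime estimates Dc.

public class CorrelationDimensionSketch {
  // C(r) for one-dimensional data: the fraction of pairs of points closer than r
  public static double correlation(double[] x, double r) {
    long count = 0;
    for(int i = 0; i<x.length; i++) {
      for(int j = i+1; j<x.length; j++) {
        if(Math.abs(x[i]-x[j])<r) {
          count++;
        }
      }
    }
    return 2.0*count/((double) x.length*(x.length-1));
  }

  public static void main(String[] args) {
    int N = 5000;
    double[] x = new double[N];
    double r = 0.9, xn = 0.4;                     // logistic map in the chaotic regime
    for(int i = 0; i<1000; i++) {
      xn = 4*r*xn*(1-xn);                         // discard the initial transient
    }
    for(int i = 0; i<N; i++) {
      xn = 4*r*xn*(1-xn);
      x[i] = xn;
    }
    for(double eps = 0.1; eps>1.0e-4; eps /= 2) { // slope of ln C vs ln eps gives Dc
      double c = correlation(x, eps);
      if(c>0) {
        System.out.println(Math.log(eps)+"\t"+Math.log(c));
      }
    }
  }
}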
In Problems 13.12-13.14, we consider the fractal properties of some of the dynamical systems that we considered in Chapter 6.

Problem 13.12. Strange attractor of the logistic map

(a) Write a program that uses box counting to determine the fractal dimension of the attractor for the logistic map. Compute N(ℓ), the number of boxes of length ℓ that have been visited by the trajectory. Test your program for r < r∞. How does the number of boxes containing a piece of the trajectory change with ℓ? What does this dependence tell you about the dimension of the trajectory for r < r∞?

(b) Compute N(ℓ) for r = 0.9 using at least five different values of ℓ, for example, 1/ℓ = 100, 300, 1000, 3000, . . . . Iterate the map at least 10,000 times before determining N(ℓ). What is the fractal dimension of the attractor? Repeat for r ≈ r∞, r = 0.95, and r = 1.

(c) Generate points at random in the unit interval and estimate the fractal dimension using the same method as in part (b). What do you expect to find? Use your results to estimate the accuracy of the fractal dimension that you found in part (b).

(d) Write a program to compute the correlation dimension for the logistic map and repeat the calculations for parts (b) and (c).

Problem 13.13. Strange attractor of the Hénon map

(a) Use two-dimensional boxes of linear dimension ℓ to estimate the fractal dimension of the strange attractor of the Hénon map [see (6.32)] with a = 1.4 and b = 0.3. Iterate the map at least 1000 times before computing N(ℓ). Does it matter what initial condition you choose?

(b) Compute the correlation dimension for the same parameters used in part (a) and compare Dc with the box dimension computed in part (a).

(c) Iterate the Hénon map and view the trajectory on the screen by plotting xn+1 versus xn in one window and yn versus xn in another window. Do the two ways of viewing the trajectory look similar? Estimate the correlation dimension, where the ith data point is defined by (xi, xi+1) and the distance rij between the ith and jth data points is given by rij² = (xi − xj)² + (xi+1 − xj+1)².

(d) Estimate the correlation dimension with the ith data point defined by xi and rij² = (xi − xj)². What do you expect to obtain for Dc? Repeat the calculation for the ith data point given by (xi, xi+1, xi+2) and rij² = (xi − xj)² + (xi+1 − xj+1)² + (xi+2 − xj+2)². What do you find for Dc?

∗Problem 13.14. Strange attractor of the Lorenz model

(a) Use three-dimensional graphics or three two-dimensional plots of x(t) versus y(t), x(t) versus z(t), and y(t) versus z(t) to view the structure of the Lorenz attractor. Use σ = 10, b = 8/3, r = 28, and the time step ∆t = 0.01. Compute the correlation dimension for the Lorenz attractor.

(b) Repeat the calculation of the correlation dimension using x(t), x(t + τ), and x(t + 2τ) instead of x(t), y(t), and z(t). Choose the delay time τ to be at least ten times greater than the time step ∆t.

(c) Compute the correlation dimension in the two-dimensional space of x(t) and x(t + τ).
Do the same calculation in four dimensions using x(t), x(t + τ), x(t + 2τ), and x(t + 3τ). What can you conclude about the results for the correlation dimension using two-, three-, and four-dimensional spaces? What do you expect to see for d > 4?

Problems 13.13 and 13.14 illustrate a practical method for determining the underlying structure of systems when, for example, the data consist only of a single time series, that is, measurements of a single quantity over time. The dimension Dc(d) computed by increasing the dimension of the space d using the delayed coordinate τ eventually saturates when d is approximately equal to the number of variables that actually determine the dynamics. Hence, if we have extensive data for a single variable, for example, the atmospheric pressure or a stock market index, we can use this method to determine the number of independent variables that determine the dynamics of the variable. This information can then be used to help create models of the dynamics.

13.5 Many Dimensions

So far we have discussed three ways of defining the fractal dimension: the mass dimension (13.1), the box dimension (13.17), and the correlation dimension (13.19). These methods do not always give the same results for the fractal dimension. Indeed, there are many other dimensions that we could compute. For example, instead of just counting the boxes that contain a part of an object, we can count the number of points of the object in each box, ni, and compute pi = ni/N, where N is the total number of points. A generalized dimension Dq can be defined as

Dq = [1/(q − 1)] lim(ℓ→0) ln[Σ(i=1 to N(ℓ)) pi^q]/ln ℓ.  (13.20)

The sum in (13.20) is over all the boxes and involves the probabilities raised to the qth power. For q = 0, we have

D0 = −lim(ℓ→0) ln N(ℓ)/ln ℓ.  (13.21)

If we compare the form of (13.21) with (13.17), we can identify D0 with the box dimension. For q = 1, we need to take the limit of (13.20) as q → 1. Let

u(q) = ln Σi pi^q,  (13.22)

and do a Taylor-series expansion of u(q) about q = 1. We have

u(q) = u(1) + (q − 1) du/dq + ··· .  (13.23)

The quantity u(1) = 0 because Σi pi = 1. The first derivative of u(q) is given by

du/dq = Σi pi^q ln pi / Σi pi^q = Σi pi ln pi,  (13.24)

where the last equality follows by setting q = 1. If we use the above relations, we find that D1 is given by

D1 = lim(ℓ→0) Σi pi ln pi / ln ℓ   (information dimension).  (13.25)

D1 is called the information dimension because of the similarity of the p ln p term in the numerator of (13.24) to the information form of the entropy. It is possible to show that D2 as defined by (13.20) is the same as the mass dimension defined in (13.1) and the correlation dimension Dc. That is, box counting gives D0 and correlation functions give D2 (cf. Sander et al.).

There are many objects in nature that differ in appearance but have similar fractal dimension. An example is the different visual appearance in three dimensions of diffusion limited aggregation clusters and percolation clusters at the percolation threshold. (Both objects have a fractal dimension of approximately 2.5.) In some cases this difference can be accounted for by the multifractal properties of an object. For multifractals the various Dq are different, in contrast to monofractals for which the different measures are the same. Percolation clusters are an example of a monofractal because pi ∼ ℓ^D0, the number of boxes N(ℓ) ∼ ℓ^−D0, and from (13.20) Dq = D0 for all q. Multifractals occur when the growth quantities are not the same throughout the object, as frequently happens for the strange attractors produced by chaotic dynamics. Diffusion limited aggregation is an example of a multifractal.
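The sums that enter (13.20)-(13.25) are straightforward to compute once the box probabilities pi are known. The following minimal sketch, not the text's code, assumes the data lie in the unit interval (for example, iterates of the logistic map) and returns the sum needed for Dq, or the Σ pi ln pi sum of (13.24)-(13.25) when q = 1.

public class GeneralizedDimensionSketch {
  // box probabilities p_i for boxes of size ell and the sums in (13.20)-(13.25)
  public static double generalizedSum(double[] x, double ell, double q) {
    int nBoxes = (int) Math.ceil(1.0/ell);
    int[] n = new int[nBoxes];
    for(double xi : x) {
      int box = (int) (xi/ell);
      n[Math.min(box, nBoxes-1)]++;            // count points in each box
    }
    double sum = 0;
    for(int ni : n) {
      if(ni>0) {                               // empty boxes do not contribute
        double p = ((double) ni)/x.length;
        sum += (q==1) ? p*Math.log(p) : Math.pow(p, q);
      }
    }
    return sum; // Dq ≈ ln(sum)/((q-1) ln ell) for q != 1, and D1 ≈ sum/ln ell
  }
}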
13.6 Projects

Although the kinetic growth models we have considered yield beautiful pictures, there is much we do not understand. For example, the fractal dimension of DLA clusters can be calculated only by approximate theories whose accuracy is unknown. Why do the fractal dimensions have the values that we estimated by various simulations? Can we trust our numerical estimates of the various exponents, or is it necessary to consider much larger systems to obtain their true asymptotic values? Can we find unifying features for the many kinetic growth models that presently exist? What is the relation of the various kinetic growth models to physical systems? What are the essential quantities needed to characterize the geometry of an object?

One of the reasons that kinetic growth models are difficult to understand is that the final cluster typically depends on the history of the growth. We say that these models are examples of "nonequilibrium behavior." The combination of simplicity, beauty, complexity, and relevance to many experimental systems suggests that the study of fractal objects will continue to involve a wide range of workers in many disciplines.

Project 13.15. The percolation cluster size distribution

Use the Leath algorithm to determine the critical exponent τ of the cluster size distribution ns for percolation clusters at p = pc:

ns ∼ s^−τ.  (s ≫ 1)  (13.26)

Modify class SingleCluster so that many clusters are generated and ns is computed for a given probability p. Remember that the number of clusters of size s that are grown from a seed is the product s ns, rather than ns itself (see Problem 13.3a). Grow at least 100 clusters on a square lattice with L ≥ 61. If time permits, use bigger lattices and average over more clusters, and also estimate the accuracy of your estimate of τ. See Grassberger for a discussion of an extension of this approach to estimating the value of pc in higher dimensions.

Project 13.16. Continuum DLA

(a) In the continuum (off-lattice) version of diffusion limited aggregation, the diffusing particles are assumed to be disks of radius a. A disk executes a random walk until its center is within a distance 2a of the center of a disk that is already a part of the DLA cluster. At each step the walker changes its position by (r cos θ, r sin θ), where r is the step size and θ is a random variable between 0 and 2π. Modify your DLA program or class DLAApp to simulate continuum DLA.

(b) Compare the appearance of a continuum DLA cluster with a DLA cluster generated on a square lattice. It is necessary to grow very large clusters (approximately 10^6 particles) to see the differences.

(c) Use the mass dimension to estimate the fractal dimension of continuum DLA clusters and compare its value with the value you found for the square lattice.

Project 13.17. More efficient simulation of DLA

To improve the efficiency of the algorithm, the walker in class DLAApp is restarted if it wanders too far from the existing cluster. When the walker is within the distance startRadius of the seed, no optimization is used. Because there can be many unoccupied sites within this distance, it is desirable to use an additional optimization technique (see Ball and Brady). The idea is to
choose a simple geometrical object (a circle or square) centered at the walker such that none of the cluster is within the object. The walker moves in one step to a site on the boundary of the object. For a circle the walker can move with equal probability to any location on the circumference. For the square we need the probability of moving to various locations on the boundary. To find the largest object that does not contain a part of the DLA cluster, consider coarse grained lattices. For example, each 2 × 2 group of sites on the original lattice corresponds to one site on the coarser lattice; each 2 × 2 group of sites on the coarse lattice corresponds to a site on an even coarser lattice, etc. If a site is occupied, then any coarse grained site containing this site is also occupied.

(a) Because we have considered DLA clusters on a square lattice, we use squares centered at the walker. We first find the probability p(∆x,∆y,s) that a walker centered on a square of length l = 2s + 1 will be displaced by (∆x,∆y). This probability can be computed by simulating a random walk starting at the origin and ending at a boundary site of the square. Repeat this simulation for many walkers and then for various values of s. The fraction of walkers that reach the position (∆x,∆y) is p(∆x,∆y,s). Determine p(∆x,∆y,s) for s = 1 to 16. Store your results in a file.

(b) We next determine the arrays such that for a given value of s and a uniform random number r, we can quickly find (∆x,∆y). One way to do so is to create four arrays. The first array lists the cumulative probabilities determined from part (a) such that the values for s = 1 are listed first. Call this array p. For example, p[1] = p(−1,−1,1), p[2] = p[1] + p(−1,0,1), p[3] = p[2] + p(−1,1,1), etc. The array start tells us where to start in the array p for each value of s. The arrays dx[i] and dy[i] give the values of ∆x and ∆y corresponding to p[i]. To see how these arrays are used, consider a walker located at (x,y) centered on a square of linear dimension 2s + 1. Generate a random number r and find i = start[s]. If r < p[i], then the walker moves to (x + dx[i], y + dy[i]). If not, increment i by unity and check again. Repeat until r ≤ p[i]. Write a program to create these four arrays and store them in a file.

(c) Write a method to determine the maximum value of the parameter s such that a square of size 2s + 1 centered at the position of the walker does not contain any part of the DLA cluster. Use coarse grained lattices to do this determination more efficiently. Modify class DLAApp to incorporate this method and the arrays defined in part (b). How much faster is your modified program than the original class DLAApp for clusters of 500 and 5000 particles?

(d) What is the largest cluster you can grow on your computer in a reasonable time? Does the cluster show any evidence for anisotropy? For example, does the cluster tend to extend farther along the axes or along any other direction?

Project 13.18. Cluster-cluster aggregation

In DLA all the particles that stick to a cluster are the same size (the growth occurs by the addition of one particle at a time), and the cluster that is formed is motionless. In the following, we consider a cluster-cluster aggregation (CCA) model in which the clusters do a random walk as they aggregate. Suppose we begin with a dilute collection of N particles. Each of these particles is initially a cluster of unit mass and does a random walk until two particles become nearest neighbors.
They then stick together to form a cluster of two particles. This new cluster now moves as a single random walker with a smaller diffusion coefficient. As this process continues, the clusters become larger and fewer in number. For simplicity, we assume a square lattice with periodic boundary conditions. The CCA algorithm can be summarized as follows:

(i) Place N particles at random positions on the lattice. Do not allow a site to be occupied by more than one particle. Identify the ith particle with the ith cluster.

(ii) Check if any two clusters have particles that are nearest neighbors. If so, join these two clusters to form a single cluster.

(iii) Choose a cluster at random. Decide whether to move the cluster as discussed in the following paragraph. If so, move it at random in one of the four possible directions.

(iv) Repeat steps (ii) and (iii) for the desired number of steps or until there is only a single cluster.

What rule should we use to decide whether to move a cluster? One possibility is to select a cluster at random and simply move it. This possibility corresponds to all clusters having the same diffusion coefficient, regardless of their mass. A more realistic rule is to assume that the diffusion coefficient of a cluster is inversely related to its mass s, for example, Ds ∝ s^−x with x > 0. A common assumption is x = 1. If we assume that Ds is inversely proportional to the linear dimension (radius) of the cluster, an assumption consistent with the Stokes-Einstein relation, then x = 1/d, where d is the spatial dimension. However, because the resultant clusters are fractals, we really should take x = 1/D, where D is the fractal dimension of the cluster.

To implement the cluster-cluster aggregation algorithm, we need to store the position of each particle and the cluster to which each particle belongs. In class CCA, which can be downloaded from the ch13 directory, the position of a particle is given by its x- and y-coordinates, stored in the arrays x and y, respectively. The array element site[x][y] equals zero if there is no particle at (x,y); otherwise, the element equals the label of the cluster to which the particle at (x,y) belongs. The labels of the clusters are found as follows. The array element firstParticle[k] gives the particle label of the first particle in cluster k. To determine all the particles in a given cluster, we use a data structure called a linked list. We implement the linked list using the array nextParticle, so that the value of an element of this array is the index for the next element in the linked list. The array nextParticle contains a series of linked lists, one for each cluster, such that nextParticle[i] equals the particle label of another particle in the same cluster as particle i. If nextParticle[i] = -1, there are no more particles in the cluster. To see how these arrays work, consider three particles 5, 9, and 16, which constitute cluster 4. We have firstParticle[4] = 5, nextParticle[5] = 9, nextParticle[9] = 16, and nextParticle[16] = -1.

As the clusters undergo a random walk, we need to check if any pair of particles in different clusters have become nearest neighbors. If such a situation occurs, their respective clusters have to be merged. The check for nearest neighbors is done in method checkNeighbors. If site[x][y] and site[x+1][y] are both nonzero and are not equal, then the two clusters associated with these sites need to be combined.
To do so, we add the particles of the smaller cluster to those of the larger cluster. We use another array, lastParticle, to keep track of the last particle in a cluster. The merger can be accomplished by the following statements:

// link last particle of larger cluster to first particle of smaller cluster
nextParticle[lastParticle[largerClusterLabel]] = firstParticle[smallerClusterLabel];
// set the last particle of larger cluster to the last particle of smaller cluster
lastParticle[largerClusterLabel] = lastParticle[smallerClusterLabel];
// add mass of smaller cluster to the larger cluster
mass[largerClusterLabel] += mass[smallerClusterLabel];

To complete the merger, all the entries in site[x][y] corresponding to the smaller cluster are relabeled with the label for the larger cluster, and the last cluster in the list is relabeled by the label of the small cluster, so that if there are n clusters they are labeled by 0, 1, ..., n − 1.

(a) Write a target class for class CCA. The class assumes that the diffusion coefficient is independent of the cluster mass. Choose L = 50 and N = 500 and describe the qualitative appearance of the clusters as they form. Do they appear to be fractals? Compare their appearance to DLA clusters.

(b) Compute the fractal dimension of the final cluster. Use the center of mass rcm as the origin of the cluster, where rcm = (1/N)(Σi xi, Σi yi) and (xi, yi) is the position of the ith particle. Average your results over at least ten final clusters. Do the same for other values of L and N. Are the clusters formed by cluster-cluster aggregation more or less space filling than DLA clusters?

(c) Assume that the diffusion coefficient of a cluster of s particles varies as Ds ∝ s^(−1/2) in two dimensions. Let Dmax be the diffusion coefficient of the largest cluster. Choose a random number r between 0 and 1 and move the cluster if r < Ds/Dmax. Repeat the simulations in part (a) and discuss any changes in your results. What effect does the dependence of D on s have on the motion of the clusters?

References and Suggestions for Further Reading

We have considered only a few of the models that lead to self-similar patterns. Use your imagination to design your own model of real-world growth processes. We encourage you to read the research literature and the many books on fractals.

R. C. Ball and R. M. Brady, "Large scale lattice effect in diffusion-limited aggregation," J. Phys. A 18, L809-L813 (1985). The authors discuss the optimization algorithm used in Project 13.17.

Albert-László Barabási and H. Eugene Stanley, Fractal Concepts in Surface Growth, Cambridge University Press (1995).

J. B. Bassingthwaighte, L. S. Liebovitch, and B. J. West, Fractal Physiology, Oxford University Press (1994).

D. Ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems, Cambridge University Press (2005).

K. S. Birdi, Fractals in Chemistry, Geochemistry, and Biophysics, Plenum Press (1993).

Armin Bunde and Shlomo Havlin, editors, Fractals and Disordered Systems, revised edition, Springer-Verlag (1996).

Fereydoon Family and David P. Landau, editors, Kinetics of Aggregation and Gelation, North-Holland (1984). A collection of research papers that give a wealth of information, pictures, and references on a variety of growth models.
Fereydoon Family, Daniel E. Platt, and Tamás Vicsek, “Deterministic growth model of pattern formation in dendritic solidification,” J. Phys. A 20, L1177–L1183 (1987). The authors discuss the nature of Laplace fractal carpets.

Fereydoon Family and Tamás Vicsek, editors, Dynamics of Fractal Surfaces, World Scientific (1991). A collection of reprints.

Fereydoon Family, Y. C. Zhang, and Tamás Vicsek, “Invasion percolation in an external field: Dielectric breakdown in random media,” J. Phys. A 19, L733–L737 (1986).

Jens Feder, Fractals, Plenum Press (1988). This text discusses the applications as well as the mathematics of fractals.

Gary William Flake, The Computational Beauty of Nature, MIT Press (2000).

J.-M. Garcia-Ruiz, E. Louis, P. Meakin, and L. M. Sander, editors, Growth Patterns in Physical Sciences and Biology, NATO ASI Series B304, Plenum (1993).

Peter Grassberger, “Critical percolation in high dimensions,” Phys. Rev. E 67, 036101 (2003). The author uses the Leath algorithm to estimate the value of pc.

Thomas C. Halsey, “Diffusion limited aggregation: A model for pattern formation,” Physics Today 53 (11), 36 (2000).

J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods, Methuen (1964). The chapter on percolation processes discusses a growth algorithm for percolation.

H. J. Herrmann, “Geometrical cluster growth models and kinetic gelation,” Physics Reports 136, 153–224 (1986).

Robert C. Hilborn, Chaos and Nonlinear Dynamics, second edition, Oxford University Press (2000).

Ofer Malcai, Daniel A. Lidar, Ofer Biham, and David Avnir, “Scaling range and cutoffs in empirical fractals,” Phys. Rev. E 56, 2817–2828 (1997). The authors show that experimental reports of fractal behavior are typically based on a scaling range that spans only 0.5–2 decades and discuss the possible implications of this limited scaling range.

Benoit B. Mandelbrot, The Fractal Geometry of Nature, W. H. Freeman (1983). An influential and beautifully illustrated book on fractals.

Imtiaz Majid, Daniel Ben-Avraham, Shlomo Havlin, and H. Eugene Stanley, “Exact-enumeration approach to random walks on percolation clusters in two dimensions,” Phys. Rev. B 30, 1626 (1984).

Paul Meakin, Fractals, Scaling and Growth Far From Equilibrium, Cambridge University Press (1998). Also see P. Meakin, “The growth of rough surfaces and interfaces,” Physics Reports 235, 189–289 (1993). The author has written many seminal articles on DLA and similar models.

L. Niemeyer, L. Pietronero, and H. J. Wiesmann, “Fractal dimension of dielectric breakdown,” Phys. Rev. Lett. 52, 1033 (1984).

H. O. Peitgen and P. H. Richter, The Beauty of Fractals, Springer-Verlag (1986).

Luciano Pietronero and Erio Tosatti, editors, Fractals in Physics, North-Holland (1986). A collection of research papers, many of which are accessible to the motivated reader.

Raissa M. D’Souza, “Anomalies in simulations of nearest neighbor ballistic deposition,” Int. J. Mod. Phys. C 8 (4), 941–951 (1997). The author finds that ballistic deposition is a sensitive physical test for correlations present in pseudorandom sequences.

F. Reif, Fundamentals of Statistical and Thermal Physics, McGraw-Hill (1965). Einstein’s relation between the diffusion and mobility is discussed in Chapter 15.

John C. Russ, Fractal Surfaces, Plenum Press (1994).

Evelyn Sander, Leonard M. Sander, and Robert M. Ziff, “Fractals and Fractal Correlations,” Computers in Physics 8, 420–425 (1994).
An introduction to fractal growth models and the calculation of their properties.

H. Eugene Stanley and Nicole Ostrowsky, editors, On Growth and Form, Martinus Nijhoff Publishers, Netherlands (1986). A collection of research papers at the same level as the 1984 Family and Landau collection.

Hideki Takayasu, Fractals in the Physical Sciences, John Wiley & Sons (1990).

David D. Thornburg, Discovering Logo, Addison-Wesley (1983). The book is more accurately described by its subtitle, An Invitation to the Art and Pattern of Nature. The nature of recursive procedures and fractals is discussed using many simple examples.

Donald L. Turcotte, Fractals and Chaos in Geology and Geophysics, Cambridge University Press (1992).

Tamás Vicsek, Fractal Growth Phenomena, second edition, World Scientific Publishing (1992). This book contains an accessible introduction to diffusion-limited and cluster-cluster aggregation.

Bruce J. West, Fractal Physiology and Chaos in Medicine, World Scientific Publishing (1990).

David Wilkinson and Jorge F. Willemsen, “Invasion percolation: A new form of percolation theory,” J. Phys. A 16, 3365–3376 (1983).

Yu-Xia Zhang, Jian-Ping Sang, Xian-Wu Zou, and Zhun-Zhi Jin, “Random walk on percolation under an external field,” Physica A 350, 163–172 (2005). The authors consider random walks with a drift.

Chapter 14
Complex Systems

We introduce cellular automata, neural networks, genetic algorithms, and growing networks to explore the concepts of self-organization and complexity. Applications to sandpiles, fluids, earthquakes, and other areas are discussed.

14.1 Cellular Automata

Part of the fascination of physics is that it allows us to reduce natural phenomena to a few simple laws. It is also fascinating to think about how a few simple laws can produce the enormously rich behavior that we see in nature. In this chapter we discuss several models that illustrate some of the new ideas that are emerging from the study of complex systems.

The first class of models we discuss are known as cellular automata. Cellular automata were introduced by von Neumann and Ulam in 1948 and are mathematical idealizations of dynamical systems in which space and time are discrete and the quantities of interest have a finite set of discrete values that are updated according to a local rule. A cellular automaton can be thought of as a lattice of sites or a checkerboard with colored squares (the cells). Each cell changes its state at the tick of an external clock according to a rule based on the present configuration of the cells in its neighborhood. Cellular automata are examples of discrete dynamical systems that can be simulated exactly on a digital computer. Because the original motivation for studying cellular automata was their biological aspects, the discrete locations in space are frequently referred to as cells. More recently, cellular automata have been applied to a wide variety of physical systems ranging from fluids to galaxies. We will usually refer to sites rather than cells, except when we are explicitly discussing biological systems. The important characteristics of cellular automata include the following:

1. Space is discrete and consists of a regular array of sites. Each site has a finite set of values.

2. The rule for the new value of a site depends only on the values of a local neighborhood of sites near it.

3. Time is discrete. The variables at each site are updated simultaneously based on the values of the variables at the previous time step.
Hence, the state of the entire lattice advances in discrete time steps.

Figure 14.1: Example of a local rule for the evolution of a one-dimensional cellular automaton. The variable at each site can have the value 0 or 1. The 2^3 = 8 possible configurations of a site and its two neighbors at time t, together with the value of the central site at time t + 1, are

  neighborhood at t:       111  110  101  100  011  010  001  000
  central site at t + 1:    0    1    0    1    1    0    1    0

For example, if the value of a site is 0, its left neighbor is 1, and its right neighbor is 0, the central site will have the value 1 at the next time step. This rule is termed 01011010 in binary notation (see the second row), the modulo-two rule, or rule 90. Note that 90 is the base ten (decimal) equivalent of the binary number 01011010; that is, 90 = 2^1 + 2^3 + 2^4 + 2^6.

We first consider one-dimensional cellular automata and assume that the neighborhood of a given site is the site itself and the sites immediately to its left and right. Each site is assumed to have two states (a Boolean automaton). An example of such a rule is illustrated in Figure 14.1, where we see that a rule can be labeled by the binary representation of the update rule for each of the eight possible neighborhoods and by the base ten equivalent of the binary representation. Because any eight-digit binary number specifies a one-dimensional cellular automaton, there are 2^8 = 256 possible rules.

Class OneDimensionalAutomatonApp takes the decimal representation of the rule as input and produces the rule array update, which is used to update each lattice site using periodic boundary conditions. The OneDimensionalAutomatonApp class manipulates numbers using their binary representation. Note the use of the bit manipulation operators >>> (unsigned right shift) and & (AND) in method setRule. To understand how the right shift operator >>> works, consider the expression 13 >>> 1. The result is to shift the bits of the binary representation of the integer 13 to the right by one. Because the binary representation of 13 is 1101, the result of the shift is 0110. (The left-hand bits are filled with 0s as needed.) To understand the nature of the & operator, consider the expression 0110 & 1, which we can write as 0110 & 0001. In this case the result is 0000 because the & operator sets each of the resulting bits to 1 if the corresponding bit in both operands is 1; otherwise, the bit is zero.

We use the LatticeFrame class to represent the sites and their evolution. At a given time, the sites are drawn in the horizontal direction; time increases in the vertical direction. In method iterate, the % operator is used to determine the left and right neighbors of a site using periodic boundary conditions. Also note the use of the left shift operator << in method iterate. A more complete discussion of bit manipulation is given in Section 14.6.
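The following standalone fragment, a minimal demonstration of our own rather than part of the Open Source Physics library, reproduces the 13 >>> 1 example and decodes rule 90 into its eight binary digits:

  public class BitDemo {
    public static void main(String[] args) {
      System.out.println(13 >>> 1); // 1101 shifted right is 0110, so this prints 6
      System.out.println(6 & 1);    // 0110 & 0001 = 0000, so this prints 0
      int ruleNumber = 90;          // binary 01011010
      for(int i = 7; i >= 0; i--) {
        // bit i of ruleNumber gives the new central-site value for neighborhood i
        System.out.print(((ruleNumber >>> i) & 1)+" "); // prints 0 1 0 1 1 0 1 0
      }
      System.out.println();
    }
  }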
Listing 14.1: One-dimensional cellular automaton class.

package org.opensourcephysics.sip.ch14.ca;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class OneDimensionalAutomatonApp extends AbstractCalculation {
  LatticeFrame automaton = new LatticeFrame("");
  // update[] maps neighborhood configurations to 0 or 1
  int[] update = new int[8];

  public void calculate() {
    control.clearMessages();
    int L = control.getInt("Linear dimension");
    int tmax = control.getInt("Maximum time");
    // default is lattice sites all zero
    automaton.resizeLattice(L, tmax);
    // seed lattice by putting 1 in middle of first row
    automaton.setValue(L/2, 0, 1);
    // choose color of empty and occupied sites
    automaton.setIndexedColor(0, java.awt.Color.YELLOW); // empty
    automaton.setIndexedColor(1, java.awt.Color.BLUE);   // occupied
    setRule(control.getInt("Rule number"));
    for(int t = 1; t<tmax; t++) {
      iterate(t, L);
    }
  }

  public void iterate(int t, int L) {
    for(int i = 0; i<L; i++) {
      // encode the three-site neighborhood as a number between 0 and 7,
      // using the % operator for periodic boundary conditions
      int left = automaton.getValue((i-1+L)%L, t-1);
      int center = automaton.getValue(i, t-1);
      int right = automaton.getValue((i+1)%L, t-1);
      int neighborhood = (left<<2)+(center<<1)+right;
      automaton.setValue(i, t, update[neighborhood]);
    }
  }

  public void setRule(int ruleNumber) {
    for(int i = 7; i>=0; i--) {
      // (ruleNumber >>> i) shifts the contents of ruleNumber to the right by i
      // bits. In particular, the ith bit of ruleNumber resides in the rightmost
      // position of this expression. After "and"ing with the number 1, we are
      // left with either the number 0 or 1, depending on whether the ith
      // bit of ruleNumber was cleared or set.
      update[i] = ((ruleNumber>>>i)&1);
      control.print(" "+update[i]+" ");
    }
    control.println();
  }

  public void reset() {
    control.setValue("Rule number", 90);
    control.setValue("Maximum time", 100);
    control.setValue("Linear dimension", 500);
  }

  public static void main(String args[]) {
    CalculationControl.createApp(new OneDimensionalAutomatonApp());
  }
}

The properties of all 256 one-dimensional cellular automata have been cataloged (see Wolfram, 1984). We explore some of the properties of one-dimensional cellular automata in Problems 14.1 and 14.3.

Problem 14.1. One-dimensional cellular automata

(a) What is the result of 13 & 12 and 33 >>> 1 (decimal representation), and of 1101 & 0111 (binary representation)? Consider rule 90 and work out by hand the values of update[] according to method setRule.

(b) Use OneDimensionalAutomatonApp and consider rule 90 shown in Figure 14.1. This rule is also known as the modulo-two rule because the value of a site at step t + 1 is the sum modulo 2 of its two neighbors at step t. Choose the initial configuration to be a single nonzero site (the seed) at the midpoint of the lattice. It is sufficient to consider the evolution for approximately twenty iterations. Is the resulting pattern of nonzero sites self-similar? If so, characterize the pattern by a fractal dimension.

(c) Determine the properties of a rule for which the value of a site at step t + 1 is the sum modulo 2 of the values of its neighbors plus its own value at step t. This rule is equivalent to 10010110 or rule 150 = 2^1 + 2^2 + 2^4 + 2^7. Start with a single seed site.

(d) Choose a random initial configuration for which the independent probability for each site to have the value 1 is p = 1/2; otherwise, the value of the site is 0. Determine the evolution of rule 90, rule 150, rule 18 = 2^1 + 2^4 (00010010), rule 73 = 2^0 + 2^3 + 2^6 (01001001), and rule 136 (10001000). How sensitive are the patterns that are formed to the initial conditions? Does the nature of the patterns depend on the use or nonuse of periodic boundary conditions?
Listing 14.2: A more efficient implementation of method iterate in OneDimensionalAutomatonApp.

public void iterate(int t, int L) {
  // encodes state(L-1) and state(0) in the second and first bits
  // of the neighborhood variable
  int neighborhood = (automaton.getValue(L-1, t-1)<<1)+automaton.getValue(0, t-1);
  for(int i = 0; i<L; i++) {
    // clear third bit of neighborhood, but keep second and first bits
    neighborhood = neighborhood&3;
    // shift second and first bits of neighborhood to third and second bits
    neighborhood = neighborhood<<1;
    // encode state(i+1) into first bit of neighborhood using
    // periodic boundary conditions
    neighborhood += automaton.getValue((i+1)%L, t-1);
    // neighborhood now encodes the three bits of state surrounding
    // index i at time t-1. With neighborhood as an index, the
    // update[] table gives us the state at index i and time t.
    automaton.setValue(i, t, update[neighborhood]);
  }
}

Method iterate in class OneDimensionalAutomatonApp is not as efficient as possible because it does not use information about the neighborhood at site i to determine the neighborhood at site i + 1. A more efficient implementation is given in Listing 14.2. To understand how this version of method iterate works, suppose that the lattice at t = 0 is 1011, and we want to determine the neighborhood of the site at i = 0. The answer is 6 in decimal, corresponding to 110 in binary. Because of periodic boundary conditions, the index to the left of i = 0 is L − 1. The expression (automaton.getValue(L-1, t-1)<<1) yields 001 << 1 = 010 because << shifts all bits to the left. (Only 3 bits are needed to describe the neighborhood.) The statement

  int neighborhood = (automaton.getValue(L-1, t-1)<<1)+automaton.getValue(0, t-1);

yields 010 + 001 = 011. The effect of the statement

  neighborhood = neighborhood & 3;

is to clear the third bit of the neighborhood but to keep the second and first bits: 011 & 011 = 011. In this case nothing is changed. We then shift the second and first bits of the neighborhood to the third and second bits:

  neighborhood = neighborhood << 1;

and obtain neighborhood = 110. Finally, the statement

  neighborhood += automaton.getValue((i+1)%L, t-1);

gives neighborhood = 110 + 000 = 110, which is 6 in decimal, in agreement with the answer found above.

∗Problem 14.2. Whose time is more important?

(a) Work out another example to make sure that you understand the nature of the bit manipulations that are used in Listing 14.1 and in the more efficient version of method iterate.

(b) Which version of method iterate would you use: the more efficient but more difficult to understand (and debug) version, or the less efficient but easier to understand version? Which is more important, computer time or programmer time? In general, the answer depends on the context.

The dynamical behavior of many of the 256 one-dimensional Boolean cellular automata is uninteresting, and hence we also consider one-dimensional Boolean cellular automata with larger neighborhoods (including the site itself). Because a larger neighborhood implies that there are many more possible update rules, we place some reasonable restrictions on the rules. First, we assume that the rules are symmetric; for example, the neighborhood 100 produces the same value for the central site as 001. We also require that the zero neighborhood 000 yields 0 for the central site, and that the value of the central site depends only on the sum of the values of the sites in the neighborhood; for example, 011 produces the same value for the central site as 101 (see Wolfram, 1984). A simple way of coding the rules that is consistent with these requirements is as follows.
Each rule is labeled by a sequence of 0s and 1s such that the sequence indicates which values of the sum set the central site equal to 1. If the lowest order digit is 1, then the central site is set to 1 if the sum is 0. If the next digit is 1, then the central site is set to 1 if the sum is 1, etc. For example, the rule 10110 indicates that the central site will be set to 1 if the number of neighbors equal to 1 is 1, 2, or 4.
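This encoding is straightforward to decode programmatically. The following fragment is a minimal sketch (a helper of our own, not a method of OneDimensionalAutomatonApp) that converts such a rule code into a lookup table indexed by the sum of the neighborhood values:

  // Decode a sum-based rule code into a lookup table. For a neighborhood
  // of 2z+1 sites, the possible sums are 0 through 2z+1. For example, the
  // binary code 10110 (decimal 22) gives update[] = {0,1,1,0,1,0}, so the
  // central site becomes 1 when the sum is 1, 2, or 4.
  int[] decodeSumRule(int ruleCode, int z) {
    int[] update = new int[2*z+2];
    for(int sum = 0; sum<update.length; sum++) {
      update[sum] = (ruleCode>>>sum)&1; // bit 'sum' of ruleCode
    }
    return update;
  }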
If it is possible to implement this procedure in general, we would be better able to develop theories of complex macroscopic systems without needing to know the details of the dynamics of the microscopic constituents that make up these systems. We explore two examples in Problem 14.4. CHAPTER 14. COMPLEX SYSTEMS 528 ∗Problem 14.4. Coarse graining one-dimensional cellular automata (a) Add methods to OneDimensionalAutomatonApp that create a coarse grained lattice such that groups of three cells are coarse grained to 1 if all three cells are 1 and coarse grained to 0 otherwise. Allow the coarse grained lattice to evolve separately using a different update rule than the original lattice. The coarse grained lattice should be updated after every three updates of the original lattice. Draw the coarse grained lattice as a space-time diagram similar to what we have done for the original lattice, such that each cell in the coarse grained lattice is three times the size of a cell on the original lattice in both the space and time directions. Use rule 146 (10010010) for the original lattice and rule 128 (10000000) for the coarse grained lattice. Choose a lattice size L that is a multiple of 3 and run for a time that is a multiple of 3. You should see similar patterns in the two lattices, although the original lattice contains some details that are washed out by the coarse grained lattice. If you coarse grain the original lattice cells at each time step, you will obtain the same pattern as the coarse grained lattice. (b) Modify your program such that each pair of cells is coarse grained to 1 if two original cells are both 0 or both 1 and coarse grained to 0 otherwise. Use rule 105 (01101001) on the original cells with L = 120 for 60 iterations and run the coarse grained system using rule 150 (10100110). You should obtain results similar to those found in part (a). Traffic models. Physicists have been at the forefront of the development of a more systematic approach to the characterization and control of traffic. Much of this work was initiated at General Motors by Robert Herman in the late 1950s. The car-following theory of traffic flow that he and Elliott Montroll and others developed during this time is still used today. What has changed is the way we can implement these theories. The continuum approach used by Herman and Montroll is based on partial differential equations. An alternative that is more flexible and easier to understand is based on cellular automata. We first consider a simple one lane highway where cars enter at one end and exit at the other end. To implement the Nagel–Schreckenberg cellular automaton model, we use integer arrays for the position xi and velocity vi, where i indexes a car and not a lattice site. The important input parameters of the simulation are the maximum velocity vmax, the density of cars ρ, and the probability p of a car slowing down. This probability adds some randomization to the drivers. The algorithm implemented in class Freeway for the motion of each car at each iteration is as follows: 1. If vi < vmax, increase the velocity vi of car i by one unit; that is, vi → vi + 1. This change models the process of acceleration to the maximum velocity. 2. Compute the distance to the next car d. If vi ≥ d, then reduce the velocity to vi = d − 1 to prevent crashes. 3. With probability p, reduce the velocity of a moving car by one unit. Thus, vi → vi − 1. 4. Update the position xi of car i so that xi(t + 1) = xi(t) + vi. 
This ordering of the steps ensures that cars do not overlap.

Listing 14.3: One-lane freeway class.

package org.opensourcephysics.sip.ch14.traffic;
import java.awt.Graphics;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display2d.*;
import org.opensourcephysics.controls.*;

public class Freeway implements Drawable {
  public int[] v, x, xtemp;
  public LatticeFrame spaceTime;
  public double[] distribution;
  public int roadLength;
  public int numberOfCars;
  public int maximumVelocity;
  public double p; // probability of reducing velocity
  private CellLattice road;
  public double flow;
  public int steps, t;
  // number of iterations before scrolling space-time diagram
  public int scrollTime = 100;

  public void initialize(LatticeFrame spaceTime) {
    this.spaceTime = spaceTime;
    x = new int[numberOfCars];
    xtemp = new int[numberOfCars]; // used to allow parallel updating
    v = new int[numberOfCars];
    spaceTime.resizeLattice(roadLength, 100);
    road = new CellLattice(roadLength, 1);
    road.setIndexedColor(0, java.awt.Color.RED);
    road.setIndexedColor(1, java.awt.Color.GREEN);
    spaceTime.setIndexedColor(0, java.awt.Color.RED);
    spaceTime.setIndexedColor(1, java.awt.Color.GREEN);
    int d = roadLength/numberOfCars;
    x[0] = 0;
    v[0] = maximumVelocity;
    for(int i = 1; i<numberOfCars; i++) {
      x[i] = x[i-1]+d; // cars are spaced evenly along the road
      v[i] = maximumVelocity;
    }
  }

  // ... (the per-car update contains the statements)
  if(v[i]>=d) {
    v[i] = d-1; // slow down due to cars in front
  }
  if((v[i]>0)&&(Math.random()<p)) {
    v[i]--; // reduce the velocity of a moving car with probability p
  }
  // ... (remainder of the class omitted)

      }
    }
    latticeFrame.setAll(newCells);
  }

  public static void main(String[] args) {
    OSPControl control = SimulationControl.createApp(new LifeApp());
    control.addButton("clear", "Clear"); // optional custom action
  }
}

Problem 14.6. The Game of Life

(a) LifeApp allows the user to determine the initial configuration interactively by clicking on a cell to change its value before hitting the Start button. Choose several initial configurations with a small number of live cells and determine the different types of patterns that emerge. Some suggested initial configurations are shown in Figure 14.2b. Does it matter whether you use fixed or periodic boundary conditions? Use a 16 × 16 lattice.

(b) Modify LifeApp so that each cell is initially alive with a 50% probability. Use a 32 × 32 lattice. What types of patterns typically result after a long time? What happens for 20% live cells? What happens for 70% live cells?

(c) Assume that each cell is initially alive with probability p. Given that the density of live cells at time t is ρ(t), what is ρ(t + 1), the expected density at time t + 1? Do the simulation and plot ρ(t + 1) versus ρ(t). If p = 0.5, what is the steady-state density of live cells?

(d)∗ LifeApp has not been optimized for the Game of Life and is written so that other rules can be implemented easily. Rewrite LifeApp so that it uses bit manipulation (see Section 14.6).

The Game of Life is an example of a universal computing machine. That is, we can choose an initial configuration of live cells to represent any possible program and any set of input data, run the Game of Life, and the output data will appear in some region of the lattice. The proof of this result (see Berlekamp et al.) involves showing how various configurations of cells represent the components of a computer, including wires, storage, and the fundamental components of a CPU: the digital logic gates that perform and, or, and other logical and arithmetic operations. Other cellular automata can also be shown to be universal computing machines.

14.2 Self-Organized Critical Phenomena

Very large events such as a magnitude eight earthquake, an avalanche on a snow-covered mountain, the sudden collapse of an empire (for example, the Soviet Union), or a crash of the stock market are rare. When such events occur, are they due to some special set of circumstances, or are they part of a more general pattern of events that would occur without any specific external intervention? The idea of self-organized criticality is that in many cases the occurrence of very large events does not depend on special conditions or external forces but is due to the intrinsic dynamics of the system. If s represents the magnitude of an event, such as the energy released in an earthquake or the amount of snow in an avalanche, then a system is said to be critical if the number of events, N(s), follows a power law:

N(s) ∼ s^−α   (no characteristic scale). (14.1)

If α ≈ 1, the form (14.1) implies that there would be one large event of size 1000 for every 1000 events of size one. One implication of the power law form (14.1) is that there is no characteristic scale, and the system is said to be scale invariant. This terminology reflects the fact that power laws look the same on all scales. For example, the replacement s → bs in the function N(s) = As^−α yields a function that is indistinguishable from N(s), except for a change in the amplitude A by the factor b^−α.
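Written out, the rescaling argument takes one line:

\[ N(bs) = A\,(bs)^{-\alpha} = b^{-\alpha}\,A s^{-\alpha} = b^{-\alpha}\,N(s), \]

so rescaling s changes only the overall amplitude, not the shape of the distribution.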
Contrast the nature of the power law dependence of N(s) in (14.1) with the result of combining a large number of independently acting random events. In this case we know that the distribution of the sum is a Gaussian (see Problem 7.15), and N(s) has the form

N(s) ∼ e^−(s/s0)^2   (characteristic scale). (14.2)

Scale invariance does not hold for functions that decay as in (14.2), because the replacement s → bs in the function e^−(s/s0)^2 changes s0 (the characteristic scale or size of s) by the factor b. Note that for a power law distribution there are events of all sizes, but for a Gaussian distribution there are, practically speaking, no events much larger than the characteristic scale s0. For example, if we take s0 = 100, there would be one large event of size 1000 for every 2.7 × 10^43 events of size one!

A simple example of self-organized critical phenomena is an idealized sandpile. Suppose that we construct a sandpile by randomly adding one grain at a time onto a flat surface with open edges. Initially, the grains remain where they land, but after we add more grains, there will be small avalanches during which the grains move so that the local slope of the pile does not become too big. Eventually, the pile will reach a statistically stationary (time-independent) state, and the amount of sand added will balance the sand that falls off the edge (on the average). When a single grain of sand is added to such a configuration, a rearrangement might occur that triggers an avalanche of any size (up to the size of the system), so that the mean slope again equals the critical value. We say that the statistically stationary state is critical because there are avalanches of all sizes. The stationary state is self-organized because no external parameter (such as the temperature) needs to be tuned to force the system to this state. In contrast, the concentration of fissionable material in a nuclear chain reaction has to be carefully controlled for the chain reaction to become critical.

We consider a two-dimensional model of a sandpile and represent the height at site i by the array element height[i]. One grain of sand is added to a random site j, height[j]++, at each iteration. If height[j] = 4, then we remove the four grains from site j and distribute them equally to its nearest neighbors. A site whose height equals four is said to topple. If any of the neighbors now has four grains of sand, it topples as well. This process continues until all sites have fewer than four grains of sand. Grains that fall outside the lattice are lost forever. Class Sandpile implements this idealized model. The lattice is stored in a LatticeFrame, and the arrays toppleSiteX and toppleSiteY store the coordinates of the sites with four grains of sand. The array distribution accumulates the data for the number of sites that topple at each addition of a grain of sand to the pile. It is possible, though rare, that a site will topple more than once in one step. Hence, the number of toppled sites may be greater than the number of sites in the lattice. Physically, it is not the actual height that determines toppling but the mean local slope between a site and its nearest neighbors.
Thus, what we call the “height” really should be called the “slope.” However, in the literature many authors use the term “height,” and we follow that usage.

Listing 14.6: Implementation of the two-dimensional sandpile model.

package org.opensourcephysics.sip.ch14.sandpile;
import java.awt.Graphics;
import org.opensourcephysics.frames.*;

public class Sandpile {
  int[] distribution; // distribution of number of sites toppling
  int[] toppleSiteX, toppleSiteY;
  LatticeFrame height;
  int L, numberToppledMax;
  int numberToppled, numberOfSitesToTopple, numberOfGrains;

  public void initialize(LatticeFrame height) {
    this.height = height;
    height.resizeLattice(L, L); // create new lattice
    // size of distribution array
    numberToppledMax = 2*L*L+1;
    // could use HistogramFrame instead
    distribution = new int[numberToppledMax];
    toppleSiteX = new int[L*L];
    toppleSiteY = new int[L*L];
    numberOfGrains = 0;
    resetAverages();
  }

  public void step() {
    numberOfGrains++;
    numberToppled = 0;
    int x = (int) (Math.random()*L);
    int y = (int) (Math.random()*L);
    int h = height.getValue(x, y)+1;
    height.setValue(x, y, h); // add grain to random site
    height.render();
    if(h==4) { // topple grain
      numberOfSitesToTopple = 1;
      boolean unstable = true;
      int[] siteToTopple = {x, y};
      while(unstable) {
        unstable = toppleSite(siteToTopple);
      }
    }
    distribution[numberToppled]++;
  }

  public boolean toppleSite(int[] siteToTopple) { // topple site
    numberToppled++;
    int x = siteToTopple[0];
    int y = siteToTopple[1];
    numberOfSitesToTopple--;
    // remove grains from site
    height.setValue(x, y, height.getValue(x, y)-4);
    height.render();
    // add grains to neighbors; if (x,y) is on the border of the
    // lattice, then some grains will be lost
    if(x+1<L) {
      addGrain(x+1, y);
    }
    if(x-1>=0) {
      addGrain(x-1, y);
    }
    if(y+1<L) {
      addGrain(x, y+1);
    }
    if(y-1>=0) {
      addGrain(x, y-1);
    }
    if(numberOfSitesToTopple>0) { // next site to topple
      siteToTopple[0] = toppleSiteX[numberOfSitesToTopple-1];
      siteToTopple[1] = toppleSiteY[numberOfSitesToTopple-1];
      return true;
    } else {
      return false;
    }
  }

  public void addGrain(int x, int y) {
    int h = height.getValue(x, y)+1;
    height.setValue(x, y, h); // add grain to site
    height.render();
    if(h==4) { // new site to topple
      toppleSiteX[numberOfSitesToTopple] = x;
      toppleSiteY[numberOfSitesToTopple] = y;
      numberOfSitesToTopple++;
    }
  }

  public void resetAverages() {
    distribution = new int[numberToppledMax];
    numberOfGrains = 0;
  }
}

Listing 14.7: The target class for the two-dimensional sandpile model.

package org.opensourcephysics.sip.ch14.sandpile;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

public class SandpileApp extends AbstractSimulation {
  Sandpile sandpile = new Sandpile();
  LatticeFrame height = new LatticeFrame("x", "y", "Sandpile");
  PlotFrame plotFrame = new PlotFrame("ln s", "ln N", "Distribution of toppled sites");

  public SandpileApp() {
    height.setIndexedColor(0, java.awt.Color.WHITE);
    height.setIndexedColor(1, java.awt.Color.BLUE);
    height.setIndexedColor(2, java.awt.Color.GREEN);
    height.setIndexedColor(3, java.awt.Color.RED);
    height.setIndexedColor(4, java.awt.Color.BLACK);
  }
  public void initialize() {
    sandpile.L = control.getInt("L");
    height.setPreferredMinMax(0, sandpile.L, 0, sandpile.L);
    sandpile.initialize(height);
  }

  public void doStep() {
    sandpile.step();
  }

  public void stop() {
    plotFrame.clearData();
    double N = sandpile.numberOfGrains; // number of grains added so far
    for(int s = 1; s<sandpile.numberToppledMax; s++) {
      int f = sandpile.distribution[s];
      if(f>0) {
        plotFrame.append(0, Math.log(s), Math.log(f/N));
      }
    }
    plotFrame.render();
  }

  public void reset() {
    control.setValue("L", 10);
    enableStepsPerDisplay(true);
  }

  public void resetAverages() {
    sandpile.resetAverages();
  }

  public static void main(String[] args) {
    SimulationControl control = SimulationControl.createApp(new SandpileApp());
    control.addButton("resetAverages", "resetAverages");
  }
}

Problem 14.7. A two-dimensional sandpile model

(a) Use the classes Sandpile and SandpileApp to simulate a two-dimensional sandpile with linear dimension L. Run the simulation with L = 10 and stop it once toppling starts to occur. When this behavior occurs, black cells (with four grains) will momentarily appear. Use the Step button to watch individual toppling events and obtain a qualitative sense of the dynamics of the sandpile model.

(b) Comment out the height.render() statements in Sandpile and add a statement to SandpileApp so that the number of grains added to the system is displayed. (The number of grains added is a measure of the number of configurations that are included in the various averages.) Now you will not be able to see individual toppling events, but you can more quickly collect data on the toppling distribution, the frequency of the number of sites that topple when a grain is added. The program outputs a log-log plot of the distribution. Estimate the slope of the log-log distribution from the part of the plot that is linear and thus determine the power law exponent α. Reset the averages and repeat your calculation to obtain another estimate of α. If your two estimates of α are within a few percent of each other, you have added enough grains of sand. Compute α for L = 10, 20, 40, and 80. As you make the lattice size larger, the range over which the log-log plot is linear should increase. Explain why the plot is not linear for large values of the number of toppled sites.

Of course, the model of a sandpile in Problem 14.7 is oversimplified. Laboratory experiments indicate that real sandpiles show power law behavior if the piles are small, but that larger sandpiles do not (see Jaeger et al.).

Earthquakes. The empirical Gutenberg–Richter law for N(E), the number of earthquakes with energy release E, is consistent with power law behavior:

N(E) ∼ E^−b, (14.3)

with b ≈ 1. The magnitude of earthquakes on the Richter scale is approximately the logarithm of the energy release. This power law behavior does not necessarily hold for individual fault systems, but it holds reasonably accurately when all fault systems are considered. One implication of the power law dependence in (14.3) is that there is nothing special about large earthquakes. In Problems 14.8 and 14.9 and Project 14.26 we explore some earthquake models.

Given the long time scales between earthquakes, there is considerable interest in simulating models of earthquakes. The Burridge–Knopoff model considered in Project 14.26 consists of a system of coupled masses in contact with a rough surface.
The masses are subjected to static and dynamic friction forces due to the surface and are also pulled by an external force corresponding to slow tectonic plate motion. The major difficulty with this model is that the numerical solution of the corresponding equations of motion is computationally intensive. For this reason we consider several cellular automaton models that retain some of the basic physics of the Burridge–Knopoff model.

Problem 14.8. A simple earthquake model

Define the real variable F(i,j) on a square lattice, where F represents the force or stress on the block at position (i,j). The initial state of the lattice at time t = 0 is found by assigning small random values to F(i,j). The lattice is updated according to the following rules (a minimal sketch of one drive-and-relax cycle appears after this problem):

(i) Increase F at every site by a small amount ∆F, for example, ∆F = 10^−3, and increase the time t by 1. This increase represents the effect of the driving force due to the slow motion of the tectonic plate.

(ii) Check if F(i,j) is greater than Fc, the threshold value of the force. If not, the system is stable and step (i) is repeated. If the system is unstable, go to step (iii). Choose Fc = 4 for convenience.

(iii) The release of stress due to the slippage of a block is represented by letting F(i,j) = F(i,j) − Fc. The transfer of stress is represented by updating the stress at the sites of the four neighbors at (i, j ± 1) and (i ± 1, j): F → F + 1. Periodic boundary conditions are not used.

These rules are equivalent to the Bak–Tang–Wiesenfeld model. What is the relation of this model to the sandpile model considered in Problem 14.7? As an example, choose L = 10. Do the simulation and show that the system eventually comes to a statistically stationary state, where the average value of the stress at each site stops growing. Monitor N(s), the number of earthquakes of size s, where s is the total number of sites (blocks) that are affected by the instability. Then consider L = 30 and repeat your simulations. Are your results for N(s) consistent with scaling?
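The following helper method is a minimal sketch of our own (not code from the text) of one drive-and-relax cycle; it returns the number of slip events, one convenient measure of the earthquake size:

  // One cycle of the Bak-Tang-Wiesenfeld earthquake model on an L x L lattice
  // with open boundaries. F is the stress array, dF the drive increment, and
  // Fc the threshold; each slipping site passes one unit (Fc/4) to each neighbor.
  int driveAndRelax(double[][] F, double dF, double Fc) {
    int L = F.length;
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        F[i][j] += dF; // step (i): slow uniform drive
      }
    }
    int slips = 0;
    boolean unstable = true;
    while(unstable) { // steps (ii) and (iii): relax until all sites are below threshold
      unstable = false;
      for(int i = 0; i<L; i++) {
        for(int j = 0; j<L; j++) {
          if(F[i][j]>Fc) {
            F[i][j] -= Fc;             // the block slips and releases stress Fc
            if(i+1<L) F[i+1][j] += 1;  // stress crossing an open
            if(i-1>=0) F[i-1][j] += 1; // boundary is lost
            if(j+1<L) F[i][j+1] += 1;
            if(j-1>=0) F[i][j-1] += 1;
            slips++;
            unstable = true;
          }
        }
      }
    }
    return slips;
  }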
Problem 14.9. A dissipative earthquake model

The Bak–Tang–Wiesenfeld earthquake model discussed in Problem 14.8 displays power law scaling due to the inherent conservation of the dynamical variable, the stress. It is easy to modify the model so that the stress is not conserved and the model is more realistic. The Rundle–Jackson–Brown/Olami–Feder–Christensen model of an earthquake fault is a simple example of such a nonconservative system.

(a) Modify the toppling rule in Problem 14.8 so that when the stress on site (i,j) exceeds Fc, not all the excess stress is given to the neighbors. In particular, assume that when site (i,j) topples, F(i,j) is reduced to the residual stress Fr(i,j). The amount α(Fij − Fr) is dissipated, leaving (Fij − Fr)(1 − α) to be distributed equally to the neighbors. If α = 0, the model is equivalent to the model considered in Problem 14.8. Choose α = 0.2 and determine if N(s) exhibits power law scaling. For simplicity, choose Fc = 4 and Fr = 1 (see Grassberger).

(b) Make the model more realistic by adding a small amount of noise to Fr so that Fr is uniformly distributed between 1 − δ and 1 + δ with δ = 0.05. Also run the model in what is called the “zero-velocity limit” by finding the site with the maximum stress Fmax and then increasing the stress on all sites by Fc − Fmax so that only one site initially becomes unstable. Determine N(s) and see if your results differ from what you found in part (a). Do you still observe power law scaling?

(c) The model can be made more realistic still by assuming that the interaction between the blocks is long range due to the existence of elastic forces. Distribute the excess stress equally to all z neighbors that are within a distance of radius R of an unstable site. Each of the z neighbors receives a stress equal to (Fij − Fr)(1 − α)/z. First choose R = 3 and see if the qualitative behavior of N(s) changes as R becomes larger. Lattices with L ≥ 256 are typically considered with R ≈ 30 (see the papers by W. Klein and J. B. Rundle and collaborators).

The behavior of other simple models of natural phenomena is explored in the following problems.

Problem 14.10. Forest fire model

(a) Consider the following model of the spread of a forest fire. Suppose that at t = 0 the L × L sites of a square lattice either have a tree or are empty with probability p and 1 − p, respectively. The sites that have a tree are on fire with probability f. At each iteration an empty site grows a tree with probability g, a tree with a nearest neighbor site on fire catches fire, and a site that is already on fire dies and becomes empty. This model is an example of a probabilistic cellular automaton. Write a program to simulate this model and color code the three types of sites (one synchronous update of the lattice is sketched after this problem). Use periodic boundary conditions.

(b) Choose L ≥ 30 and determine the values of g for which the forest maintains fires indefinitely. Note that as long as g > 0, new trees will always grow.

(c) Use the value of g that you found in part (b) and compute the distribution of the number of sites sf on fire. If the distribution is critical, determine the exponent α that characterizes this distribution. Also compute the distribution of the number of trees st. Is there any relation between these two distributions?

(d)∗ To obtain reliable results it is frequently necessary to average over many initial configurations. However, the behavior of many systems is independent of the initial configuration, and averaging over many initial configurations is unnecessary. This latter possibility is called self-averaging. Repeat parts (b) and (c), but average your results over ten initial configurations. Is this forest fire model self-averaging?
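The following method is a minimal sketch of one synchronous update under these rules; the state codes and the method name are our own, not from the text:

  static final int EMPTY = 0, TREE = 1, FIRE = 2;

  // One synchronous update of the probabilistic forest fire model with
  // periodic boundary conditions; returns the new lattice.
  static int[][] step(int[][] s, double g) {
    int L = s.length;
    int[][] next = new int[L][L];
    for(int i = 0; i<L; i++) {
      for(int j = 0; j<L; j++) {
        if(s[i][j]==EMPTY) {
          next[i][j] = (Math.random()<g) ? TREE : EMPTY; // grow a tree with probability g
        } else if(s[i][j]==FIRE) {
          next[i][j] = EMPTY;                            // a burning site dies
        } else { // a tree catches fire if any nearest neighbor is burning
          boolean neighborOnFire =
            s[(i+1)%L][j]==FIRE || s[(i-1+L)%L][j]==FIRE ||
            s[i][(j+1)%L]==FIRE || s[i][(j-1+L)%L]==FIRE;
          next[i][j] = neighborOnFire ? FIRE : TREE;
        }
      }
    }
    return next;
  }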
Problem 14.11. Another forest fire model

Consider a simple variation of the model discussed in Problem 14.10. At t = 0 each site is occupied by a tree with probability p; otherwise, it is empty. The system is updated in successive iterations as follows:

(i) New trees grow at time t with a small probability g from sites that are empty at time t − 1.

(ii) A tree that is not on fire at t − 1 catches fire due to lightning with probability f.

(iii) Trees on fire ignite neighboring trees, which in turn ignite their neighboring trees, etc. The spreading of the fire occurs instantaneously.

(iv) Trees on fire at time t − 1 die (become empty sites) and are removed at time t (after they have set their neighbors on fire).

As in Problem 14.10, the changes in each site occur synchronously.

(a) Determine N(s), the number of clusters of trees of size s that catch fire in each iteration. Two trees are in the same cluster if they are nearest neighbors. Is the behavior of N(s) consistent with N(s) ∼ s^−α? If so, estimate the exponent α for several values of g and f.

(b)∗ The balance between the mean rate of birth and burning of trees in the steady state suggests a value of the ratio f/g at which this model is likely to be scale invariant. If the average steady state density of trees is ρ, then at each iteration the mean number of new trees appearing is gN(1 − ρ), where N = L^2 is the total number of sites. In the same spirit, we can say that for small f, the mean number of trees destroyed by lightning is fρN⟨s⟩, where ⟨s⟩ is the mean number of trees in a cluster. Is this reasoning consistent with the results of your simulation? If we equate these two rates, we find that ⟨s⟩ ∼ [(1 − ρ)/ρ](g/f). Because 0 < ρ < 1, it follows that ⟨s⟩ → ∞ in the limit f/g → 0. Given the relation ⟨s⟩ = ∑_{s=1}^∞ sN(s)/∑_s N(s) and the divergent behavior of ⟨s⟩, why does it follow that N(s) must decay more slowly than exponentially with s? This reasoning suggests that N(s) ∼ s^−α with α < 2. Is this expectation consistent with the results that you obtained in part (a)?

In this model there are three well-separated time scales: the time for lightning to strike (∝ f^−1), the time for trees to grow (∝ g^−1), and the instantaneous spreading of fire through a connected cluster. This separation of time scales seems to be an essential ingredient for self-organized criticality (see Grinstein and Jayaprakash).

Problem 14.12. Model of punctuated equilibrium

(a) The idea of punctuated equilibrium is that biological evolution occurs episodically rather than as a steady, gradual process. That is, most of the major changes in life forms occur in relatively short periods of time. Bak and Sneppen have proposed a simple model that exhibits some of the behavior of punctuated equilibrium. The model consists of a one-dimensional cellular automaton of linear dimension L, where cell i represents the biological fitness of species i. Initially, all cells receive a random fitness f_i between 0 and 1. Then the cell with the lowest fitness and its two nearest neighbors are randomly given new fitness values. This update rule is repeated indefinitely. Write a program to simulate the behavior of this model. Use periodic boundary conditions and display the fitness of each cell as a column of height f_i. Begin with L = 64 and describe what happens to the distribution of fitness values after a long time.

(b) We can crudely think of the update process as replacing a species and its neighbors by three new species. In this sense the fitness represents a barrier to creating a new species. If the barrier is low, it is easier to create a new species. Do the low fitness species die out? What is the average fitness of the species after the model has run for a long time (10^4 or more iterations)? Compute the distribution of fitness values N(f) averaged over all cells and over many iterations. Allow the system to come to a fluctuating steady state before computing N(f). Plot N(f) versus f. Is there a critical value fc below which N(f) is much less than the values above fc? Is the update rule reasonable from an evolutionary point of view?

(c) Modify your program to compute the distance x between successive fitness changes and the distribution P(x) of these distances. Make a log-log plot of P(x) versus x. Is there any evidence of self-organized criticality (power law scaling)?

(d) Another way to visualize the results is to plot the time at which a cell is changed versus the position of the cell. Is the distribution of the plotted points approximately uniform?
We might expect that the survival time of a species depends exponentially on its fitness, and hence each update corresponds to an elapsed time of e^{cf_i}, where the constant c sets the time scale and f_i is the fitness of the cell that has been changed. Choose c = 100 and make a similar plot with the time axis replaced by the logarithm of the time, that is, by the quantity 100f_i. Is this plot more meaningful?

(e) Another way of visualizing punctuated equilibrium is to plot the number of times groups of cells change as a function of time. Divide the time into units of 100 updates and compute the number of fitness changes for cells i = 1 to 10 as a function of time. Do you see any evidence of punctuated equilibrium?

14.3 The Hopfield Model and Neural Networks

Neural network models have been motivated in part by how neurons in the brain collectively store and recall memories. Usually, a neuron is in one of two states: a resting potential (not firing) or firing at the maximum rate. A neuron “fires” once it receives electrical inputs from other neurons whose combined strength reaches a certain threshold. An important characteristic of a neuron is that its output is a nonlinear function of the sum of its inputs. The assumption is that when memories are stored in the brain, the strengths of the connections between neurons change.

One of the uses of neural network models is pattern recognition. If we see someone more than once, the person’s face provides input that helps us to recall the person’s name. In the same spirit, a neural network can be given a pattern, for example, a string of ±1s, that partially reflects a previously memorized pattern. The idea is to store memories so that a computer can recall them when the inputs are close to a particular memory.

We now consider an example of a neural network due to Hopfield. The network consists of N neurons, and the state of the network is defined by the state of each neuron S_i, which in the Hopfield model takes on the values −1 (not firing) and +1 (firing). The strength of the connection between neuron i and neuron j is denoted by w_ij and is determined by the M stored memories:

w_ij = ∑_{α=1}^{M} S_i^α S_j^α, (14.4)

where S_i^α represents the state of neuron i in stored memory α. Given the initial state of all the neurons, the dynamics of the network is simple. We choose a neuron i at random and change its state according to its input ∑_{j≠i} w_ij S_j, where S_j represents the current state of neuron j. That is, we set

S_i = +1 if ∑_{j≠i} w_ij S_j > 0, and S_i = −1 if ∑_{j≠i} w_ij S_j ≤ 0. (14.5)

The threshold value of the input has been set equal to zero, but other values could be used as well.

The HopfieldApp class in Listing 14.8 implements this model of a neural network and stores memories based on user input. The state of the network is stored in the array S[i], and the connections between the neurons are stored in the array w[i][j]. The user initially clicks on various cells to toggle their values between −1 and +1 and presses the Remember button to store a pattern. After the memories are stored, the user presses the Randomize button to set each S_i to ±1 at random and then presses the Start button to update the neurons using the Hopfield algorithm to try to recall one of the stored memories.
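Equation (14.4) translates into a few lines of code. The following fragment is a minimal sketch of our own (the method name remember is hypothetical) showing how one pattern can be added to the connection matrix:

  // Add one stored pattern to the connection matrix according to (14.4):
  // each memory alpha contributes S_i^alpha * S_j^alpha to w[i][j].
  void remember(double[][] w, int[] pattern) {
    int N = pattern.length;
    for(int i = 0; i<N; i++) {
      for(int j = 0; j<N; j++) {
        if(i!=j) {
          w[i][j] += pattern[i]*pattern[j]; // pattern values are +1 or -1
        }
      }
    }
  }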
Listing 14.8: HopfieldApp class.

package org.opensourcephysics.sip.ch14;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.frames.*;

// Hopfield model of a neural network
public class HopfieldApp extends AbstractSimulation {
  LatticeFrame lattice;
  int N; // total number of neurons
  double[][] w; // connection array (N by N elements)
  int numberOfStoredMemories;

  public HopfieldApp() {
    lattice = new LatticeFrame("Hopfield state");
    lattice.setToggleOnClick(true, -1, 1);
    lattice.setIndexedColor(-1, java.awt.Color.blue);
    lattice.setIndexedColor(0, java.awt.Color.blue);
    lattice.setIndexedColor(1, java.awt.Color.green);
    lattice.setSize(600, 120);
  }

  public void doStep() {
    int[] S = lattice.getAll();
    for(int counter = 0; counter<N; counter++) {
      int i = (int) (Math.random()*N); // choose a neuron at random
      double input = 0;
      for(int j = 0; j<N; j++) {
        if(j!=i) {
          input += w[i][j]*S[j];
        }
      }
      S[i] = (input>0) ? 1 : -1; // update rule (14.5)
    }
    lattice.setAll(S);
  }

  public void initialize() {
    N = control.getInt("Lattice size");
    w = new double[N][N];
    lattice.resizeLattice(N, 1);
    for(int i = 0; i<N; i++) {
      lattice.setValue(i, 0, -1); // start with all neurons not firing
    }
    numberOfStoredMemories = 0;
  }

The remaining methods of HopfieldApp, including those invoked by the Remember and Randomize buttons, implement the bookkeeping described above.

The Hopfield model is closely related to a class of disordered magnetic systems known as spin glasses, in which the spins S_i = ±1 interact with one another through couplings J_ij of random sign, so that the energy of a configuration of spins can be written as

E = −(1/2) ∑_{i≠j} J_ij S_i S_j. (14.6)

If J_ij > 0, the spins i and j lower their energy by lining up in the same direction. If J_ij < 0, the spins lower their energy by lining up in opposite directions (see Figure 15.1). We are interested in finding the ground state when the coupling constant J_ij randomly takes on the values ±J_0/N, where N is the number of spins and J_0 is an arbitrary constant. To find the ground state, we need to find the configurations of spins that give the lowest value of the energy. Finding the ground state of a spin glass is particularly difficult because there are many configurations that correspond to local minima of the energy. In fact, the problem of finding the exact ground state is an example of a computationally difficult problem called NP-complete. (Another example of such a problem is considered in Problem 15.31.) In Problem 14.14 we explore whether the Hopfield algorithm can find a good approximation to the global minimum.

Problem 14.14. Minimum energy of an Ising spin glass

(a) Choose J_0 = 4 in (14.6) and modify the HopfieldApp class so that it applies to a model spin glass. Display the output string and the energy after every N attempts to change a spin. Begin with N = 20.

(b) What happens to the energy after a long time? For different initial states, but the same set of the J_ij, is the value of the energy the same after the system has evolved for a long time? Explain your results in terms of the number of local energy minima.

(c) What is the behavior of the system? Do you find periodic behavior or random behavior, or does the system evolve to a state that does not change?

14.4 Growing Networks

A network is a collection of points called nodes that are connected by lines called links. Mathematicians refer to networks as graphs, and graph theory has been an active field of mathematics for many years. A mathematical network can represent an actual network by defining what a node represents and the kind of relationship represented by a link. For example, in an airline network the nodes represent airports and the links represent flights between airports. In an acquaintance network the nodes represent individuals, and the links represent the state of two people knowing each other. In a biochemical network the nodes represent various molecular types, and the links represent reactions between molecules.

One reason for the recent interest in networks is that data on existing networks are now more readily available due to the widespread use of computers. Indeed, one of the networks of current interest is the network of websites.
14.4 Growing Networks

A network is a collection of points called nodes that are connected by lines called links. Mathematicians refer to networks as graphs, and graph theory has been an active field of mathematics for many years. A mathematical network can represent an actual network by defining what a node represents and the kind of relationship represented by a link. For example, in an airline network the nodes represent airports, and the links represent flights between airports. In an acquaintance network the nodes represent individuals, and the links represent the state of two people knowing each other. In a biochemical network the nodes represent various molecular types, and the links represent a reaction between molecules. One reason for the recent interest in networks is that data on existing networks is now more readily available due to the widespread use of computers. Indeed, one of the networks of current interest is the network of websites.

Another reason for the interest in networks is that some new models of networks have been developed. We first discuss one of the original network models, the Erdős–Rényi model. In this model we start with N nodes and then form n links between pairs of nodes such that each pair has either one link or no links. The probability of a link between any pair of nodes is p = n/(N(N − 1)/2). One quantity of interest is the degree distribution D(ℓ), which is the fraction of nodes that have ℓ links. An example of the determination of D(ℓ) is shown in Figure 14.3. In the Erdős–Rényi model this distribution is a Poisson distribution for large N. Thus, there is a peak in D(ℓ), and for large ℓ, D(ℓ) decreases exponentially.

In some network models there is a path between any pair of nodes. In other models, such as the Erdős–Rényi model, there are some nodes that cannot be reached from other nodes (see Figure 14.3). In these networks there are other quantities of interest that are analogous to those in percolation theory. The main difference is that in network models the position of the nodes is irrelevant, and only their connectivity is relevant. In particular, there is no spanning cluster as can exist in percolation models. Instead, there can be a cluster that is significantly larger than the other clusters. In the Erdős–Rényi model, the transition at which such a "giant" cluster appears depends on the probability p that any pair of nodes is connected. In the large N limit this transition occurs at p = 1/N.

Figure 14.3: Example of a disconnected network with 10 nodes and 9 links. The degree distribution for this network is D(1) = 5/10 = 0.5, D(2) = 3/10 = 0.3, D(3) = 1/10 = 0.1, and D(4) = 1/10 = 0.1. The clustering coefficient or transitivity is defined as 3 times the number of triangles divided by the number of possible triples of connected nodes. In this case we have 1 triangle and 12 triples, so the clustering coefficient equals 3 × 1/12 = 0.25. If a node has ℓ links, then the number of triples centered at that node is ℓ!/(2!(ℓ − 2)!).

Problem 14.15. The Erdős–Rényi model

(a) Write a program to create networks based on the Erdős–Rényi model. Choose N = 100 and p ≈ 0.01 and compute D(ℓ); average over at least 10 networks. Show that D(ℓ) follows a Poisson distribution.

(b) Define a giant cluster as one that has over three times as many nodes as any other cluster and at least 10% of the nodes. Find the value of p at which the giant cluster first appears for N = 64, 128, and 256. Average over 10 networks for each value of N. The cluster distribution should be updated after every link is added, using the labeling procedure of Chapter 12. The bookkeeping is easier here because every time we add a link, we either combine two clusters or make no change in the cluster distribution.

Some of the networks that we will consider are connected by definition. In these cases one of the important quantities of interest is the mean path length between two nodes, where the path length between two nodes is the smallest number of links needed to go from one node to the other. If the mean path length is small and depends only weakly on the total number of nodes, the network is said to have the "small world" property. A well-known example of the small world property is "six degrees of separation," which refers to the observation that almost any person is connected to almost any other person through a sequence of about six acquaintances. We wish to understand the structure of different networks.
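Before turning to those structural properties, here is a minimal sketch of the Erdős–Rényi construction of Problem 14.15(a); the class and variable names are ours, not from the text, and the averaging over several networks asked for in the problem is left to the reader:

import java.util.Random;

public class ErdosRenyiApp {
  public static void main(String[] args) {
    int N = 100;       // number of nodes
    double p = 0.01;   // probability that a given pair of nodes is linked
    Random random = new Random();
    int[] degree = new int[N];
    for(int i = 0; i<N; i++) {
      for(int j = i+1; j<N; j++) { // consider each pair of nodes once
        if(random.nextDouble()<p) {
          degree[i]++;
          degree[j]++;
        }
      }
    }
    int[] histogram = new int[N]; // number of nodes with a given number of links
    for(int i = 0; i<N; i++) {
      histogram[degree[i]]++;
    }
    for(int ell = 0; ell<10; ell++) {
      System.out.println("D("+ell+") = "+histogram[ell]/(double) N);
    }
  }
}

Averaging D(ℓ) over many networks amounts to repeating the double loop and accumulating the histogram before normalizing.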
One structural property is the clustering coefficient or transitivity. If node A is linked to B and B to C, the clustering coefficient is the probability that A is also linked to C (see Figure 14.3 for a precise definition). If this coefficient is large, then there will be many small loops of nodes in the network. If we think of the nodes as people and the links as friendship connections, then the clustering coefficient is a measure of the tendency of people to form cliques. It is also of interest to see to what extent the network is hierarchically organized. Can we find groups of nodes that are linked together at different levels of organization? Can we produce an organizational chart for the network similar to what is used by many businesses? Algorithms for computing the hierarchical or community structure of a network are discussed in the references.

Two popular network models are the Watts–Strogatz small world model and the Barabási–Albert preferential attachment model. In the Watts–Strogatz model, a regular lattice of nodes connected by nearest neighbor links is "rewired" so that a link between two neighboring nodes is broken with probability p, and a link is randomly added between one of the nodes and any other node in the system. The small world property shows up as a logarithmic dependence of the mean path length on the system size N for large p. The degree distribution is similar to that of the Erdős–Rényi model.

In the preferential attachment model, we begin with a few connected nodes and then add one node at a time. Each new node is linked to m existing nodes, with preference given to those nodes that already have many links. The probability that a node with ℓ links receives a link from the new node is proportional to ℓ. For example, if we have ten nodes in the network with 1, 1, 3, 2, 7, 3, 4, 7, 10, and 2 links, respectively, then the degrees sum to 40, and the probability of getting the next link from a new node is 1/40, 1/40, 3/40, 2/40, 7/40, 3/40, 4/40, 7/40, 10/40, and 2/40, respectively. The result of this growth rule is that some nodes accumulate many links. The key result is that the degree distribution is a power law with D(ℓ) ∼ ℓ^(−α). This scale-free behavior is very important because it says that, in the limit of an infinite network, there is a non-negligible probability that a node exists with any particular number of links. Examples of real networks that show this behavior are actor networks where the links correspond to two actors appearing in the same movie, airport networks, the internet, and the links between various websites. In addition to the scale-free degree distribution, the preferential attachment model also has the small world property that the mean path length grows only logarithmically with the number of nodes.

The PreferentialAttachment class implements the preferential attachment model. Method setPosition is not relevant to the actual growth model. It places the nodes at random positions, chosen so that they are not too close to one another, so that the network can be drawn. This drawing method is useful only for networks with fewer than about 100 nodes.

Listing 14.9: PreferentialAttachment class: Preferential attachment network model.

package org.opensourcephysics.sip.ch14.networks;
import java.awt.Color;
import java.awt.Graphics;
import org.opensourcephysics.frames.*;
import org.opensourcephysics.display.Drawable;
import org.opensourcephysics.display.DrawingPanel;
public class PreferentialAttachment implements Drawable {
  int[] node, linkFrom, degree;
  double[] x, y;                // positions of nodes, only meaningful for display purposes
  int N;                        // maximum number of nodes
  int m = 2;                    // number of attempted links per node
  int linkNumber = 0;           // twice current number of links
  int n = 0;                    // current number of nodes
  boolean drawPositions = true; // only draw network if true
  int numberOfCompletedNetworks = 0;

  public void initialize() {
    // degree distribution to be averaged over many networks
    degree = new int[N];
    numberOfCompletedNetworks = 0; // will draw many networks
    startNetwork();
  }

  public void addLink(int i, int j, int s) {
    linkFrom[i*m+s] = j; // record the node at the other end of link s of node i
    node[i]++;
    node[j]++;
    linkNumber += 2;     // twice current number of links
  }

  public void startNetwork() {
    n = 0;
    linkFrom = new int[m*N];
    node = new int[N];
    x = new double[N];
    y = new double[N];
    linkNumber = 0;
    for(int i = 0; i<=m; i++) { // place the first m+1 nodes
      n++;
      setPosition(i);
    }
    // ... (the statements that link the first m+1 nodes to each other, the
    // setPosition method, and the loop that adds the remaining nodes with
    // degree-proportional attachment are not reproduced here)
  }

  public void appendDegreeDistribution(PlotFrame plot) { // (method signature reconstructed)
    for(int i = 1; i<N; i++) {
      if(degree[i]>0) { // log-log plot of the averaged degree distribution
        plot.append(0, Math.log(i), Math.log(degree[i]*1.0/(N*numberOfCompletedNetworks)));
      }
    }
  }

  public void draw(DrawingPanel panel, Graphics g) {
    if(node!=null&&drawPositions) {
      int pxRadius = Math.abs(panel.xToPix(1.0)-panel.xToPix(0));
      int pyRadius = Math.abs(panel.yToPix(1.0)-panel.yToPix(0));
      g.setColor(Color.green);
      // ... (the loops that draw the links and the nodes are not reproduced here)
    }
  }
}

14.5 Genetic Algorithms

The genetic algorithm is implemented by two classes: GenePool, which stores the population of genotypes as strings of bits (boolean arrays), and Phenotype, which converts a genotype into a configuration of Ising spins on an L × L lattice and computes its fitness from the energy of that configuration. The following fragments show the essential methods.

// From the GenePool class:
public void initialize() {
  for(int i = 0; i<numberOfGenotypes; i++) {
    for(int j = 0; j<genotypeSize; j++) {
      if(Math.random()>0.5) {
        genotype[i][j] = true; // sets genes randomly
      }
    }
  }
}

public void copyGenotype(boolean a[], boolean b[]) {
  for(int i = 0; i<a.length; i++) { // copy a to b
    b[i] = a[i];
  }
}

// From the Phenotype class:
public void setCouplings() { // (method name reconstructed) set Jij = ±1 at random
  for(int i = 0; i<L; i++) {
    for(int j = 0; j<L; j++) {
      for(int bond = 0; bond<2; bond++) {
        if(Math.random()>0.5) {
          J[i][j][bond] = 1;
        } else {
          J[i][j][bond] = -1;
        }
      }
    }
  }
}

public void determineFitness(GenePool genePool) {
  totalFitness = 0;
  int state[][] = new int[L][L];
  populationFitness = new int[genePool.numberOfGenotypes];
  for(int n = 0; n<genePool.numberOfGenotypes; n++) {
    // ... (convert genotype n into the spin configuration state and
    // accumulate its energy in populationFitness[n])
    // shift so that fitness > 0; low energy implies high fitness
    populationFitness[n] = highestEnergy-populationFitness[n];
    totalFitness += populationFitness[n];
  }
}

public void select(GenePool genePool) {
  selectedPopulationFitness = new int[genePool.numberOfGenotypes];
  boolean savedGenotype[][] = new boolean[genePool.numberOfGenotypes][genePool.genotypeSize];
  for(int n = 0; n<genePool.numberOfGenotypes; n++) {
    // ... (save the current genotypes and choose the index choice with
    // probability proportional to its fitness)
    if(selectedPopulationFitness[n]>bestFitness) {
      bestFitness = selectedPopulationFitness[n];
    }
    genePool.copyGenotype(savedGenotype[choice], genePool.genotype[n]);
  }
}

Problem 14.19. Ground state of Ising-like models

(a) Use the genetic algorithm we have discussed to find the ground state of the ferromagnetic Ising model for which Jij = 1. In this case the ground state energy is E = −2L² (all spins up or all spins down). It will be necessary to modify method initialize in class Phenotype. Choose L = 4 and consider a population of 20 strings, with 10 recombinations and 4 mutations per generation. How long does it take to find the ground state energy? You might wish to modify the program so that each new generation is shown on the screen.

(b) Find the mean number of generations needed to find the ground state for L = 4, 6, and 8. Repeat each run several times. Use a population of 100, a recombination rate of 50, and a mutation rate of 20. Are there any general trends as L is increased?
How do your results change if you double the population size? What happens if you double the recombination rate or the mutation rate? Use larger lattices if you have sufficient computer resources.

(c) Repeat part (b) for the antiferromagnetic model for which Jij = −1.

(d) Repeat part (b) for a spin glass for which Jij = ±1 at random. In this case we do not know the ground state energy in advance. What criterion can you use to terminate a run?

One of the important features of the genetic algorithm is that the change in the genetic code is selected not in the genotype directly, but in the phenotype. Note that the way we change the strings (particularly with recombination) is not closely related to the two-dimensional lattice of spins. We could have used some other prescription for converting a string of 0s and 1s to a configuration of spins on a two-dimensional lattice. If the phenotype were a three-dimensional lattice, we could use the same procedure for modifying the genotype, but a different prescription for converting the genetic sequence (the string of 0s and 1s) to the phenotype (the three-dimensional lattice of spins). The point is that it is not necessary for the genetic coding to mimic the phenotypic expression. This point becomes distorted in the popular press when a gene is tied to a particular trait, because specific pieces of DNA rarely correspond directly to any explicitly expressed trait in the phenotype.

14.6 Lattice Gas Models of Fluid Flow

We now return to cellular automaton models and discuss one of their more interesting applications: simulations of fluid flow. In general, fluid flow is very difficult to simulate because the partial differential equation describing the flow of incompressible fluids, the Navier–Stokes equation, is nonlinear, and this nonlinearity can lead to the failure of standard numerical algorithms. In addition, there are typically many length scales that must be considered simultaneously. These length scales include the microscopic motion of the fluid particles, the length scales associated with fluid structures such as vortices, and the length scales of macroscopic objects such as pipes or obstacles. Because of these considerations, simulations of fluid flow based on the direct numerical solution of the Navier–Stokes equation typically require very sophisticated numerical methods (cf. Oran and Boris).

Cellular automaton models of fluids are known as lattice gas models. In a lattice gas model the positions of the particles are restricted to the sites of a lattice, and the velocities are restricted to a small number of vectors corresponding to neighbor sites. A time step is divided into two substeps. In the first substep the particles move freely to their corresponding nearest neighbor lattice sites. Then the velocities of the particles at each lattice site are changed according to a collision rule that conserves mass (particle number), momentum, and kinetic energy. The purpose of the collision rules is not to model microscopic collisions accurately, but rather to achieve the correct macroscopic behavior.

Velocity   Vector Direction   Symbol       Abbreviation   Decimal   Binary
v0         (1, 0)             RIGHT        RI             1         00000001
v1         (1, −√3)/2         RIGHT_DOWN   RD             2         00000010
v2         −(1, √3)/2         LEFT_DOWN    LD             4         00000100
v3         (−1, 0)            LEFT         LE             8         00001000
v4         (−1, √3)/2         LEFT_UP      LU             16        00010000
v5         (1, √3)/2          RIGHT_UP     RU             32        00100000
v6         (0, 0)             STATIONARY   S              64        01000000
                              BARRIER                     128       10000000

Table 14.1: Summary of the possible velocities and their representations.
The idea is that if we satisfy the conservation laws associated with microscopic collisions, then we can obtain the correct physics at the macroscopic level, including translational and rotational invariance, by averaging over many particles. We assume a triangular lattice because it can be shown that this symmetry is sufficient to yield the macroscopic Navier–Stokes equations in the continuum limit. In contrast, the more limited symmetry of a square lattice is not sufficient. Three-dimensional models are much more difficult to implement and justify theoretically.

All the moving particles are assumed to have the same speed and mass. The possible velocity vectors lie only in the directions of the nearest neighbor sites, and hence there are six possible velocities, as summarized in Table 14.1. A rest particle is also allowed. The number of particles at each site moving in a particular direction (channel) is restricted to be zero or one. In the first substep all particles move in the direction of their velocity to a neighboring site. In the second substep the velocity vectors at each lattice site are changed according to the appropriate collision rule. Examples of the collision rules are illustrated in Figures 14.4–14.6. The rules are deterministic, with only one possible set of velocities after a collision for each possible set of velocities before a collision. It is easy to check that momentum conservation in collisions between the particles is enforced by these rules.

Figure 14.4: Examples of collision rules for three particles, with one particle unchanged and no stationary particles. Each direction or channel is represented by 32 bits, but we need only the first 8 bits. The various channels are summarized in Table 14.1.

Figure 14.5: (a) Example of a collision rule for three particles with zero net momentum. (b) Example of a two-particle collision rule. (c) Example of a four-particle collision rule. For states that are not shown, the velocities do not change in a collision. An open circle represents a lattice site and the absence of a stationary particle.

As in Section 14.1, we use bit manipulation to represent a lattice site and the collision rules efficiently. Each lattice site is represented by one element of the integer array lattice. In Java each int stores 32 bits, but we will use only the first 8 bits. We use the first six bits, 0 through 5, to represent particles moving in the six possible directions, with bit 0 corresponding to a particle moving with velocity v0 (see Table 14.1). If there are three particles with velocities v0, v2, and v4 at a site and no barrier, then the value of the lattice array element at this site is 00010101 in binary notation. Bit 6 represents a possible rest (stationary) particle. If we want a site to act as a barrier that blocks incoming particles, we set bit 7. For example, a barrier site containing a particle with velocity v1 is represented by 10000010.

The rules for the collisions are given in the declaration of the class variables in class LatticeGas. Because rule is declared static final, we cannot normally overwrite its values. However, an exception is made for static initializers, which are run when the class is first loaded. To construct the rules, we use the bitwise or operator | and named constants for each of the possible states.
As an example, the state corresponding to one particle moving to the right, one moving to the left and down, and one moving to the left and up is given by LU+LD+RI, which we write as LU|LD|RI, or 00010101. The collision rule in Figure 14.5(a) is that this state transforms to one particle moving to the right and down, one moving to the left, and one moving to the right and up. Hence, this collision rule is given by rule[LU|LD|RI] = RU|LE|RD. The other rules are given in a similar way. Stationary particles can also be created or destroyed. For example, what are the states before and after the collision for rule[LU|RI] = RU|S?

To every rule there corresponds a dual rule that flips the bits corresponding to the presence and absence of each particle. This duality means that we need to specify only half of the rules. The dual rules can be constructed by flipping all the bits of the input and of the output. Our convention is to list the rules that start without a stationary particle; the corresponding dual rules are then those that start with a stationary particle. The dual rules are implemented by the statement

rule[i^(RU|LU|LE|LD|RD|RI|S)] = rule[i]^(RU|LU|LE|LD|RD|RI|S);

where ^ is the bitwise exclusive or operator, which equals 1 if the two bits differ and 0 otherwise. Two examples of dual rules are given in Figure 14.6.

Figure 14.6: (a) and (c), and (b) and (d), are duals of each other. An open circle represents the absence of a stationary particle, and a filled circle represents the presence of a stationary particle. Note that the collision rule in (c) is similar to (b), and the collision rule in (d) is similar to (a), but in the opposite direction.

The rules in Figures 14.5(b) and 14.5(c) cycle through the states in a particular direction. Although these rules are straightforward, they are not invariant under reflection. To help eliminate this bias, we cycle in the opposite direction when a stationary particle is present (see Figure 14.6).

We adopt the rule that when a particle moves onto a barrier site, we set the velocity v of this particle equal to −v (see Figure 14.7). Because of our ordering of the velocities, the rule for updating a barrier can be expressed compactly using bit manipulation. Reflection off a barrier is accomplished by shifting the three higher-order velocity bits to the right by three bits (>>>3) and shifting the three lower-order velocity bits to the left by three bits (<<3). Check the rules given in Listing 14.13. Other possibilities are to set the angle of incidence equal to the angle of reflection or to set the velocity to an arbitrary direction. The latter case would correspond to a collision with a rough surface.

Figure 14.7: Example of a collision with a barrier at times t = 0, 1, and 2. At t = 1 the particle moves to the barrier site and then reverses its velocity. The symbol ⊗ denotes a barrier site.
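Before turning to the step method, the bit operations just described can be checked in isolation. The following self-contained snippet (ours, using the abbreviations of Table 14.1 as constant names) verifies the dual-rule transformation and the barrier reflection:

public class LatticeGasBitsDemo {
  static final int RI = 1, RD = 2, LD = 4, LE = 8, LU = 16, RU = 32, S = 64;

  public static void main(String[] args) {
    int state = LU|LD|RI;                              // 00010101: particles with v0, v2, v4
    System.out.println(Integer.toBinaryString(state)); // prints 10101

    // dual state: flip the presence/absence of all seven particles
    int dual = state^(RU|LU|LE|LD|RD|RI|S);
    System.out.println(Integer.toBinaryString(dual));  // prints 1101010 (RD|LE|RU|S)

    // reflection off a barrier: v -> -v for every moving particle
    int lowBits = state&(RI|RD|LD);                    // bits 0-2
    int highBits = state&(LE|LU|RU);                   // bits 3-5
    int reflected = (highBits>>>3)|(lowBits<<3);
    System.out.println(Integer.toBinaryString(reflected)); // prints 101010 (RD|LE|RU)
  }
}

Note that reflecting LU|LD|RI yields RD|RU|LE; that is, each velocity is reversed, as required.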
The step method runs through the entire lattice and moves all the particles. The updated values of the sites are placed in the array newLattice. We then go through the newLattice array, implement the relevant collision rule at each site, and write the results into the array lattice. The movement of the particles is accomplished as follows. Because the even rows are horizontally displaced one-half a lattice spacing from the odd rows, we need to treat odd and even rows separately. In the step method we loop through every other row and update site1 and site2 at the same time. An example will show how this update works. The statement

rght[j-1] |= site1 & RIGHT_DOWN;

means that if there is a particle moving to the right and down at site1, then the bit corresponding to RIGHT_DOWN is added to the site rght (see Figure 14.8). The statement

cent[j] |= site1 & (STATIONARY|BARRIER) | site2 & RIGHT_DOWN;

means that a stationary particle at site1 remains there, and if site1 is a barrier, it remains so. If site2 has a particle moving in the direction RD, then site1 will receive this particle.

Figure 14.8: We update site1 and site2 at the same time. The rows are indexed by j. The dotted line connects sites in the same column.

To maintain a steady flow rate, we add the necessary horizontal momentum to the lattice uniformly after each time step. The procedure is to choose a site at random and determine if it is possible to change the site's horizontal momentum. If so, we remove the left bit and add the right bit, or vice versa. This procedure is accomplished by the statements at the end of the step method.

Listing 14.13: Listing of the LatticeGas class.

package org.opensourcephysics.sip.ch14.latticegas;
import org.opensourcephysics.display.*;
import java.awt.*;
import java.awt.geom.AffineTransform;
import java.awt.geom.Line2D;

public class LatticeGas implements Drawable {
  // input parameters from user
  public double flowSpeed;           // controls pressure
  public double arrowSize;           // size of velocity arrows displayed
  public int spatialAveragingLength; // spatial averaging of velocity
  public int Lx, Ly;                 // linear dimensions of lattice
  public int[][] lattice, newLattice;
  private double numParticles;
  static final double SQRT3_OVER2 = Math.sqrt(3)/2;
  static final double SQRT2 = Math.sqrt(2);
  static final int RIGHT = 1, RIGHT_DOWN = 2, LEFT_DOWN = 4;
  static final int LEFT = 8, LEFT_UP = 16, RIGHT_UP = 32;
  static final int STATIONARY = 64, BARRIER = 128;
  static final int NUM_CHANNELS = 7; // maximum number of particles per site
  static final int NUM_BITS = 8;     // 7 channel bits plus 1 barrier bit per site
  // total number of possible site configurations = 2^8;
  // 1 << 8 means move the zeroth bit over 8 places to the left to the eighth bit
  static final int NUM_RULES = 1<<8;
  static final double ux[] = {1.0, 0.5, -0.5, -1.0, -0.5, 0.5, 0};
  static final double uy[] = {0.0, -SQRT3_OVER2, -SQRT3_OVER2, 0.0, SQRT3_OVER2, SQRT3_OVER2, 0};
  // averaged velocities for every site configuration
  static final double[] vx, vy;
  static final int[] rule;

  static { // set rule table
    // default rule is the identity rule
    rule = new int[NUM_RULES];
    for(int i = 0; i<NUM_RULES; i++) {
      rule[i] = i;
    }
    // ... (the entries for the collisions of Figures 14.4-14.6 and their
    // duals are set here)
    for(int i = BARRIER; i<NUM_RULES; i++) { // barrier sites reverse all velocities
      int highBits = i&(LEFT|LEFT_UP|RIGHT_UP);
      int lowBits = i&(RIGHT|RIGHT_DOWN|LEFT_DOWN);
      rule[i] = (i&(BARRIER|STATIONARY))|(highBits>>>3)|(lowBits<<3);
    }
  }

  static { // set average site velocities
    // for every particle site configuration i, calculate total
    // net velocity and place in vx[i], vy[i]
    vx = new double[NUM_RULES];
    vy = new double[NUM_RULES];
    for(int i = 0; i<NUM_RULES; i++) {
      for(int dir = 0; dir<NUM_CHANNELS; dir++) {
        if((i&(1<<dir))!=0) { // channel dir is occupied
          vx[i] += ux[dir];
          vy[i] += uy[dir];
        }
      }
    }
  }

  public void step() {
    // ... (the loops that move the particles into newLattice and apply the
    // collision rule at every site are not reproduced here)
    // maintain the flow by reversing particles that move against it:
    for(int k = 0; k<Math.abs(flowSpeed)*Ly; k++) { // (loop bound reconstructed)
      int i = (int) (Math.random()*Lx);
      int j = (int) (Math.random()*Ly);
      if((lattice[i][j]&(RIGHT|LEFT))==((flowSpeed>0) ? LEFT : RIGHT)) {
        lattice[i][j] ^= RIGHT|LEFT; // exchange the left and right channel bits
      }
    }
  }
  public void draw(DrawingPanel panel, Graphics g) {
    if(lattice==null) {
      return;
    }
    // if s = 1, draw lattice and particle details explicitly;
    // otherwise average the velocity over an s by s square
    int s = spatialAveragingLength;
    Graphics2D g2 = (Graphics2D) g;
    AffineTransform toPixels = panel.getPixelTransform();
    Line2D.Double line = new Line2D.Double();
    // ... (the loops that draw the barriers and the averaged velocity
    // arrows are not reproduced here)
  }
}

A block in motion is subject to a velocity-dependent friction force F(v) for v > 0, given by (14.19), where the parameter σ represents the drop of the friction force at the onset of the slip. If a block is stuck, the calculation of the static friction force is a bit more involved. If the total force on a block due to the springs is to the right, then the static friction force is set equal and opposite to the total spring force, up to a maximum value of F0. However, if the total spring force is to the left, the static friction is chosen so that the acceleration of the block is zero. Typical values of the parameters are F0 = 1, ℓ = 10, σ = 0.01, α = 2.5, and v0 = 10⁻⁵.

Initially we set u̇_j = 0 for all j and assign small random displacements to all the blocks. The blocks will then move according to (14.18). For simplicity, we set the substrate velocity v = 0, and when all the blocks have become stuck, we move all the blocks to the left by an equal amount such that the total force due to the springs on one block equals unity (F0). This procedure then causes one block to move or slip. As this block moves, other neighboring blocks may move, leading to an earthquake. Eventually, all the blocks will again become stuck. The main quantities of interest are P(s), the distribution of the number of blocks that have moved during an earthquake, and P(M), the distribution of the net displacement of the blocks during an earthquake, where

M = Σ_i ∆u_i. (14.20)

The sum over i in (14.20) is over the blocks involved in an earthquake, and ∆u_i is the net displacement of block i during the earthquake. Do P(s) and P(M) exhibit scaling consistent with the Gutenberg–Richter law? The movement of the blocks represents the slip of the two surfaces of a fault past one another during an earthquake. The stick-slip behavior of this model is similar to that of a real earthquake fault. Other interesting questions are posed in the references (see Klein et al., Ferguson et al., and Mori and Kawamura).

References and Suggestions for Further Reading

Réka Albert and Albert-László Barabási, "Statistical mechanics of complex networks," Rev. Mod. Phys. 74, 47–97 (2002).

Per Bak, How Nature Works (Copernicus Books, 1999). A good read about self-organized critical phenomena from earthquakes to stock markets. Nature is not as simple as Bak believed, but his interest in complex systems spurred many others to become interested.

P. Bak, "Catastrophes and self-organized criticality," Computers in Physics 5 (4), 430 (1991). A good introduction to self-organized critical phenomena.

Per Bak and Michael Creutz, "Fractals and self-organized criticality," in Fractals in Science, Armin Bunde and Shlomo Havlin, editors (Springer–Verlag, 1994).

Per Bak and Kim Sneppen, "Punctuated equilibrium and criticality in a simple model of evolution," Phys. Rev. Lett. 71, 4083 (1993); Henrik Flyvbjerg, Kim Sneppen, and Per Bak, "Mean field theory for a simple model of evolution," Phys. Rev. Lett. 71, 4087 (1993).

P. Bak, C. Tang, and K. Wiesenfeld, "Self-organized criticality," Phys. Rev. A 38, 364–374 (1988).
E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning Ways for Your Mathematical Plays, Vol. 2 (Academic Press, 1982). A discussion of how the Game of Life simulates a universal computer.

Bruce M. Boghosian and C. David Levermore, "A cellular automaton for Burgers' equation," Complex Systems 1, 17–30 (1987). Reprinted in Doolen et al.

D. Challet and Y.-C. Zhang, "Emergence of cooperation and organization in an evolutionary game," Physica A 246, 407–418 (1997), or adap-org/9708006. The authors give the first description of the minority game.

Debashish Chowdhury, Ludger Santen, and Andreas Schadschneider, "Simulation of vehicular traffic: A statistical physics perspective," Computing in Science and Engineering 2 (5), 80–87 (2000).

John W. Clark, Johann Rafelski, and Jeffrey V. Winston, "Brain without mind: Computer simulation of neural networks with modifiable neuronal interactions," Physics Reports 123, 215–273 (1985).

Aaron Clauset, M. E. J. Newman, and Cristopher Moore, "Finding community structure in very large networks," Phys. Rev. E 70, 066111-1–6 (2004). This paper describes a faster algorithm than that discussed in Newman and Girvan.

J. P. Crutchfield and M. Mitchell, "The evolution of emergent computation," Proc. Natl. Acad. Sci. 92, 10742–10746 (1995). The authors use genetic algorithms to evolve a cellular automaton model.

Guillaume Deffuant, Frédéric Amblard, Gérard Weisbuch, and Thierry Faure, "How can extremism prevail? A study based on the relative agreement interaction model," J. Artificial Societies and Social Simulation 5 (4), paper 1 (2002). This paper and others can be found at .

Gary D. Doolen, Uriel Frisch, Brosl Hasslacher, Steven Orszag, and Stephen Wolfram, editors, Lattice Gas Methods for Partial Differential Equations (Addison–Wesley, 1990). A collection of reprints and original articles by many of the leading workers in lattice gas methods.

Stephanie Forrest, editor, Emergent Computation: Self-Organizing, Collective, and Cooperative Phenomena in Natural and Artificial Computing Networks (MIT Press, 1991).

Stephen I. Gallant, Neural Network Learning and Expert Systems (MIT Press, 1993).

M. Gardner, Wheels, Life and Other Mathematical Amusements (W. H. Freeman, 1983).

Peter Grassberger, "Efficient large-scale simulations of a uniformly driven system," Phys. Rev. E 49, 2436–2444 (1994). Grassberger considered substantially larger lattices and longer simulation times than those used by Olami et al. and found that the Olami, Feder, Christensen model does not exhibit power law scaling.

G. Grinstein and C. Jayaprakash, "Simple models of self-organized criticality," Computers in Physics 9, 164 (1995).

G. Grinstein, Terence Hwa, and Henrik Jeldtoft Jensen, "1/f^α noise in dissipative transport," Phys. Rev. A 45, R559–R562 (1992).

B. Hayes, "Computer recreations," Sci. Am. 250 (3), 12–21 (1984). An introduction to cellular automata.

J. E. Hanson and J. P. Crutchfield, "Computational mechanics of cellular automata: An example," Physica D 103, 169–189 (1997). The authors discuss emergence in cellular automata.

Robert Herman, editor, The Theory of Traffic Flow (Elsevier, 1961).

John Hertz, Anders Krogh, and Richard G. Palmer, Introduction to the Theory of Neural Computation (Addison–Wesley, 1991).

J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proc. Natl. Acad. Sci. USA 79, 2554–2558 (1982).
Navot Israeli and Nigel Goldenfeld, "Computational irreducibility and the predictability of complex physical systems," Phys. Rev. Lett. 92, 074105 (2004).

H. M. Jaeger, Chu-heng Liu, and Sidney R. Nagel, "Relaxation at the angle of repose," Phys. Rev. Lett. 62, 40 (1989). These authors discuss experiments on real sandpiles.

W. Klein, C. Ferguson, and J. B. Rundle, "Spinodals and scaling in slider block models," in Reduction and Predictability of Natural Disasters, J. B. Rundle, D. L. Turcotte, and W. Klein, editors (Addison–Wesley, 1995). Also see C. D. Ferguson, W. Klein, and John B. Rundle, "Spinodals, scaling, and ergodicity in a threshold model with long-range stress transfer," Phys. Rev. E 60, 1359–1373 (1999).

J. A. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, 1992).

Chris Langton, "Studying artificial life with cellular automata," Physica D 22, 120–149 (1986). See also Christopher G. Langton, editor, Artificial Life (Addison–Wesley, 1989); Christopher G. Langton, Charles Taylor, J. Doyne Farmer, and Steen Rasmussen, editors, Artificial Life II (Addison–Wesley, 1989); and Christopher G. Langton, editor, Artificial Life III (Addison–Wesley, 1994).

Roger Lewin, Complexity: Life at the Edge of Chaos (University of Chicago Press, 2000). A popular exposition of complexity theory.

Sergei Maslov, Maya Paczuski, and Per Bak, "Avalanches and 1/f noise in evolution and growth models," Phys. Rev. Lett. 73, 2162 (1994).

Stephan Mertens, "Computational complexity for physicists," Computing in Science and Engineering 4 (3), 31–47 (2002).

Takahiro Mori and Hikaru Kawamura, "Simulation study of the one-dimensional Burridge–Knopoff model of earthquakes," J. Geophysical Res. 111, B07302 (2006).

K. Nagel and M. Schreckenberg, "A cellular automaton model for freeway traffic," J. Phys. I France 2, 2221–2229 (1992). Also see .

Kai Nagel, Dietrich E. Wolf, Peter Wagner, and Patrice Simon, "Two-lane traffic rules for cellular automata: A systematic approach," Phys. Rev. E 58, 1425–1437 (1998).

M. E. J. Newman, "The structure and function of complex networks," SIAM Rev. 45, 167–256 (2003).

M. E. J. Newman, "Detecting community structure in networks," Eur. Phys. J. B 38, 321–330 (2004); M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E 69, 026113-1–15 (2004). These papers describe an algorithm for detecting the hierarchical structure of networks.

J. A. Niesse, R. P. White, and H. R. Mayne, "Genetic algorithm approaches to minimum energy geometry of aromatic hydrocarbon clusters," J. Chem. Phys. 108, 2208–2218 (1998).

Z. Olami, H. J. S. Feder, and K. Christensen, "Self-organized criticality in a continuous, nonconservative cellular automaton modeling earthquakes," Phys. Rev. Lett. 68, 1244 (1992).

Suzana Moss de Oliveira, Jorge S. Sá Martins, Paulo Murilo C. de Oliveira, Karen Luz-Burgoa, Armando Ticona, and Thadeu J. P. Penna, "The Penna model for biological aging and speciation," Computing in Science and Engineering 6 (3), 74–81 (2004). Also see Dietrich Stauffer, "The complexity of biological ageing," cond-mat/0310038.

Elaine S. Oran and Jay P. Boris, Numerical Simulation of Reactive Flow, 2nd ed. (Cambridge University Press, 2002). Although much of this book assumes an understanding of fluid dynamics, the discussion of simulation methods and the numerical solution of the differential equations of fluid flow does not require much background.
Michel Peyrard, "Nonlinear dynamics and statistical physics of DNA," Nonlinearity 17, R1–R40 (2004). The author describes a simple mechanical model of DNA [see Figure 10 and Eq. (1)] that is in the same spirit as the Burridge–Knopoff model of earthquakes.

William Poundstone, The Recursive Universe (Contemporary Books, 1985). A book on the Game of Life that attempts to draw analogies between the patterns of Life and ideas of information theory and cosmology.

Derek de Solla Price, "Networks of scientific papers," Science 149, 510–515 (1965); "A general theory of bibliometric and other cumulative advantage processes," J. Amer. Soc. Inform. Sci. 27, 292–306 (1976). Possibly the first description of a scale-free network and the explanation for power law distributions.

Daniel H. Rothman and Stéphane Zaleski, Lattice-Gas Cellular Automata (Cambridge University Press, 1997). This text includes a discussion of fluid flow through porous media as well as the lattice Boltzmann method for simulating fluids. Also see Daniel H. Rothman and Stéphane Zaleski, "Lattice-gas models of phase separation: Interfaces, phase transitions, and multiphase flow," Rev. Mod. Phys. 66, 1417–1479 (1994).

David E. Rumelhart and James L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations (MIT Press, 1986). See also Vol. 2 on applications.

Robert Savit, Radu Manuca, and Rick Riolo, "Adaptive competition, market efficiency, and phase transitions," Phys. Rev. Lett. 82, 2203 (1999). An analysis of the scaling behavior of the minority game.

Herbert A. Simon, "On a class of skew distribution functions," Biometrika 42, 425–440 (1955). An early paper that shows power laws arising from preferential attachment.

V. Sood and S. Redner, "Voter model on heterogeneous graphs," Phys. Rev. Lett. 94, 178701 (2005).

Dietrich Stauffer, "Monte Carlo simulations of Sznajd models," J. Artificial Societies and Social Simulation 5 (1), paper 4 (2002). This paper and other relevant papers can be found at .

Dietrich Stauffer, "Cellular automata," Chapter 9 in Fractals and Disordered Systems, Armin Bunde and Shlomo Havlin, editors (Springer–Verlag, 1991). Also see Dietrich Stauffer, "Programming cellular automata," Computers in Physics 5 (1), 62 (1991).

Daniel L. Stein, editor, Lectures in the Sciences of Complexity, Vol. 1 (Addison–Wesley, 1989); Erica Jen, editor, Lectures in Complex Systems, Vol. 2 (Addison–Wesley, 1990); Daniel L. Stein and Lynn Nadel, editors, Lectures in Complex Systems, Vol. 3 (Addison–Wesley, 1991).

Patrick Sutton and Sheri Boyden, "Genetic algorithms: A general search procedure," Am. J. Phys. 62, 549–552 (1994). This readable paper discusses the application of genetic algorithms to Ising models and function optimization.

K. Sznajd-Weron and J. Sznajd, "Opinion evolution in closed community," Int. J. Mod. Phys. C 11 (6), 1157–1165 (2000).

Tommaso Toffoli and Norman Margolus, Cellular Automata Machines: A New Environment for Modeling (MIT Press, 1987). See also Norman Margolus and Tommaso Toffoli, "Cellular automata machines," in the volume edited by Doolen et al.

D. J. Tritton, Physical Fluid Dynamics, 2nd ed. (Oxford Science Publications, 1988). An excellent introductory text that integrates theory and experiment. Although there is only a brief discussion of numerical work, the text provides the background useful for simulating fluids.
M. Mitchell Waldrop, Complexity: The Emerging Science at the Edge of Order and Chaos (Simon and Schuster, 1992). A popular exposition of complexity theory.

Stephen Wolfram, editor, Theory and Applications of Cellular Automata (World Scientific, 1986). A collection of research papers on cellular automata that range in difficulty from straightforward to specialists only. An extensive annotated bibliography also is given. Two papers in this collection that discuss the classification of one-dimensional cellular automata are S. Wolfram, "Statistical mechanics of cellular automata," Rev. Mod. Phys. 55, 601–644 (1983), and S. Wolfram, "Universality and complexity in cellular automata," Physica D 10, 1–35 (1984).

Stephen Wolfram, A New Kind of Science (Wolfram Media, 2002). This book discusses many important ideas and computer experiments on cellular automata. More information can be found at . An interesting review of this book is given by L. Kadanoff, "Wolfram on cellular automata," Phys. Today 55 (7), 55–56 (2002).

László Zalányi, Gábor Csárdi, Tamás Kiss, Máté Lengyel, Rebecca Warner, Jan Tobochnik, and Péter Érdi, "Properties of a random attachment growing network," Phys. Rev. E 68, 066104-1–9 (2003).

Chapter 15

Monte Carlo Simulations of Thermal Systems

We discuss how to simulate thermal systems using a variety of Monte Carlo methods, including the traditional Metropolis algorithm. Applications to the Ising model and various particle systems are discussed, and more efficient Monte Carlo algorithms are introduced.

15.1 Introduction

The Monte Carlo simulation of the particles-in-a-box problem discussed in Chapter 7 and the molecular dynamics simulations discussed in Chapter 8 exhibited some of the important qualitative features of macroscopic systems, such as the irreversible approach to equilibrium and the existence of equilibrium fluctuations in macroscopic quantities. In this chapter we apply various Monte Carlo methods to simulate the equilibrium properties of thermal systems. These applications will allow us to explore some of the important concepts of statistical mechanics. Due in part to the impact of computer simulations, the applications of statistical mechanics have expanded from the traditional areas of dense gases, liquids, crystals, and simple models of magnetism to the study of complex materials, particle physics, and theories of the early universe. For example, the demon algorithm introduced in Section 15.3 was developed by a physicist interested in lattice gauge theories, which are used to describe the interactions of fundamental particles.

15.2 The Microcanonical Ensemble

We first discuss an isolated system for which the number of particles N, the volume V, and the total energy E are fixed, and external influences such as gravitational and magnetic fields can be ignored. The macrostate of the system is specified by the values of E, V, and N. At the microscopic level, there are many different ways or configurations in which the macrostate (E, V, N) can be realized. A particular configuration or microstate is accessible if its properties are consistent with the specified macrostate.

All we know about the accessible microstates is that their properties are consistent with the known physical quantities of the system. Because we have no reason to prefer one microstate
over another when the system is in equilibrium, it is reasonable to postulate that the system is equally likely to be in any one of its accessible microstates. To make this postulate of equal a priori probabilities more precise, imagine an isolated system with Ω accessible states. The probability P_s of finding the system in microstate s is

P_s = 1/Ω if s is accessible, and P_s = 0 otherwise. (15.1)

The sum of P_s over all Ω states is equal to unity. Equation (15.1) is applicable only when the system is in equilibrium.

The averages of physical quantities can be determined in two ways. In the usual laboratory experiment, the physical quantities of interest are measured over a time interval sufficiently long to allow the system to sample a large number of its accessible microstates. We computed such time averages in Chapter 8, where we used the method of molecular dynamics to compute the time-averaged values of quantities such as the temperature and pressure. An interpretation of the probabilities in (15.1) that is consistent with such a time average is that, during a sequence of observations, P_s yields the fraction of times that a single system is found in a given microstate.

Although time averages are conceptually simple, it is convenient to imagine a collection or ensemble of systems that are identical mental copies characterized by the same macrostate but, in general, by different microstates. In this interpretation, the probabilities in (15.1) describe an ensemble of identical systems, and P_s is the probability that a system in the ensemble is in microstate s. An ensemble of systems specified by E, N, and V is called a microcanonical ensemble. An advantage of ensembles is that statistical averages can be determined by sampling the states according to the desired probability distribution. Much of the power of Monte Carlo methods is that we can devise sampling methods based on a fictitious dynamics that is more efficient than the real dynamics.

Suppose that a physical quantity A has the value A_s when the system is in microstate s. Then the ensemble average of A is given by

⟨A⟩ = Σ_{s=1}^{Ω} A_s P_s, (15.2)

where P_s is given by (15.1).

To illustrate these ideas, consider a one-dimensional system of N noninteracting spins on a lattice. The spins can be in one of two possible directions, which we take to be up or down. The total energy of the system is E = −µB Σ_i s_i, where each lattice site has associated with it a number s_i = ±1, with s_i = +1 for an up spin and s_i = −1 for a down spin; B is the magnetic field, and µ is the magnetic moment of a spin. A particular microstate of the system of spins is specified by the set of variables {s_1, s_2, ..., s_N}. In this case the macrostate of the system is specified by E and N.

E = 4µB:   ↓↓↓↓
E = 2µB:   ↑↓↓↓  ↓↑↓↓  ↓↓↑↓  ↓↓↓↑
E = 0:     ↑↑↓↓  ↑↓↑↓  ↑↓↓↑  ↓↑↑↓  ↓↑↓↑  ↓↓↑↑
E = −2µB:  ↑↑↑↓  ↑↑↓↑  ↑↓↑↑  ↓↑↑↑
E = −4µB:  ↑↑↑↑

Table 15.1: The sixteen microstates for a one-dimensional system of N = 4 noninteracting spins, grouped by the total energy E of each microstate. If the total energy of the system is E = −2µB, then there are four accessible microstates (the fourth row). Hence, in this case the ensemble consists of four systems, each in a different microstate with equal probability.

In Table 15.1 we show the 16 microstates with N = 4. If the total energy E = −2µB, we see that there are four accessible microstates.
Hence, in this case there are four systems in the ensemble, each with equal probability. The enumeration of the systems in the ensemble and their probabilities allows us to calculate ensemble averages for the physical quantities of interest.

Problem 15.1. A simple ensemble average
Consider a one-dimensional system of N = 4 noninteracting spins with total energy E = −2µB. What is the probability P_i that the ith spin is up? Does your answer depend on which spin you choose?

15.3 The Demon Algorithm

We found in Chapter 8 that we can do a time average for a system of many particles with E, V, and N fixed by integrating Newton's equations of motion for each particle and computing the time-averaged values of the physical quantities of interest. How can we do an ensemble average at fixed E, V, and N? And what can we do if there is no equation of motion available? One way would be to enumerate all the accessible microstates and calculate the ensemble average of the desired physical quantities as we did in Table 15.1. This approach is usually not practical because the number of microstates for even a small system is far too large to enumerate. In the spirit of Monte Carlo, we wish to develop a practical method of obtaining a representative sample of the total number of microstates. One possible procedure is to fix N, choose each spin to be up or down at random, and retain the configuration if it has the desired total energy. However, this procedure is very inefficient because most configurations would not have the desired total energy and would have to be discarded.

An efficient Monte Carlo procedure for simulating systems at a given energy was developed by Creutz in the context of lattice gauge theory. Suppose that we add an extra degree of freedom to the original macroscopic system of interest. For historical reasons, this extra degree of freedom is called a demon. The demon transfers energy as it attempts to change the dynamical variables of the system. If the desired change lowers the energy of the system, the excess energy is given to the demon. If the desired change raises the energy of the system, the demon gives the required energy to the system if the demon has sufficient energy. The only constraint is that the demon cannot have negative energy.

We first apply the demon algorithm to a one-dimensional classical system of N noninteracting particles of mass m (an ideal gas). The total energy of the system is E = Σ_i mv_i²/2, where v_i is the velocity of particle i. In general, the demon algorithm is summarized by the following steps:

1. Choose a particle at random and make a trial change in its coordinates.

2. Compute ∆E, the change in the energy of the system due to the change.

3. If ∆E ≤ 0, the system gives the amount |∆E| to the demon, that is, E_d = E_d − ∆E, and the trial configuration is accepted.

4. If ∆E > 0 and the demon has sufficient energy for this change (E_d ≥ ∆E), then the demon gives the necessary energy to the system, that is, E_d = E_d − ∆E, and the trial configuration is accepted. Otherwise, the trial configuration is rejected and the configuration is not changed.

The above steps are repeated until a representative sample of states is obtained. After a sufficient number of steps, the demon and the system will agree on an average energy for each.
The total energy of the system plus the demon remains constant, and because the demon is only one degree of freedom in comparison to the many degrees of freedom of the system, the energy fluctuations of the system will be of order 1/N, which is very small for N ≫ 1.

The ideal gas has a trivial dynamics. That is, because the particles do not interact, their velocities do not change. (The positions of the particles change, but the positions are irrelevant because the energy depends only on the velocities of the particles.) So the use of the demon algorithm is equivalent to a fictitious dynamics that lets us sample the microstates of the system. Of course, we do not need to apply the demon algorithm to an ideal gas because all its properties can be calculated analytically. However, it is a good idea to consider a simple example first.

How do we know that the Monte Carlo simulation of the microcanonical ensemble will yield results equivalent to the time-averaged results of molecular dynamics? The assumption that these two types of averages yield equivalent results is called the quasi-ergodic hypothesis. Although these two averages have not been proven to be identical in general, they have been found to yield equivalent results in all cases of interest.

IdealDemon and IdealDemonApp implement the microcanonical Monte Carlo simulation of the ideal classical gas in one dimension. To change a configuration, we choose a particle at random and change its velocity by a random amount. The parameter mcs, the number of Monte Carlo steps per particle, plays an important role in Monte Carlo simulations. On the average, the demon attempts to change the velocity of each particle once per Monte Carlo step per particle. We frequently refer to the number of Monte Carlo steps per particle as the "time," even though this time has no obvious direct relation to a physical time.

Listing 15.1: The demon algorithm for the one-dimensional ideal gas.

package org.opensourcephysics.sip.ch15;
public class IdealDemon {
  public double v[];
  public int N;
  public double systemEnergy;
  public double demonEnergy;
  public int mcs = 0; // number of MC moves per particle
  public double systemEnergyAccumulator = 0;
  public double demonEnergyAccumulator = 0;
  public int acceptedMoves = 0;
  public double delta; // maximum change in velocity

  public void initialize() {
    v = new double[N]; // array to hold particle velocities
    double v0 = Math.sqrt(2.0*systemEnergy/N);
    for(int i = 0; i<N; i++) {
      v[i] = v0; // same initial velocity for all particles
    }
    demonEnergy = 0;
    resetData();
  }

  public void doOneMCStep() { // (body reconstructed from steps 1-4 in the text)
    for(int j = 0; j<N; ++j) {
      int particleIndex = (int) (Math.random()*N);  // choose a particle at random
      double dv = (2.0*Math.random()-1.0)*delta;    // trial change in its velocity
      double trialVelocity = v[particleIndex]+dv;
      double dE = 0.5*(trialVelocity*trialVelocity-v[particleIndex]*v[particleIndex]);
      if(dE<=demonEnergy) { // accept if the demon can supply (or absorb) dE
        v[particleIndex] = trialVelocity;
        demonEnergy -= dE;
        systemEnergy += dE;
        acceptedMoves++;
      }
      systemEnergyAccumulator += systemEnergy;
      demonEnergyAccumulator += demonEnergy;
    }
    mcs++;
  }

  public void resetData() {
    mcs = 0;
    systemEnergyAccumulator = 0;
    demonEnergyAccumulator = 0;
    acceptedMoves = 0;
  }
}

The target class IdealDemonApp reads the input parameters and reports the averages; its principal methods are shown below (the methods that create the simulation and run the Monte Carlo steps are not reproduced here):

public void stopRunning() { // (enclosing method name reconstructed)
  double norm = 1.0/(idealGas.mcs*idealGas.N); // one attempted move per particle per mcs
  control.println("<Ed> = "+idealGas.demonEnergyAccumulator*norm);
  control.println("<E> = "+idealGas.systemEnergyAccumulator*norm);
  control.println("acceptance ratio = "+idealGas.acceptedMoves*norm);
}

public void reset() {
  control.setValue("Number of particles N", 40);
  control.setValue("desired total energy", 40);
  control.setValue("maximum velocity change", 2.0);
}

public void resetData() {
  idealGas.resetData();
  idealGas.delta = control.getDouble("maximum velocity change");
  control.clearMessages();
}

public static void main(String[] args) {
  SimulationControl control = SimulationControl.createApp(new IdealDemonApp());
  control.addButton("resetData", "Reset Data");
}

Problem 15.2. Monte Carlo simulation of an ideal gas

(a) Use the classes IdealDemon and IdealDemonApp to investigate the equilibrium properties of an ideal gas. Note that the mass of the particles has been set equal to unity and the initial demon energy is zero.
For simplicity, the same initial velocity has been assigned to all the particles. Begin by using the default values given in the listing of IdealDemonApp. What is the mean value of the particle velocities after equilibrium has been reached?

(b) The configuration corresponding to all particles having the same velocity is not very likely, and it would be better to choose an initial configuration that is more likely to occur when the system is in equilibrium. In any case, we should let the system evolve until it has reached equilibrium before we accumulate data for the various averages. We call this time the equilibration or relaxation time. We can estimate the equilibration time from a plot of the demon energy versus the time. Alternatively, we can reset the data until the computed averages stop changing systematically. Clicking the Reset Data button sets the accumulated sums to zero without changing the configuration. Determine the mean demon energy ⟨E_d⟩ and the mean system energy per particle using the default values of the parameters.

(c) Compute the mean energy of the demon and the mean system energy per particle for N = 100 with E = 10 and E = 20, where E is the total energy of the system. Use your result from part (b) to obtain an approximate relation between the mean demon energy and the mean system energy per particle.

(d) In the microcanonical ensemble the total energy is fixed with no reference to the temperature. Define the kinetic temperature by the relation (1/2)m⟨v²⟩ = (1/2)kT_kinetic, where (1/2)m⟨v²⟩ is the mean kinetic energy per particle of the system. Use this relation to obtain T_kinetic. Choose units such that m and Boltzmann's constant k are unity. How is T_kinetic related to the mean demon energy? How do your results compare to the relation given in introductory physics textbooks that the total energy of an ideal gas of N particles in three dimensions is E = (3/2)NkT? (In one dimension the analogous relation is E = (1/2)NkT.)

(e) A limitation of most simulations is the finite number of particles. Is the relation between the mean demon energy and the mean kinetic energy per particle the same for N = 2 and N = 10 as it is for N = 40? If there is no statistically significant difference between your results for the three values of N, explain why finite N might not be an important limitation for the ideal gas in this simulation.

Problem 15.3. Demon energy distribution

(a) Add a method to class IdealDemon to compute the probability P(E_d)∆E_d that the demon has energy between E_d and E_d + ∆E_d. Choose the same parameters as in Problem 15.2, and be sure to determine P(E_d) only after equilibrium has been reached.

(b) Plot the natural logarithm of P(E_d) and verify that ln P(E_d) depends linearly on E_d with a negative slope. What is the absolute value of the slope? How does the inverse of this value correspond to the mean energy of the demon and to T_kinetic as determined in Problem 15.2?

(c) Generalize the IdealDemon class and determine the relation between the mean demon energy, the mean energy per particle of the system, and the inverse of the slope of ln P(E_d) for an ideal gas in two and three dimensions. It is straightforward to write the class so that it is valid for any spatial dimension.

15.4 The Demon as a Thermometer

We found in Problem 15.3 that the form of P(E_d) is given by

P(E_d) ∝ e^(−E_d/kT). (15.3)

We also found that the parameter T in (15.3) is related to the kinetic temperature of an ideal gas.
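One way to carry out the measurement in Problem 15.3(a) is to accumulate a histogram of the demon energy after each attempted move. The following sketch shows what could be added to IdealDemon; the method names, bin width, and histogram size are our choices, not part of the text's listing:

public int[] demonEnergyHistogram = new int[1000]; // histogram of the demon energy
public double binWidth = 0.1;                      // arbitrary bin width ∆Ed

public void accumulateDemonEnergy() {
  int bin = (int) (demonEnergy/binWidth);
  if(bin<demonEnergyHistogram.length) {
    demonEnergyHistogram[bin]++;
  }
}

// P(Ed) is the histogram normalized by the number of entries and the bin width
public double probability(int bin, int totalEntries) {
  return demonEnergyHistogram[bin]/(binWidth*totalEntries);
}

Calling accumulateDemonEnergy once per attempted move after equilibration, and then plotting the logarithm of probability against binWidth times the bin index, yields the linear behavior asked for in part (b).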
In Problem 15.4 we will do some further simulations to determine the generality of the form (15.3).

Problem 15.4. The Boltzmann probability distribution
Modify your simulation of an ideal gas so that the kinetic energy of a particle is proportional to the absolute value of its momentum instead of the square of its momentum. Such a dependence would hold for a relativistic gas in which the particles move at velocities close to the speed of light. Choose various values of the total energy E and the number of particles N. Is the form of P(E_d) the same as in (15.3)? How does the inverse slope of ln P(E_d) versus E_d compare to the mean energy per particle of the system in this case?

According to the equipartition theorem of statistical mechanics, each quadratic degree of freedom contributes (1/2)kT to the mean energy per particle. Problem 15.4 shows that the equipartition theorem is not applicable when the particle energy depends differently on the momentum.

Although the microcanonical ensemble is conceptually simple, it does not represent the situation usually found in nature. Most systems are not isolated, but are in thermal contact with their environment. This thermal contact allows energy to be exchanged between the laboratory system and its environment. The laboratory system is usually small relative to its environment. The larger system with many more degrees of freedom is commonly referred to as the heat reservoir or heat bath. The term heat refers to energy transferred from one body to another due to a difference in temperature. A heat bath is a system for which such an energy transfer causes a negligible change in its temperature. A system that is in equilibrium with a heat bath is characterized by the temperature of the latter.

If we are interested in the equilibrium properties of such a system, we need to know the probability P_s of finding the system in microstate s with energy E_s. The ensemble that describes the probability distribution of a system in thermal equilibrium with a heat bath is known as the canonical ensemble. In general, the canonical ensemble is characterized by the temperature T, the number of particles N, and the volume V, in contrast to the microcanonical ensemble, which is characterized by the energy E, N, and V.

We have already discussed an example of a system in equilibrium with a heat bath: the demon! In Problems 15.2–15.4, the system of interest was an ideal gas, and the demon was an auxiliary (special) particle that facilitated the exchange of energy between the particles of the system. If we take the demon to be the system of interest, we see that the demon exchanges energy with a much bigger system (the ideal gas), which we can take to be the heat bath. We conclude that the probability distribution of the microstates of a system in equilibrium with a heat bath has the same form as the probability distribution of the energy of the demon. (Note that the microstate of the demon is characterized by its energy.) Hence, the probability that a system in equilibrium with a heat bath at temperature T is in microstate s with energy E_s has the form given by (15.3):

P_s = (1/Z) e^(−βE_s) (canonical distribution), (15.4)

where β = 1/kT and Z is a normalization constant. Because Σ_s P_s = 1, Z is given by

Z = Σ_s e^(−E_s/kT). (15.5)

The sum in (15.5) is over all the microstates of the system for a given N and V. The quantity Z is the partition function of the system.
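As a simple illustration of (15.4) and (15.5) (our example, not the text's), consider a single spin with energy ∓µB for the up and down states in a magnetic field B. The partition function has only two terms:

Z = e^(βµB) + e^(−βµB) = 2 cosh(βµB),

so the probability that the spin is up is P↑ = e^(βµB)/(2 cosh(βµB)), which approaches 1 as T → 0 and 1/2 as T → ∞, as expected.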
The ensemble defined by (15.4) is known as the canonical ensemble, and the probability distribution (15.4) is the Boltzmann or canonical distribution. The derivation of the Boltzmann distribution is given in textbooks on statistical mechanics. We will simulate systems in equilibrium with a heat bath in Section 15.6.

The partition function plays a key role in statistical mechanics because the (Helmholtz) free energy F of a system is defined as

$F = -kT \ln Z$.   (15.6)

All thermodynamic quantities can be found from various derivatives of F. In equilibrium the system will be in the state of minimum F for given values of T, V, and N. (This result follows from the second law of thermodynamics, which says that a system with fixed E, V, and N will be in the state of maximum entropy.) We will use the free energy concept in a number of the following sections.

The form (15.4) of P(E_d) provides a simple way of computing the temperature T from the mean demon energy ⟨E_d⟩. The latter is given by

$\langle E_d \rangle = \frac{\int_0^\infty E_d\, e^{-E_d/kT}\, dE_d}{\int_0^\infty e^{-E_d/kT}\, dE_d} = kT$.   (15.7)

We see that T is proportional to the mean demon energy. Note that the result ⟨E_d⟩ = kT in (15.7) holds only if the energy of the demon can take on a continuum of values and if the upper limit of integration can be taken to be ∞.

Figure 15.1: The interaction energy between nearest neighbor spins in the absence of an external magnetic field: E = −J for parallel spins and E = +J for antiparallel spins.

The demon is an excellent example of a thermometer. It has a measurable property, namely its energy, which is proportional to the temperature. Because the demon has only one degree of freedom in comparison to the many degrees of freedom of the system with which it exchanges energy, it disturbs the system as little as possible. For example, the demon could be added to a molecular dynamics simulation and provide an independent measure of the temperature.

15.5 The Ising Model

A popular model of a system of interacting variables is the Ising model. The model was proposed by Lenz and investigated by Ising, his graduate student, to study the phase transition from a paramagnet to a ferromagnet (cf. Brush). Ising calculated the thermodynamic properties of the model in one dimension and found that the model does not have a phase transition. However, for two and three dimensions the Ising model does exhibit a transition. The nature of the phase transition in two dimensions and some of the diverse applications of the Ising model are discussed in Section 15.7.

To introduce the Ising model, consider a lattice containing N sites and assume that each lattice site i has associated with it a number s_i, where s_i = ±1. The s_i are usually referred to as spins. The macroscopic properties of a system are determined by the nature of the accessible microstates. Hence, it is necessary to know the dependence of the energy on the configuration of spins. The total energy E of the Ising model is given by

$E = -J \sum_{i=1}^{N} \sum_{j=\mathrm{nn}(i)} s_i s_j - B \sum_{i=1}^{N} s_i$,   (15.8)

where B is proportional to the uniform external magnetic field. We will refer to B as the magnetic field, even though it includes a factor of µ. The first sum in (15.8) represents the energy of interaction of the spins and is over all nearest neighbor pairs. The exchange constant J is a measure of the strength of the interaction between nearest neighbor spins (see Figure 15.1). The second sum in (15.8) represents the energy of interaction between the magnetic moments of the spins and the external magnetic field.
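As a check of one's understanding of (15.8), the following sketch computes the total energy of a one-dimensional chain of spins with periodic boundary conditions (with J = 1). The method name and the use of a plain int array rather than the LatticeFrame class are our own choices; the method can be dropped into any class as a static helper.

    // Sketch of the energy sum (15.8) for a 1D chain with periodic boundary conditions.
    public static double totalEnergy(int[] spin, double B) {
        int N = spin.length;
        double E = 0;
        for(int i = 0; i < N; i++) {
            E -= spin[i]*spin[(i+1)%N]; // each nearest neighbor pair is counted once
            E -= B*spin[i];             // interaction with the external field
        }
        return E;
    }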
If J > 0, then the states ↑↑ and ↓↓ are energetically favored in comparison to the states ↑↓ and ↓↑. Hence, for J > 0, we expect that the state of lowest total energy is ferromagnetic; that is, the spins all point in the same direction. If J < 0, the states ↑↓ and ↓↑ are favored, and the state of lowest energy is expected to be antiferromagnetic; that is, neighboring spins point in opposite directions. If we subject the spins to an external magnetic field directed upward, the spins ↑ and ↓ possess an additional energy given by −B and +B, respectively.

An important virtue of the Ising model is its simplicity. Some of its simplifying features are that the kinetic energy of the atoms associated with the lattice sites has been neglected, only nearest neighbor contributions to the interaction energy are included, and the spins are allowed to have only two discrete values. In spite of the simplicity of the model, we will find that the Ising model exhibits very interesting behavior.

Because we are interested in the properties of an infinite system, we have to choose appropriate boundary conditions. The simplest boundary condition in one dimension is to choose a free surface so that the spins at sites 1 and N each have only one nearest neighbor interaction. Usually a better choice is periodic boundary conditions. For this choice a one-dimensional lattice becomes a ring, and the spins at sites 1 and N interact with one another and, hence, have the same number of interactions as do the other spins.

What are some of the physical quantities whose averages we wish to compute? An obvious physical quantity is the magnetization M given by

$M = \sum_{i=1}^{N} s_i$,   (15.9)

and the magnetization per spin m = M/N. Usually we are interested in the average values ⟨M⟩ and the fluctuations ⟨M²⟩ − ⟨M⟩².

For the familiar case of classical particles with continuously varying position and velocity coordinates, the dynamics is given by Newton's laws. For the Ising model the dependence (15.8) of the energy on the spin configuration is not sufficient to determine the time-dependent properties of the system. That is, the relation (15.8) does not tell us how the system changes from one configuration to another, and we have to introduce the dynamics separately. This dynamics will take the form of various Monte Carlo algorithms.

We first use the demon algorithm to sample configurations of the Ising model. The implementation of the demon algorithm is straightforward. We first choose a spin at random. The trial change corresponds to a flip of the spin from ↑ to ↓ or from ↓ to ↑. We then compute the change in energy of the system and decide whether to accept or reject the trial change.

We can determine the temperature T as a function of the energy of the system in two ways. One way is to measure the probability that the demon has energy E_d. Because we know that this probability is proportional to exp(−E_d/kT), we can determine T from a plot of the logarithm of the probability as a function of E_d. Another way to determine T is to measure the mean demon energy. However, because the possible values of E_d are not continuous for the Ising model, T is not simply proportional to ⟨E_d⟩ as it is for the ideal gas. We show in Appendix 15A that for B = 0 and in the limit of an infinite system, the temperature is related to ⟨E_d⟩ by

$\frac{kT}{J} = \frac{4}{\ln\left(1 + 4J/\langle E_d \rangle\right)}$.   (15.10)

The result (15.10) comes from replacing the integrals in (15.7) by sums over the possible demon energies.
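The idea behind (15.10) can be sketched as follows (a condensed version of the argument given in Appendix 15A). For the one-dimensional Ising model a single spin flip changes the energy by 0 or ±4J, so the demon energy takes the values E_d = 4nJ with n = 0, 1, 2, … If the demon energy is Boltzmann distributed, then with x = e^{−4βJ},

$\langle E_d \rangle = \frac{\sum_{n=0}^{\infty} 4nJ\, x^n}{\sum_{n=0}^{\infty} x^n} = 4J\,\frac{x}{1-x} = \frac{4J}{e^{4\beta J} - 1},$

where we used the geometric series and its derivative. Solving for β gives $4\beta J = \ln(1 + 4J/\langle E_d\rangle)$, which is (15.10).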
Note that in the limit |J/⟨E_d⟩| ≪ 1, (15.10) reduces to kT = ⟨E_d⟩ as expected.

The IsingDemon class implements the Ising model in one dimension using periodic boundary conditions and the demon algorithm. Once the initial configuration is chosen, the demon algorithm is similar to that described in Section 15.3. However, the spins in the one-dimensional Ising model must be chosen at random. As usual, we will choose units such that J = 1.

Listing 15.3: The implementation of the demon algorithm for the one-dimensional Ising model.

    package org.opensourcephysics.sip.ch15;
    import java.awt.*;
    import org.opensourcephysics.frames.*;

    public class IsingDemon {
        public int[] demonEnergyDistribution;
        int N;                       // number of spins
        public int systemEnergy;
        public int demonEnergy = 0;
        public int mcs = 0;          // number of MC steps per spin
        public double systemEnergyAccumulator = 0;
        public double demonEnergyAccumulator = 0;
        public int magnetization = 0;
        public double mAccumulator = 0, m2Accumulator = 0;
        public int acceptedMoves = 0;
        private LatticeFrame lattice;

        public IsingDemon(LatticeFrame displayFrame) {
            lattice = displayFrame;
        }

        public void initialize(int N) {
            this.N = N;
            lattice.resizeLattice(N, 1); // set lattice size
            lattice.setIndexedColor(1, Color.red);
            lattice.setIndexedColor(-1, Color.green);
            demonEnergyDistribution = new int[N];
            for(int i = 0; i < N; i++) {
                lattice.setValue(i, 0, 1); // all spins up
            }
            magnetization = N;
            int tries = 0;
            int E = -N; // energy of the all spins up configuration
            // flip spins at random until the system has the desired energy
            while((E < systemEnergy) && (tries < 10*N)) {
                int k = (int) (N*Math.random());
                int dE = 2*lattice.getValue(k, 0)
                        *(lattice.getValue((k+1)%N, 0)+lattice.getValue((k-1+N)%N, 0));
                if(dE > 0) {
                    E += dE;
                    int newSpin = -lattice.getValue(k, 0);
                    lattice.setValue(k, 0, newSpin);
                    magnetization += 2*newSpin;
                }
                tries++;
            }
            systemEnergy = E;
            resetData();
        }

        public double temperature() {
            return 4.0/Math.log(1.0+4.0/(demonEnergyAccumulator/(mcs*N))); // relation (15.10)
        }

        public void resetData() {
            mcs = 0;
            systemEnergyAccumulator = 0;
            demonEnergyAccumulator = 0;
            mAccumulator = 0;
            m2Accumulator = 0;
            acceptedMoves = 0;
        }

        public void doOneMCStep() {
            for(int j = 0; j < N; j++) {
                int i = (int) (N*Math.random()); // the spins must be chosen at random
                int dE = 2*lattice.getValue(i, 0)
                        *(lattice.getValue((i+1)%N, 0)+lattice.getValue((i-1+N)%N, 0));
                if(dE <= demonEnergy) {          // accept the flip if the demon can pay
                    int newSpin = -lattice.getValue(i, 0);
                    lattice.setValue(i, 0, newSpin);
                    demonEnergy -= dE;
                    systemEnergy += dE;
                    acceptedMoves++;
                    magnetization += 2*newSpin;
                }
                systemEnergyAccumulator += systemEnergy;
                demonEnergyAccumulator += demonEnergy;
                mAccumulator += magnetization;
                m2Accumulator += magnetization*magnetization;
                demonEnergyDistribution[demonEnergy]++;
            }
            mcs++;
        }
    }

15.6 The Metropolis Algorithm

The method doOneMCStep of class BoltzmannApp implements the Metropolis algorithm for a single particle in one dimension in equilibrium with a heat bath at temperature T: a trial change in the particle's velocity is accepted with probability one if it lowers the energy, and with probability e^{−βΔE} otherwise.

Listing 15.4: The doOneMCStep, reset, and main methods of class BoltzmannApp.

    public void doOneMCStep() {
        double vTrial = velocity+delta*(2.0*Math.random()-1.0); // trial velocity
        double keTrial = 0.5*vTrial*vTrial;
        double dE = keTrial-ke;                                 // trial change in energy
        mcs++;
        if((dE <= 0) || (Math.exp(-dE/temperature) > Math.random())) {
            accepted++;
            ke = keTrial;
            velocity = vTrial;
        }
        velocityDistribution.append(velocity);
        control.clearMessages();
        control.println("mcs = "+mcs);
        control.println("acceptance probability = "+(double) (accepted)/mcs);
    }

    public void reset() {
        control.setValue("Maximum velocity change", 10.0);
        control.setValue("Temperature", 10.0);
        control.setValue("Initial velocity", 0.0);
        enableStepsPerDisplay(true);
    }

    public static void main(String[] args) {
        SimulationControl.createApp(new BoltzmannApp());
    }

Problem 15.8. Simulation of a particle in equilibrium with a heat bath

(a) Choose the temperature T = 10, the initial velocity equal to zero, and the maximum change in the particle's velocity to be δ = 10.0. Run for a number of Monte Carlo steps until a plot of ln P(v) versus v is reasonably smooth. Describe the qualitative form of P(v). (Remember that the velocity v can be either positive or negative.)

(b) Because the velocity of the particle characterizes the microstate of this single particle system, we need to plot ln P(E_s) versus E_s = mv_s²/2 to test whether the Metropolis algorithm yields the Boltzmann distribution in this case. (The two values of v, one positive and one negative, for each value of E correspond to different microstates.)
Add code to BoltzmannApp to compute P(E_s) and determine the slope of ln P(E_s) versus E_s. The code for extracting information from the HistogramFrame class is given on page 206. Is this slope equal to −β = −1/T, where T is the temperature of the heat bath?

(c) Add code to compute the mean energy and velocity. How do your results for the mean energy compare to the exact value? Explain why the computed mean particle velocity is approximately zero regardless of the value of the initial particle velocity. To ensure that your results do not depend on the initial conditions, repeat the computation for a different initial velocity. Do your equilibrium results differ from what you found previously?

(d) Add another HistogramFrame object to compute the probability P(E)ΔE, where E is the energy of the configuration. Does P(E) have the form of a Boltzmann distribution? If not, what is the functional form of P(E)?

(e) The acceptance probability is the fraction of trial moves that are accepted. What is the effect of changing the value of δ on the acceptance probability?

Problem 15.9. Planar spin in an external magnetic field

(a) Consider a classical planar magnet with magnetic moment µ₀. The magnet can be oriented in any direction in the x-y plane, and the energy of interaction of the magnet with an external magnetic field B is −µ₀B cos φ, where φ is the angle between the moment and B. Write a Monte Carlo program to sample the microstates of this system in thermal equilibrium with a heat bath at temperature T. Compute the mean energy as a function of the ratio βµ₀B.

(b) Compute the probability density P(φ) and analyze its dependence on the energy.

In Problem 15.10 we consider the Monte Carlo simulation of a classical ideal gas of N particles in equilibrium with a heat bath. It is convenient to say that one time unit or one Monte Carlo step per particle (mcs) has elapsed after N particles have had a chance to change their coordinates. If the particles are chosen at random, then during one Monte Carlo step per particle, some particles might not be chosen, but all particles will be chosen equally on the average. The advantage of this definition is that the time is independent of the number of particles. However, this definition of time has no obvious relation to a physical time.

Problem 15.10. Simulation of an ideal gas in one dimension

(a) Modify class BoltzmannApp to simulate an ideal gas of N particles in one dimension. For simplicity, assume that all particles have the same initial velocity of 10. Let N = 20 and T = 10 and consider at least 2000 Monte Carlo steps per particle. Choose the value of δ so that the acceptance probability is approximately 40%. What are the mean kinetic energy and mean velocity of the particles?

(b) We might expect the total energy of an ideal gas to remain constant because the particles do not interact with one another and, hence, cannot exchange energy directly. What is the value of the initial total energy of the system in part (a)? Does the total energy remain constant? If not, explain how the energy changes.

(c) What is the nature of the time dependence of the total energy starting from the initial condition in part (a)? Estimate the number of Monte Carlo steps per particle necessary for the system to reach thermal equilibrium by computing a moving average of the total energy over a fixed time interval. Does this average change with time after a sufficient time has elapsed?
What choice of the initial velocities allows the system to reach thermal equilibrium at temperature T as quickly as possible?

(d) Compute the probability P(E)ΔE for the system of N particles to have a total energy between E and E + ΔE. Plot P(E) as a function of E and describe its qualitative behavior. Does P(E) have the form of the Boltzmann distribution? If not, describe the qualitative features of P(E) and determine its functional form.

(e) Compute the mean energy for T = 10, 20, 40, 80, and 120 and estimate the heat capacity from its definition C = ∂⟨E⟩/∂T.

(f) Compute the mean square energy fluctuations ⟨(ΔE)²⟩ = ⟨E²⟩ − ⟨E⟩² for T = 10 and T = 40. Compare the magnitude of the ratio ⟨(ΔE)²⟩/T² with the heat capacity determined in part (e).

You might have been surprised to find in Problem 15.10(d) that the form of P(E) is a Gaussian centered about the mean energy of the system. What is the relation of this form of P(E) to the central limit theorem (see Problem 7.15)? If the microstates are distributed according to the Boltzmann probability, why is the total energy distributed according to the Gaussian distribution?

15.7 Simulation of the Ising Model

You are probably familiar with ferromagnetic materials, such as iron and nickel, which exhibit a spontaneous magnetization in the absence of an applied magnetic field. This nonzero magnetization occurs only if the temperature is less than a well-defined temperature known as the Curie or critical temperature T_c. For temperatures T > T_c, the magnetization vanishes. Hence, T_c separates the disordered phase for T > T_c from the ferromagnetic phase for T < T_c.

The origin of magnetism is quantum mechanical in nature, and its study is of much experimental and theoretical interest. However, the study of simple classical models of magnetism has provided much insight. The two- and three-dimensional Ising model is the most commonly studied classical model and is particularly useful in the neighborhood of the magnetic phase transition.

The thermal quantities of interest for the Ising model include the mean energy ⟨E⟩ and the heat capacity C. One way to determine C at constant external magnetic field is from its definition C = ∂⟨E⟩/∂T. An alternative way is to relate C to the statistical fluctuations of the total energy in the canonical ensemble (see Appendix 15B):

$C = \frac{1}{kT^2}\left[\langle E^2\rangle - \langle E\rangle^2\right]$   (canonical ensemble).   (15.19)

Another quantity of interest is the mean magnetization ⟨M⟩ and the corresponding zero field magnetic susceptibility:

$\chi = \left.\frac{\partial \langle M\rangle}{\partial B}\right|_{B=0}$.   (15.20)

The zero field magnetic susceptibility χ is an example of a linear response function, because it measures the ability of a spin to respond to a change in the external magnetic field. In analogy to the heat capacity, χ is related to the fluctuations of the magnetization (see Appendix 15C):

$\chi = \frac{1}{kT}\left[\langle M^2\rangle - \langle M\rangle^2\right]$,   (15.21)

where ⟨M⟩ and ⟨M²⟩ are evaluated in zero external magnetic field. The relations (15.19) and (15.21) are examples of the general relation between linear response functions and equilibrium fluctuations.

The Metropolis algorithm was stated in Section 15.6 as a method for generating states with the desired Boltzmann probability, but the flipping of single spins can also be interpreted as a reasonable approximation to the real dynamics of an anisotropic magnet whose spins are coupled to the vibrations of the lattice.
The coupling leads to random spin flips, and we expect that one Monte Carlo step per spin is proportional to the average time between single spin flips observed in the laboratory. Hence, we can regard single spin flips as a time dependent process and observe the relaxation to equilibrium. In the following, we will frequently refer to the application of the Metropolis algorithm to the Ising model as single spin flip dynamics.

In Problem 15.11 we use the Metropolis algorithm to simulate the one-dimensional Ising model. Note that the parameters J and kT do not appear separately but appear together in the dimensionless ratio J/kT. Unless otherwise stated, we measure temperature in units of J/k and set B = 0.

Problem 15.11. One-dimensional Ising model

(a) Write a program to simulate the one-dimensional Ising model in equilibrium with a heat bath. Modify method doOneMCStep in IsingDemon (see class IsingDemon on page 591 or class Ising on page 602). Use periodic boundary conditions. Assume that the external magnetic field is zero. Draw the microscopic state (configuration) of the system after each Monte Carlo step per spin.

(b) Choose N = 20 and T = 1 and start with all spins up. What is the initial effective temperature of the system? Run for at least 1000 mcs, where mcs is the number of Monte Carlo steps per spin. Visually inspect the configuration of the system after each Monte Carlo step per spin and estimate the time it takes for the system to reach equilibrium. Does the sign of the magnetization change during the simulation? Increase N and estimate the time for the system to reach equilibrium and for the magnetization to change sign.

(c) Change the initial condition so that the orientation of each spin is chosen at random. What is the initial effective temperature of the system in this case? Estimate the time it takes for the system to reach equilibrium.

(d) Choose N = 50 and determine ⟨E⟩, ⟨E²⟩, and ⟨M²⟩ as a function of T in the range 0.1 ≤ T ≤ 5. Plot ⟨E⟩ as a function of T and discuss its qualitative features. Compare your computed results for ⟨E⟩ to the exact result (for B = 0)

$E(T) = -N \tanh \beta J$.   (15.22)

Use the relation (15.19) to determine the T-dependence of C.

(e) As you probably noticed in part (b), the system can overturn completely during a long run, and thus the value of ⟨M⟩ can vary widely from run to run. Because ⟨M⟩ = 0 for T > 0 for the one-dimensional Ising model, it is better to assume ⟨M⟩ = 0 and compute χ from the relation χ = ⟨M²⟩/kT. Use this form of (15.21) to estimate the T-dependence of χ.

(f) One of the best laboratory realizations of a one-dimensional Ising ferromagnet is a chain of bichloride-bridged Fe²⁺ ions known as FeTAC (see Greeney et al.). Measurements of χ yield a value of the exchange interaction J given by J/k = 17.4 K. (Note that experimental values of J are typically given in temperature units.) Use this value of J to plot your Monte Carlo results for χ versus T with T given in kelvin. At what temperature is χ a maximum for FeTAC?

(g) Is the acceptance probability an increasing or decreasing function of T? Does the Metropolis algorithm become more or less efficient as the temperature is lowered?

(h) Compute the probability P(E) for a system of N = 50 spins at T = 1. Run for at least 1000 mcs. Plot ln P(E) versus (E − ⟨E⟩)² and discuss its qualitative features.

We next apply the Metropolis algorithm to the Ising model on the square lattice. The Ising class is listed in the following.
Listing 15.5: The Ising class.

    package org.opensourcephysics.sip.ch15;
    import java.awt.*;
    import org.opensourcephysics.frames.*;

    public class Ising {
        public static final double criticalTemperature = 2.0/Math.log(1.0+Math.sqrt(2.0));
        public int L = 32;
        public int N = L*L;                  // number of spins
        public double temperature = criticalTemperature;
        public int mcs = 0;                  // number of MC moves per spin
        public int energy;
        public double energyAccumulator = 0;
        public double energySquaredAccumulator = 0;
        public int magnetization = 0;
        public double magnetizationAccumulator = 0;
        public double magnetizationSquaredAccumulator = 0;
        public int acceptedMoves = 0;
        private double[] w = new double[9];  // array to hold Boltzmann factors
        public LatticeFrame lattice;

        public void initialize(int L, LatticeFrame displayFrame) {
            lattice = displayFrame;
            this.L = L;
            N = L*L;
            lattice.resizeLattice(L, L);     // set lattice size
            lattice.setIndexedColor(1, Color.red);
            lattice.setIndexedColor(-1, Color.green);
            for(int i = 0; i < L; ++i) {
                for(int j = 0; j < L; ++j) {
                    lattice.setValue(i, j, 1); // all spins up
                }
            }
            magnetization = N;
            energy = -2*N;                   // energy of the all spins up configuration
            resetData();
            w[8] = Math.exp(-8.0/temperature); // other positive values of dE do not occur
            w[4] = Math.exp(-4.0/temperature);
        }

        public void resetData() {
            mcs = 0;
            energyAccumulator = 0;
            energySquaredAccumulator = 0;
            magnetizationAccumulator = 0;
            magnetizationSquaredAccumulator = 0;
            acceptedMoves = 0;
        }

        public void doOneMCStep() {
            for(int k = 0; k < N; ++k) {
                int i = (int) (L*Math.random());
                int j = (int) (L*Math.random());
                int dE = 2*lattice.getValue(i, j)
                        *(lattice.getValue((i+1)%L, j)+lattice.getValue((i-1+L)%L, j)
                         +lattice.getValue(i, (j+1)%L)+lattice.getValue(i, (j-1+L)%L));
                if((dE <= 0) || (w[dE] > Math.random())) {
                    int newSpin = -lattice.getValue(i, j);
                    lattice.setValue(i, j, newSpin);
                    acceptedMoves++;
                    energy += dE;
                    magnetization += 2*newSpin;
                }
            }
            energyAccumulator += energy;
            energySquaredAccumulator += energy*energy;
            magnetizationAccumulator += magnetization;
            magnetizationSquaredAccumulator += magnetization*magnetization;
            mcs++;
        }
    }

One of the most time consuming parts of the Metropolis algorithm is the calculation of the exponential function e^{−βΔE}. Because there are only a small number of possible values of βΔE for the Ising model (see Figure 15.1), we store the small number of different probabilities for the spin flips in the array w. The values of this array are computed in method initialize.

To implement the Metropolis algorithm, we determine the change in the energy ΔE and then accept the trial flip if ΔE ≤ 0. If this condition is not satisfied, we generate a random number in the unit interval and compare it to e^{−βΔE}. We can use a single if statement for these two conditions because in Java (and C/C++) the second condition of an || (or) statement is evaluated only if the first is false. This feature is very useful because we do not want to generate random numbers when they are not needed, as is the case for ΔE ≤ 0. (The same feature holds for the compound && (and) statement, for which the second condition is evaluated only if the first is true.)

A typical laboratory system has at least 10¹⁸ spins. In contrast, the number of spins that can be simulated typically ranges from 10³ to 10⁹. As we have discussed in other contexts, the use of periodic boundary conditions minimizes finite size effects. However, more sophisticated boundary conditions are sometimes convenient. For example, we can give the surface spins extra neighbors whose direction is related to the mean magnetization of the microstate (see Saslow).

In class Ising, data for the values of the physical observables are accumulated after each Monte Carlo step per spin. The optimum time for sampling various physical quantities is explored in Problem 15.13. Note that if a flip is rejected, the old configuration is retained. Thermal equilibrium is not described properly unless the old configuration is again included in computing the averages.
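For the target class of Problem 15.12 below, the thermodynamic averages can be obtained from the accumulators of class Ising. The following methods are a sketch of one way to do so, based on (15.19) and (15.21) with k = 1; the method names are our own and are not required by class Ising.

    // Sketch of per-spin averages computed from the accumulators of class Ising.
    public double specificHeat() {
        double energySquaredAverage = energySquaredAccumulator/mcs;
        double energyAverage = energyAccumulator/mcs;
        double heatCapacity = energySquaredAverage - energyAverage*energyAverage;
        heatCapacity /= (temperature*temperature); // relation (15.19) with k = 1
        return heatCapacity/N;                     // per spin
    }

    public double susceptibility() {
        double magnetizationSquaredAverage = magnetizationSquaredAccumulator/mcs;
        double magnetizationAverage = magnetizationAccumulator/mcs;
        // relation (15.21) with k = 1, per spin
        return (magnetizationSquaredAverage - magnetizationAverage*magnetizationAverage)
               /(temperature*N);
    }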
Achieving thermal equilibrium can account for a substantial fraction of the total run time for very large systems. The most practical choice of initial conditions in these cases is a configuration from a previous run that is at a temperature close to the desired temperature. The code for reading and saving configurations can be found in Appendix 8A.

Problem 15.12. Equilibration of the two-dimensional Ising model

(a) Write a target class that uses class Ising and plots the magnetization and energy as a function of the number of Monte Carlo steps. Your program should also display the mean magnetization, the mean energy, the specific heat, the susceptibility, and the acceptance probability when the simulation is stopped. Averages such as the mean energy and the susceptibility should be normalized by the number of spins so that it is easy to compare systems with different values of N. Choose the linear dimension L = 32 and the heat bath temperature T = 2. Estimate the time needed to equilibrate the system given that all the spins are initially up.

(b) Visually determine whether the spin configurations are "ordered" or "disordered" at T = 2 after equilibrium has been established.

(c) Repeat part (a) with the initial direction of each spin chosen at random. Make sure you explicitly compute the initial energy and magnetization in initialize. Does the equilibration time increase or decrease?

(d) Repeat parts (a)–(c) for T = 2.5.

Problem 15.13. Comparison with exact results

In general, a Monte Carlo simulation yields exact answers only after an infinite number of configurations have been sampled. How then can we be sure that our program works correctly and that our results are statistically meaningful? One way is to reproduce exact results in known limits. In the following, we test class Ising by considering a small system for which the mean energy and magnetization can be calculated analytically.

(a) Calculate analytically the T-dependence of E, M, C, and χ for the Ising model on the square lattice with L = 2. (A summary of the calculation is given in Appendix 15C. For simplicity, we have omitted the brackets denoting the thermal averages.) A brute-force numerical check by direct enumeration is sketched after this problem.

(b) Simulate the Ising model with L = 2 and estimate E, M, C, and χ for T = 0.5 and 0.25. Use the relation (15.19) to compute C. Compare your estimated values to the exact results found in part (a). Approximately how many Monte Carlo steps per spin are necessary to obtain E and M to within 1%? How many Monte Carlo steps per spin are necessary to obtain C to within 1%?

(c) Choose L = 4 and the direction of each spin at random and equilibrate the system at T = 3. Look at the time series of M and E after every Monte Carlo step per spin and estimate how often M changes sign. Does E change sign when M changes sign? How often does M change sign for L = 8 and L = 32 (and T = 3)? Although the direction of the spins is initially chosen at random, it is likely that the number of up spins will not exactly cancel the number of down spins. Is that statement consistent with your observations? If the net number of spins is up, how long does the net magnetization remain positive for a given value of L?

(d) The calculation of χ is more complicated because the sign of M can change during the simulation for smaller values of L. Compare your results for χ from using (15.21) as written and from using (15.21) with ⟨M⟩ replaced by ⟨|M|⟩. Which way of computing χ gives more accurate results?
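The following self-contained sketch (our own, not part of the text's classes) checks the analytic calculation of Problem 15.13(a) by enumerating all 2⁴ = 16 microstates of the L = 2 lattice with periodic boundary conditions (J = 1, B = 0, k = 1). On a 2 × 2 lattice with periodic boundary conditions each nearest neighbor bond is counted twice.

    // Exact enumeration of the 2 x 2 Ising model with periodic boundary conditions.
    public class ExactIsing2x2 {
        public static void main(String[] args) {
            double T = 2.0; // illustrative temperature
            double Z = 0, Esum = 0, E2sum = 0;
            for(int state = 0; state < 16; state++) {
                int[] s = new int[4];
                for(int k = 0; k < 4; k++) {
                    s[k] = (((state >> k) & 1) == 1) ? 1 : -1; // decode the microstate
                }
                // sites indexed 0..3; periodic boundary conditions double each bond
                double E = -2.0*(s[0]*s[1] + s[2]*s[3] + s[0]*s[2] + s[1]*s[3]);
                double boltz = Math.exp(-E/T);
                Z += boltz;
                Esum += E*boltz;
                E2sum += E*E*boltz;
            }
            double meanE = Esum/Z, meanE2 = E2sum/Z;
            double C = (meanE2 - meanE*meanE)/(T*T); // relation (15.19)
            System.out.println("<E> = " + meanE + ", C = " + C);
        }
    }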
Now that you have checked your program and obtained typical equilibrium configurations, we consider in more detail the calculation of the mean values of the physical quantities of interest. Suppose we wish to compute the mean value of the physical quantity A. In some cases, the calculation of A for a given configuration is time consuming, and we do not want to compute its value more often than necessary. For example, we would not compute A after the flip of only one spin because the values of A in the two configurations would be almost the same. Ideally, we wish to compute A for configurations that are statistically independent. Because we do not know a priori the mean number of spin flips needed to obtain configurations that are statistically independent, it is a good idea to estimate this time in your preliminary calculations.

One way to estimate the time interval over which configurations are correlated is to compute the time displaced autocorrelation function C_A(t) defined as

$C_A(t) = \frac{\langle A(t+t_0)A(t_0)\rangle - \langle A\rangle^2}{\langle A^2\rangle - \langle A\rangle^2}$,   (15.23)

where A(t) is the value of the quantity A at time t. The averages in (15.23) are over all possible time origins t₀. Because the choice of the time origin is arbitrary for an equilibrium system, C_A depends only on the time difference t rather than on t and t₀ separately. For sufficiently large t, A(t) and A(0) become uncorrelated, and hence ⟨A(t+t₀)A(t₀)⟩ → ⟨A(t+t₀)⟩⟨A(t₀)⟩ = ⟨A⟩². Hence C_A(t) → 0 as t → ∞. Also, C_A(t = 0) is normalized to unity. In general, C_A(t) decays exponentially with t with a decay or correlation time τ_A whose magnitude depends on the choice of the physical quantity A as well as on the physical parameters of the system, for example, the temperature. The time dependence of the two most common correlation functions, C_M(t) and C_E(t), is investigated in Problem 15.14.

As an example of the calculation of C_E(t), consider the equilibrium time series for E for the L = 4 Ising model on the square lattice at T = 4:

−4, −8, 0, −8, −20, −4, 0, 0, −24, −32, −24, −24, −8, −8, −16, −12.

The averages of E and E² over these sixteen values are ⟨E⟩ = −12, ⟨E²⟩ = 240, and ⟨E²⟩ − ⟨E⟩² = 96. We wish to compute ⟨E(t)E(0)⟩ for all possible choices of the time origin. For example, ⟨E(4)E(0)⟩ is given by

$\langle E(4)E(0)\rangle = \frac{1}{12}\big[(-20)(-4) + (-4)(-8) + (0)(0) + (0)(-8) + (-24)(-20) + (-32)(-4) + (-24)(0) + (-24)(0) + (-8)(-24) + (-8)(-32) + (-16)(-24) + (-12)(-24)\big]$.   (15.24)

We averaged over the twelve possible choices of the origin for the time difference t = 4. Verify that ⟨E(4)E(0)⟩ = 460/3 and C_E(4) = 7/72.

To implement this procedure on a computer, we could store the time series in memory, if it is not too long, or save it in a data file. You can save the data for M(t) and E(t) by pressing the Save XML menu item under the File menu on the frame containing the plots for M(t) and E(t). The class IsingAutoCorrelatorApp in Listing 15.6 reads in data created by the IsingApp class. Method computeCorrelation computes the mean and mean square of the magnetization and the energy, which are needed to compute C_M and C_E as defined in (15.23). It then computes the time displaced autocorrelation function for all possible choices of t₀.
Listing 15.6: Listing of the class for computing the autocorrelation functions of M and E.

    package org.opensourcephysics.sip.ch15;
    import java.util.*;
    import javax.swing.*;
    import org.opensourcephysics.controls.*;
    import org.opensourcephysics.display.*;
    import org.opensourcephysics.frames.*;

    public class IsingAutoCorrelatorApp extends AbstractCalculation {
        PlotFrame plotFrame = new PlotFrame("tau", "C_M and C_E", "Time correlations");
        double[] energy = new double[0], magnetization = new double[0];
        int numberOfPoints;

        public void calculate() {
            computeCorrelation(control.getInt("Maximum time interval, tau"));
        }

        public void readXMLData() {
            energy = new double[0];
            magnetization = new double[0];
            numberOfPoints = 0;
            String filename = "ising_data.xml";
            JFileChooser chooser = OSPFrame.getChooser();
            int result = chooser.showOpenDialog(null);
            if(result==JFileChooser.APPROVE_OPTION) {
                filename = chooser.getSelectedFile().getAbsolutePath();
            } else {
                return;
            }
            XMLControlElement xmlControl = new XMLControlElement(filename);
            if(xmlControl.failedToRead()) {
                control.println("failed to read: "+filename);
            } else {
                // gets the datasets in the xml file
                Iterator it = xmlControl.getObjects(Dataset.class, false).iterator();
                while(it.hasNext()) {
                    Dataset dataset = (Dataset) it.next();
                    if(dataset.getName().equals("magnetization")) {
                        magnetization = dataset.getYPoints();
                    }
                    if(dataset.getName().equals("energy")) {
                        energy = dataset.getYPoints();
                    }
                }
                numberOfPoints = magnetization.length;
                control.println("Reading: "+filename);
                control.println("Number of points = "+numberOfPoints);
            }
            calculate();
            plotFrame.repaint();
        }

        public void computeCorrelation(int tauMax) {
            plotFrame.clearData();
            double energyAccumulator = 0, magnetizationAccumulator = 0;
            double energySquaredAccumulator = 0, magnetizationSquaredAccumulator = 0;
            for(int t = 0; t < numberOfPoints; t++) {
                energyAccumulator += energy[t];
                magnetizationAccumulator += magnetization[t];
                energySquaredAccumulator += energy[t]*energy[t];
                magnetizationSquaredAccumulator += magnetization[t]*magnetization[t];
            }
            double averageEnergySquared = Math.pow(energyAccumulator/numberOfPoints, 2);
            double averageMagnetizationSquared = Math.pow(magnetizationAccumulator/numberOfPoints, 2);
            // normalization so that C(t = 0) = 1
            double normE = energySquaredAccumulator/numberOfPoints - averageEnergySquared;
            double normM = magnetizationSquaredAccumulator/numberOfPoints - averageMagnetizationSquared;
            for(int tau = 1; tau <= tauMax; tau++) {
                double c_MAccumulator = 0;
                double c_EAccumulator = 0;
                int counter = 0;
                for(int t = 0; t < numberOfPoints-tau; t++) { // average over time origins
                    c_MAccumulator += magnetization[t]*magnetization[t+tau];
                    c_EAccumulator += energy[t]*energy[t+tau];
                    counter++;
                }
                plotFrame.append(0, tau, ((c_MAccumulator/counter)-averageMagnetizationSquared)/normM);
                plotFrame.append(1, tau, ((c_EAccumulator/counter)-averageEnergySquared)/normE);
            }
            plotFrame.setVisible(true);
        }

        public void reset() {
            control.setValue("Maximum time interval, tau", 20);
            readXMLData();
        }

        public static void main(String[] args) {
            CalculationControl.createApp(new IsingAutoCorrelatorApp());
        }
    }

Problem 15.14. Correlation times

(a) As a check on IsingAutoCorrelatorApp, use the time series for E given in the text to do a hand calculation of C_E(t) in the way that it is computed in the computeCorrelation method.

(b) Use class IsingAutoCorrelatorApp to compute the equilibrium values of C_M(t) and C_E(t). Save the values of the magnetization and energy only after the system has reached equilibrium. Estimate the correlation times from the energy and the magnetization correlation functions for L = 8 and T = 3, T = 2.3, and T = 2. One way to determine τ is to fit C(t) to the exponential form C(t) ∼ e^{−t/τ}. Another way is to define the integrated correlation time as

$\tau = \sum_{t=1} C(t)$.   (15.25)

The sum is cut off at the first negative value of C(t). Are the negative values of C(t) physically meaningful? How does the behavior of C(t) change if you average your results over longer runs? How do your estimates for the correlation times compare with your estimates of the relaxation time found in Problem 15.12? Why would the term "decorrelation time" be more appropriate than "correlation time"? Are the correlation times τ_M and τ_E comparable?
(c) To simulate the relaxation to equilibrium as realistically as possible, we have randomly selected the spins to be flipped. However, if we are interested only in equilibrium properties, it might be possible to save computer time by selecting the spins sequentially. Determine whether the correlation time is greater, smaller, or approximately the same if the spins are chosen sequentially rather than randomly. If the correlation time is greater, does it still save CPU time to choose spins sequentially? Why is it not desirable to choose spins sequentially in the one-dimensional Ising model?

How can we quantify the accuracy of our measurements, for example, the accuracy of the estimated mean energy? As discussed in Chapter 11, the usual measure of the accuracy is the standard deviation of the mean. If we make n measurements of E, then the most probable error in ⟨E⟩ is given by

$\sigma_m = \frac{\sigma}{\sqrt{n}}$,   (15.26)

where the standard deviation σ is defined as

$\sigma^2 = \langle E^2\rangle - \langle E\rangle^2$.   (15.27)

The difficulty is that, in general, our measurements of the time series E_i are not independent but are correlated. Hence, σ_m as given by (15.26) is an underestimate of the actual error.

∗Problem 15.15. Estimate of errors

One way to determine whether the measurements are independent is to compute the correlation time. Another way is based on the idea that the magnitude of the error should not depend on how we group the data (see Section 11.4). For example, suppose that we group every two data points to form n/2 new data points $E_i^{(2)}$ given by $E_i^{(g=2)} = \frac{1}{2}[E_{2i-1} + E_{2i}]$. If we replace n by n/2 and E by E^{(2)} in (15.26) and (15.27), we would find the same value of σ_m as before, provided that the original E_i are independent. If the computed σ_m is not the same, we continue this averaging process until σ_m calculated from

$E_i^{(g)} = \frac{1}{2}\left[E_{2i-1}^{(g/2)} + E_{2i}^{(g/2)}\right] \qquad (g = 2, 4, 8, \ldots)$   (15.28)

is approximately the same as that calculated from E^{(g/2)}.

(a) Use this averaging method to estimate the errors in your measurements of ⟨E⟩ and ⟨M⟩. Choose L = 8, T = T_c = 2/ln(1 + √2) ≈ 2.269, and mcs ≥ 16384, and calculate averages after every Monte Carlo step per spin after the system has equilibrated. (The significance of T_c will be explored in Section 15.8.) A rough measure of the correlation time is the number of terms in the time series that need to be averaged for σ_m to be approximately unchanged. What is the qualitative dependence of the correlation time on T − T_c?

(b) Repeat for L = 16. Do you need more Monte Carlo steps than in part (a) to obtain statistically independent data? If so, why?

(c) The exact value of E/N for the Ising model on a square lattice with L = 16 and T = T_c = 2/ln(1 + √2) is E/N = −1.45306 (to five decimal places). This exact result allows us to determine the actual error in this case. Compute ⟨E⟩ by averaging E after each Monte Carlo step per spin for mcs ≥ 10⁶. Compare your actual error to the estimated error given by (15.26) and (15.27) and discuss their relative values.

15.8 The Ising Phase Transition

Now that we have tested our program for the two-dimensional Ising model, we explore some of its properties.

Problem 15.16. Qualitative behavior of the two-dimensional Ising model

(a) Use class Ising and your version of IsingApp to compute the mean magnetization, the mean energy, the heat capacity, and the susceptibility.
Because we will consider the Ising model for different values of L, it will be convenient to convert these quantities to intensive quantities such as the mean energy per spin, the specific heat (per spin), and the susceptibility per spin. For simplicity, we will use the same notation for both the extensive and the corresponding intensive quantities. Choose L = 4 and consider T in the range 1.5 ≤ T ≤ 3.5 in steps of ∆T = 0.2. Choose the initial condition at T = 3.5 such that the orientation of the spins is chosen at random. Because all the spins might overturn and the magnetization would change sign during the course of your observation, estimate the mean value of |M| in addition to that of M. The susceptibility should be calculated as χ = 1 kT [ M2 − |M| 2 ]. (15.29) CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 611 0.0 1.0 2.0 3.0 4.0 1.5 2.0 2.5 3.0 3.5 T CV Figure 15.3: The temperature dependence of the specific heat C (per spin) of the Ising model on a square lattice with periodic boundary conditions for L = 8 and L = 16. One thousand Monte Carlo steps per spin were used for each value of the temperature. The continuous line represents the temperature dependence of C in the limit of an infinite lattice. (Note that C is infinite at T = Tc for an infinite lattice.) Use at least 1000 Monte Carlo steps per spin and estimate the number of equilibrium configurations needed to obtain M and E to 5% accuracy. Plot E , m, |m|, C, and χ as a function of T and describe their qualitative behavior. Do you see any evidence of a phase transition? (b) Repeat the calculations of part (a) for L = 8 and L = 16. Plot E , m, |m|, C, and χ as a function of T and describe their qualitative behavior. Is the evidence of a phase transition more obvious? (c) The correlation length ξ can be obtained from the r-dependence of the spin correlation function c(r). The latter is defined as c(r) = sisj − m2 (15.30) where r is the distance between sites i and j. The system is translationally invariant so we write si = sj = m. The average is over all sites for a given configuration and over many configurations. Because the spins are not correlated for large r, c(r) → 0 in this limit. Assume that c(r) ∼ e−r/ξ for r sufficiently large and estimate ξ as a function of T . How does your estimate of ξ compare with the size of the domains of spins with the same orientation? Our studies of phase transitions are limited by the relatively small system sizes we can simulate. Nevertheless, we observed in Problem 15.16 that even systems as small as L = 4 exhibit behavior that is reminiscent of a phase transition. In Figure 15.3 we show our Monte Carlo data for the T -dependence of the specific heat of the two-dimensional Ising model for CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 612 M TTc 1 0 Figure 15.4: The temperature dependence of m(T ), the mean magnetization per spin, for the Ising model in two dimensions in the thermodynamic limit. L = 8 and L = 16. We see that C exhibits a broad maximum which becomes sharper for larger L. Does your data for C exhibit similar behavior? We next summarize some of the qualitative properties of ferromagnetic systems in zero magnetic field in the thermodynamic limit (N → ∞). At T = 0, the spins are perfectly aligned in either direction; that is, the mean magnetization per spin m(T ) = M(T ) /N is given by m(T = 0) = ±1. As T is increased, the magnitude of m(T ) decreases continuously until T = Tc at which m(T ) vanishes (see Figure 15.4). 
Because m(T) vanishes continuously rather than abruptly, the transition is termed continuous rather than discontinuous. (The term first order describes a discontinuous transition.) How can we characterize a continuous magnetic phase transition? Because a nonzero m implies that a net number of spins are spontaneously aligned, we designate m as the order parameter of the system. Near T_c, we can characterize the behavior of many physical quantities by power law behavior, just as we characterized the percolation threshold (see Table 12.1). For example, we can write m near T_c as

$m(T) \sim (T_c - T)^{\beta}$,   (15.31)

where β is a critical exponent (not to be confused with the inverse temperature). Various thermodynamic derivatives such as the susceptibility and specific heat diverge at T_c and are characterized by critical exponents. We write

$\chi \sim |T - T_c|^{-\gamma}$   (15.32)

and

$C \sim |T - T_c|^{-\alpha}$,   (15.33)

where we have introduced the critical exponents γ and α. We have assumed that χ and C are characterized by the same critical exponents above and below T_c.

Another measure of the magnetic fluctuations is the linear dimension ξ(T) of a typical magnetic domain. We expect the correlation length ξ(T) to be of the order of a lattice spacing for T ≫ T_c. Because the alignment of the spins becomes more correlated as T approaches T_c from above, ξ(T) increases as T approaches T_c. We can characterize the divergent behavior of ξ(T) near T_c by the critical exponent ν:

$\xi(T) \sim |T - T_c|^{-\nu}$.   (15.34)
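A minimal sketch of how the spin correlation function c(r) of (15.30), and hence the correlation length ξ of (15.34), might be measured along one lattice direction is given below. The method name and the use of a plain int array are our own choices, and the average over many equilibrium configurations is left to the caller.

    // Sketch of c(r) along the x direction for a single L x L configuration.
    // Returns c[r] for 1 <= r < L/2; periodic boundary conditions are assumed.
    public static double[] correlationFunction(int[][] spin) {
        int L = spin.length;
        double m = 0;
        for(int i = 0; i < L; i++) {
            for(int j = 0; j < L; j++) {
                m += spin[i][j];
            }
        }
        m /= (L*L); // magnetization per spin of this configuration
        double[] c = new double[L/2];
        for(int r = 1; r < L/2; r++) {
            double sum = 0;
            for(int i = 0; i < L; i++) {
                for(int j = 0; j < L; j++) {
                    sum += spin[i][j]*spin[(i+r)%L][j]; // pairs separated by r in x
                }
            }
            c[r] = sum/(L*L) - m*m; // relation (15.30)
        }
        return c;
    }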
Compute χ using (15.21) with |M| instead of M . Use as many Monte Carlo steps per spin as possible. Plot the logarithm of |m| and χ versus L and use the scaling relations (15.37)–(15.39) to determine the critical exponents β and γ. Use the exact result ν = 1. Do your log-log plots of |m| and χ yield reasonably straight lines? Compare your estimates for β and γ with the exact values given in Table 12.1. (c) Make a log-log plot of C versus L. If your data for C is sufficiently accurate, you will find that the log-log plot of C versus L is not a straight line but shows curvature. The reason is that the exponent α in (15.33) equals zero for the two-dimensional Ising model, and hence (15.38) needs to be interpreted as C ∼ C0 lnL. (15.40) Is your data for C consistent with (15.40)? The constant C0 in (15.40) is approximately 0.4995. CHAPTER 15. MONTE CARLO SIMULATIONS OF THERMAL SYSTEMS 614 So far we have performed our Ising model simulations on a square lattice. How do the critical temperature and the critical exponents depend on the symmetry and the dimension of the lattice? Based on your experience with the percolation transition in Chapter 12, you probably know the answer. Problem 15.18. The effects of symmetry and dimension on the critical properties (a) The simulation of the Ising model on the triangular lattice is relevant to the understanding of the experimentally observed phases of materials that can be absorbed on the surface of graphite. The nature of the triangular lattice is discussed in Chapter 8 (see Figure 8.5). The main difference between the triangular lattice and the square lattice is the number of nearest neighbors. Make the necessary modifications in your program’ for example, determine the energy changes due to a flip of a single spin and the corresponding values of the transition probabilities. Compute C and χ for different values of T in the interval [2,5]. Assume that ν = 1 and use finite-size scaling to estimate Tc in the limit of an infinite triangular lattice. Compare your estimate of Tc to the known value kTc/J = 3.641 (to three decimal places). (b) No exact analytic results are available for the Ising model in three dimensions. (It has been shown by Istrail that this model cannot be solved analytically.) Write a Monte Carlo program to simulate the Ising model on the simple cubic lattice. Compute C and χ for T in the range 3.2 ≤ T ≤ 5 in steps of 0.2 for different values of L. Estimate Tc(L) from the maximum of C and χ. How do these estimates of Tc(L) compare? Use the values of Tc(L) that exhibit a stronger L-dependence and plot Tc(L) versus L−1/ν for different values of ν in the range 0.5 to 1 (see [15.35)]. Show that the extrapolated value of Tc(L = ∞) does not depend sensitively on the value of ν. Compare your estimate for Tc(L = ∞) to the known value kTc/J = 4.5108 (to four decimal places). (c) Compute |m|, C, and χ at T = Tc ≈ 4.5108 for different values of L on the simple cubic lattice. Do a finite-size scaling analysis to estimate β/ν, α/ν, and γ/ν. The best known values of the critical exponents for the three-dimensional Ising model are given in Table 12.1. For comparison, published Monte Carlo results in 1976 for the finite-size behavior of the Ising model on the simple cubic Ising lattice are for L = 6 to L = 20; 2000–5000 Monte Carlo steps per spin were used for calculating the averages after equilibrium had been reached. Can you obtain more accurate results? Problem 15.19. Critical slowing down (a) Consider the Ising model on a square lattice with L = 16. 
Compute the autocorrelation functions C_M(t) and C_E(t) and determine the correlation times τ_M and τ_E for T = 2.5, 2.4, and 2.3. Determine the correlation times as discussed in Problem 15.14(b). How do these correlation times compare with one another? Show that τ increases as the critical temperature is approached, an effect known as critical slowing down.

(b) We can characterize critical slowing down by the dynamical critical exponent z defined by

$\tau \sim \xi^z$.   (15.41)

On a finite lattice we have τ ∼ L^z at T = T_c. Compute τ for different values of L at T = T_c and make a very rough estimate of z. (The value of z for the two-dimensional Ising model with spin flip dynamics is ≈ 2.167.)

The values of τ and z found in Problem 15.19 depend on our choice of dynamics (algorithm). The reason for the large value of z is the existence of large domains of parallel spins near the critical point. It is difficult for the Metropolis algorithm to decorrelate a domain because it has to do so one spin at a time. What is the probability of flipping a single spin in the middle of a domain at T = T_c? Which spins in a domain are more likely to flip? What is the dominant mechanism for decorrelating a domain of spins? In one dimension, Cordery et al. showed how z can be calculated exactly by considering the motion of a domain wall as a random walk. Although we have generated a trial change by flipping a single spin, it is possible that other types of trial changes would be more efficient. A problem of much current interest is the development of more efficient algorithms near phase transitions (see Project 15.32).

15.9 Other Applications of the Ising Model

Because the applications of the Ising model range from flocking birds to beating hearts, we can mention only a few of the applications here. In the following, we briefly describe applications of the Ising model to first-order phase transitions, lattice gases, antiferromagnetism, and the order-disorder transition in binary alloys.

So far we have discussed the continuous phase transition in the Ising model and have found that the energy and magnetization vary continuously with the temperature, and thermodynamic derivatives such as the specific heat and the susceptibility diverge near T_c (in the limit of an infinite lattice). In Problem 15.20 we discuss a simple example of a first-order phase transition. Such transitions are accompanied by discontinuous (finite) changes in thermodynamic quantities such as the energy and the magnetization.

Problem 15.20. The Ising model in an external magnetic field

(a) Modify your two-dimensional Ising program so that the energy of interaction with an external magnetic field B is included. It is convenient to measure B in terms of the dimensionless ratio h = βB. (Remember that B has already absorbed a factor of µ.) Compute m, the mean magnetization per spin, as a function of h for T < T_c. Consider a square lattice with L = 32 and equilibrate the system at T = 1.8 and h = 0. Adopt the following procedure to obtain m(h):

(i) Use an equilibrium configuration at h = 0 as the initial configuration for h₁ = Δh = 0.2.
(ii) Run the system for 100 Monte Carlo steps per spin before computing averages.
(iii) Average m over 100 Monte Carlo steps per spin.
(iv) Use the last configuration for h_n as the initial configuration for h_{n+1} = h_n + Δh.
(v) Repeat steps (ii)–(iv) until m ≈ 0.95.

Plot m versus h. Do the measured values of m correspond to equilibrium averages?
(b) Start from the last configuration in part (a) and decrease h by Δh = −0.2 in the same way as in part (a) until h passes through zero and m ≈ −0.95. Extend your plot of m versus h to include negative h values. Does m remain positive for small negative h? Do the measured values of m for negative h correspond to equilibrium averages? Draw the spin configurations for several values of h. Do you see evidence of domains?

(c) Now increase h by Δh = 0.2 until the m versus h curve forms an approximately closed loop. What is the value of m at h = 0? This value of m is the spontaneous magnetization.

(d) A first-order phase transition is characterized by a discontinuity (for an infinite lattice) in the order parameter. In the present case the transition is characterized by the behavior of m as a function of h. What is your measured value of m for h = 0.2? If m(h) is double valued, which value of m corresponds to the equilibrium state, an absolute minimum of the free energy? Which value of m corresponds to a metastable state, a relative minimum of the free energy? What are the equilibrium and metastable values of m for h = −0.2? First-order transitions exhibit hysteresis, and the properties of the system depend on its history, for example, on whether h is increasing or decreasing. Because of the long lifetime of metastable states near a first-order phase transition, a system can mistakenly be interpreted as being in the state of minimum free energy. We also know that near a continuous phase transition the relaxation to equilibrium becomes very long (see Problem 15.19), and hence a system with a continuous phase transition can also behave as if it were in a metastable state. For these reasons it is difficult to distinguish the nature of a phase transition using computer simulations. This problem is discussed further in Section 15.11.

(e) Repeat the above simulations for T = 3, a temperature above T_c. Why do your results differ from the simulations in parts (a)–(c) done for T < T_c?

The Ising model also describes systems that might appear to have little in common with ferromagnetism. For example, we can interpret the Ising model as a lattice gas, where a down spin represents a lattice site occupied by a molecule and an up spin represents an empty site. Each lattice site can be occupied by at most one molecule, and the molecules interact with their nearest neighbors. The lattice gas is a crude model of the behavior of a real gas of molecules and is a simple model of the liquid-gas transition and the critical point. What properties does the lattice gas have in common with a real gas? What properties of real gases does the lattice gas neglect?

If we wish to simulate a lattice gas, we have to decide whether to do the simulation at fixed density or at fixed chemical potential µ and a variable number of particles. The implementation of the latter is straightforward because the grand canonical ensemble for a lattice gas is equivalent to the canonical ensemble for Ising spins in an external magnetic field; that is, the effect of the magnetic field is to fix the mean number of up spins. Hence, we can simulate a lattice gas in the grand canonical ensemble by doing spin flip dynamics. (The volume of the lattice is an irrelevant parameter.)

Another application of a lattice gas model is to phase separation in a binary or A-B alloy. In this case spin up and spin down correspond to a site occupied by an A atom and a B atom, respectively.
As an example, the alloy β-brass has a low temperature ordered phase in which the two components (copper and zinc) have equal concentrations and form a cesium chloride structure. As the temperature is increased, some zinc atoms exchange positions with copper atoms, but the system is still ordered. However, above the critical temperature T_c = 742 K, the zinc and copper atoms become mixed and the system is disordered. This transition is an example of an order-disorder transition.

If we wish to approximate the actual dynamics of an alloy, then the number of A atoms and the number of B atoms is fixed, and we cannot use spin flip dynamics to simulate a binary alloy. A dynamics that does conserve the number of down and up spins is known as spin exchange or Kawasaki dynamics. In this dynamics a trial interchange of two nearest neighbor spins is made, and the change in energy ΔE is calculated. The criterion for the acceptance or rejection of the trial change is the same as before.

Problem 15.21. Simulation of a lattice gas

(a) Modify your Ising program so that spin exchange dynamics rather than spin flip dynamics is implemented. Determine the possible values of ΔE on the square lattice and the possible values of the transition probability, and change the way a trial change is made. If we are interested only in the mean value of quantities such as the total energy, we can reduce the computation time by not considering the interchange of parallel spins (which has no effect). For example, we can keep a list of bonds between occupied and empty sites and make trial moves by choosing bonds at random from this list. For small lattices such a list is unnecessary, and a trial move can be generated by simply choosing a spin and one of its nearest neighbors at random.

(b) Consider a square lattice with L = 32 and 512 sites initially occupied. (The number of occupied sites is a conserved variable and must be specified initially.) Determine the mean energy for T in the range 1 ≤ T ≤ 4. Plot the mean energy as a function of T. Does the energy appear to vary continuously?

(c) Repeat the calculations of part (b) with 612 sites initially occupied and plot the mean energy as a function of T. Does the energy vary continuously? Do you see any evidence of a first-order phase transition?

(d) Because down spins correspond to particles, we can compute their single particle diffusion coefficient. Use an array to record the position of each particle as a function of time. After equilibrium has been reached, compute ⟨R(t)²⟩, the mean square displacement of a particle. Is it necessary to "interchange" two like spins? If the particles undergo a random walk, the self-diffusion constant D is defined as

$D = \lim_{t \to \infty} \frac{1}{2dt} \langle R(t)^2 \rangle$.   (15.42)

Estimate D for different temperatures and numbers of occupied sites. Note that if a particle starts at x₀ and returns to x₀ by moving in one direction on the average using periodic boundary conditions, the net displacement in the x direction is L, not 0 (see Section 8.10 for a discussion of how to compute the diffusion constant for systems with periodic boundary conditions).

Although you are probably familiar with ferromagnetism, for example, a magnet on a refrigerator door, nature provides more examples of antiferromagnetism. In the language of the Ising model, antiferromagnetism means that the exchange parameter J is negative and nearest neighbor spins prefer to be aligned in opposite directions. As we will see in Problem 15.22, the properties of the antiferromagnetic Ising model on a square lattice are similar to those of the ferromagnetic Ising model. For example, the energy and specific heat of the ferromagnetic and antiferromagnetic Ising models are identical at all temperatures in zero magnetic field, and the system exhibits a phase transition at the Néel temperature T_N. On the other hand, the total magnetization and susceptibility do not exhibit critical behavior near T_N. Instead, we need to define two sublattices for the square lattice, corresponding to the red and black squares of a checkerboard, and introduce the staggered magnetization M_s, which is equal to the difference of the magnetization of the two sublattices. We will find in Problem 15.22 that the temperature dependence of M_s and the staggered susceptibility χ_s is identical to that of the analogous quantities in the ferromagnetic Ising model.

Figure 15.5: An example of frustration on a triangular lattice. The interaction is antiferromagnetic.

Problem 15.22. Antiferromagnetic Ising model

(a) Modify the Ising class to simulate the antiferromagnetic Ising model on the square lattice in zero magnetic field. Because J does not appear explicitly in class Ising, change the sign of the energy calculations in the appropriate places in the program. To compute the staggered magnetization on a square lattice, define one sublattice to be the sites (x, y) for which the product mod(x,2) × mod(y,2) = 1; the other sublattice corresponds to the remaining sites.

(b) Choose L = 32 and all spins up initially. What configuration of spins corresponds to the state of lowest energy? Compute the temperature dependence of the mean energy, the magnetization, the specific heat, and the susceptibility. Does the temperature dependence of any of these quantities show evidence of a phase transition?

(c) In part (b) you might have noticed that χ shows a cusp. Compute χ for different values of L at T = T_N ≈ 2.269. Do a finite-size scaling analysis and verify that χ does not diverge at T = T_N.

(d) Compute the temperature dependence of M_s and the staggered susceptibility χ_s defined as [see (15.21)]

$\chi_s = \frac{1}{kT}\left[\langle M_s^2\rangle - \langle M_s\rangle^2\right]$.   (15.43)

(Below T_c it is better to compute ⟨|M_s|⟩ instead of ⟨M_s⟩ for small lattices.) Verify that the temperature dependence of M_s for the antiferromagnetic Ising model is the same as the temperature dependence of M for the Ising ferromagnet. Could you have predicted this similarity without doing the simulation? Does χ_s show evidence of a phase transition?

(e) Consider the behavior of the antiferromagnetic Ising model on a triangular lattice. Choose L ≥ 32 and compute the same quantities as before. Do you see any evidence of a phase transition? Draw several configurations of the system at different temperatures. Do you see evidence of many small domains at low temperatures? Is there a unique ground state? If you cannot find a unique ground state, you share the same frustration as do the individual spins in the antiferromagnetic Ising model on the triangular lattice. We say that this model exhibits frustration because there is no spin configuration on the triangular lattice such that all spins are able to minimize their energy (see Figure 15.5).

The Ising model is one of many models of magnetism. The Heisenberg, Potts, and x-y models are other examples of models of magnetic materials. Monte Carlo simulations of these
Although you are probably familiar with ferromagnetism, for example, a magnet on a refrigerator door, nature provides more examples of antiferromagnetism. In the language of the Ising model, antiferromagnetism means that the exchange parameter J is negative and nearest neighbor spins prefer to be aligned in opposite directions. As we will see in Problem 15.22, the properties of the antiferromagnetic Ising model on a square lattice are similar to those of the ferromagnetic Ising model. For example, the energy and specific heat of the ferromagnetic and antiferromagnetic Ising models are identical at all temperatures in zero magnetic field, and the system exhibits a phase transition at the Néel temperature TN. On the other hand, the total magnetization and susceptibility do not exhibit critical behavior near TN. Instead, we need to define two sublattices for the square lattice corresponding to the red and black squares of a checkerboard and introduce the staggered magnetization Ms, which is equal to the difference of the magnetization of the two sublattices. We will find in Problem 15.22 that the temperature dependence of Ms and the staggered susceptibility χs are identical to the analogous quantities in the ferromagnetic Ising model.

Figure 15.5: An example of frustration on a triangular lattice. The interaction is antiferromagnetic.

Problem 15.22. Antiferromagnetic Ising model

(a) Modify the Ising class to simulate the antiferromagnetic Ising model on the square lattice in zero magnetic field. Because J does not appear explicitly in class Ising, change the sign of the energy calculations in the appropriate places in the program. To compute the staggered magnetization on a square lattice, define one sublattice to be the sites (x,y) for which the product mod(x,2) × mod(y,2) = 1; the other sublattice corresponds to the remaining sites.

(b) Choose L = 32 and all spins up initially. What configuration of spins corresponds to the state of lowest energy? Compute the temperature dependence of the mean energy, the magnetization, the specific heat, and the susceptibility. Does the temperature dependence of any of these quantities show evidence of a phase transition?

(c) In part (b) you might have noticed that χ shows a cusp. Compute χ for different values of L at T = TN ≈ 2.269. Do a finite-size scaling analysis and verify that χ does not diverge at T = TN.

(d) Compute the temperature dependence of Ms and the staggered susceptibility χs defined as [see (15.21)]

χs = (1/kT) [⟨Ms²⟩ − ⟨Ms⟩²]. (15.43)

(Below Tc it is better to compute ⟨|Ms|⟩ instead of ⟨Ms⟩ for small lattices.) Verify that the temperature dependence of Ms for the antiferromagnetic Ising model is the same as the temperature dependence of M for the Ising ferromagnet. Could you have predicted this similarity without doing the simulation? Does χs show evidence of a phase transition?

(e) Consider the behavior of the antiferromagnetic Ising model on a triangular lattice. Choose L ≥ 32 and compute the same quantities as before. Do you see any evidence of a phase transition? Draw several configurations of the system at different temperatures. Do you see evidence of many small domains at low temperatures? Is there a unique ground state? If you cannot find a unique ground state, you share the same frustration as do the individual spins in the antiferromagnetic Ising model on the triangular lattice. We say that this model exhibits frustration because there is no spin configuration on the triangular lattice such that all spins are able to minimize their energy (see Figure 15.5).

The Ising model is one of many models of magnetism. The Heisenberg, Potts, and x-y models are other examples of models of magnetic materials. Monte Carlo simulations of these
models and others have been important in the development of our understanding of phase transitions in both magnetic and nonmagnetic materials. Some of these models are discussed in Section 15.14.

15.10 Simulation of Classical Fluids

The existence of matter as a solid, liquid, and gas is well known (see Figure 15.6). Our goal in this section is to use Monte Carlo methods to gain additional insight into the qualitative differences between these phases.

Figure 15.6: A sketch of the phase diagram for a simple material. (The sketch shows pressure versus temperature, with the fusion, sublimation, and vapor pressure curves separating the solid, liquid, and gas phases, together with the triple point and the critical point.)

The Monte Carlo simulation of classical systems is simplified considerably by the fact that the velocity (momentum) variables are decoupled from the position variables. The total energy of the system can be written as E = K({vi}) + U({ri}), where the kinetic energy K is a function of only the particle velocities {vi}, and the potential energy U is a function of only the particle positions {ri}. This separation implies we need to sample only the positions of the molecules, that is, the “configurational” degrees of freedom. Because the velocity appears quadratically in the kinetic energy, it can be shown using classical statistical mechanics that the contribution of the velocity coordinates to the mean energy is kT/2 per degree of freedom. Is this simplification possible for quantum systems?

The physically relevant quantities of a fluid include its mean energy, specific heat, and equation of state. Another interesting quantity is the radial distribution function g(r) which we introduced in Chapter 8. We will find in Problems 15.23–15.25 that g(r) is a probe of the density fluctuations and, hence, a probe of the local order in the system. If only two-body forces are present, the mean potential energy per particle can be expressed as [see (8.16)]

U/N = (ρ/2) ∫ g(r) u(r) dr, (15.44)

and the (virial) equation of state can be written as [see (8.17)]

βP/ρ = 1 − (βρ/2d) ∫ g(r) r [du(r)/dr] dr. (15.45)

Hard core interactions. To separate the effects of the short range repulsive interaction from the longer range attractive interaction, we first investigate a model of hard disks with the interparticle interaction

u(r) = +∞ (r < σ); u(r) = 0 (r ≥ σ). (15.46)

Such an interaction has been extensively studied in one dimension (hard rods), two dimensions (hard disks), and three dimensions (hard spheres). Hard sphere systems were the first systems studied by Metropolis and coworkers.

Because there is no attractive interaction present in (15.46), there is no transition from a gas to a liquid. Is there a phase transition between a fluid phase at low densities and a solid at high densities? Can a solid form in the absence of an attractive interaction?

What are the physically relevant quantities for a system with a hard core interaction? The mean potential energy is of no interest because the potential energy is always zero. The major quantity of interest is g(r), which yields information about the correlations of the particles and the equation of state. If the interaction is given by (15.46), it can be shown that (15.45) reduces to

βP/ρ = 1 + (2π/3) ρσ³ g(σ), (d = 3) (15.47a)
βP/ρ = 1 + (π/2) ρσ² g(σ), (d = 2) (15.47b)
βP/ρ = 1 + ρσ g(σ). (d = 1) (15.47c)

We will calculate g(r) for different values of r and then extrapolate our results to r = σ (see Problem 15.23b).
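One way to carry out the extrapolation to contact, suggested in Problem 15.23(b), is to fit a parabola through the three measured values of g closest to contact and evaluate it there. The following minimal sketch uses the Lagrange form of the interpolating polynomial; the array names are assumptions about how the data are stored, not code from the text.

double extrapolateToContact(double[] x, double[] g, double xc) {
  // parabola through (x[0],g[0]), (x[1],g[1]), (x[2],g[2]), evaluated at xc
  double l0 = (xc - x[1])*(xc - x[2])/((x[0] - x[1])*(x[0] - x[2]));
  double l1 = (xc - x[0])*(xc - x[2])/((x[1] - x[0])*(x[1] - x[2]));
  double l2 = (xc - x[0])*(xc - x[1])/((x[2] - x[0])*(x[2] - x[1]));
  return l0*g[0] + l1*g[1] + l2*g[2];
}

For hard disks the contact value g(σ) obtained in this way can be substituted directly into (15.47b) to estimate βP/ρ.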
Because the application of molecular dynamics and Monte Carlo methods to hard disks is similar, we discuss the latter method only briefly and do not include a program. The idea is to choose a disk at random and move it to a trial position as implemented in the following:

int i = (int) (N*Math.random());              // choose a particle at random
// xtrial and ytrial are assumed to have been set to the current position of particle i
xtrial += (2.0*Math.random() - 1.0)*delta;    // delta is the maximum displacement
ytrial += (2.0*Math.random() - 1.0)*delta;

If the new position overlaps another disk, the move is rejected and the old configuration is retained; otherwise, the move is accepted. A reasonable, although not necessarily optimum, choice is a maximum displacement δ such that approximately 20% of the trial moves are accepted.

The major difficulty in implementing this algorithm is determining the overlap of two particles. If the number of particles is not too large, it is sufficient to compute the distances between the trial particle and all the other particles, instead of just considering the smaller number of particles that are in the immediate vicinity of the trial particle. For larger systems this procedure is too time consuming, and it is better to divide the system into cells and to compute only the distances between the trial particle and particles in the same and neighboring cells.

The choice of initial positions for the disks is more complicated than it might first appear. One strategy is to place each successive disk at random in the box. If a disk overlaps one that is already present, generate another pair of random numbers and attempt to place the disk again. If the desired density is low, an acceptable initial configuration can be computed fairly quickly in this way, but if the desired density is high, the probability of adding a disk will be very small (see Problem 15.24a). To reach higher densities, we might imagine beginning with the desired number of particles in a low density configuration and moving the boundaries of the central cell inward until a boundary just touches one of the disks. Then the disks are moved a number of Monte Carlo steps and the boundaries are moved inward again. This procedure also becomes more difficult as the density increases. The most efficient procedure is to start the disks on a lattice at the highest density of interest such that no overlap of disks occurs.

We first consider a one-dimensional system of hard rods for which the equation of state and g(r) can be calculated exactly. The equation of state is given by

P/NkT = 1/(L − Nσ). (15.48)

Because hard rods cannot pass through one another, the excluded volume is Nσ and the available volume is L − Nσ. Note that the form of (15.48) is the same as the van der Waals equation of state (cf. Reif) with the contribution from the attractive part of the interaction equal to zero.

Problem 15.23. Monte Carlo simulation of hard rods

(a) Write a program to do a Monte Carlo simulation of a system of hard rods. Adopt periodic boundary conditions and refer to class HardDisks in Chapter 8 for the structure of the program. The major difference is the nature of the trial moves. Measure all lengths in terms of the hard rod diameter σ. Choose L = 36 and N = 30. How does the number density ρ = N/L compare to the maximum possible density? Choose the initial positions to be on a one-dimensional grid and let the maximum displacement be δ = 0.1.
Approximately how many Monte Carlo steps per particle are necessary to reach equilibrium? What is the equilibrium acceptance probability? Compute the pair correlation function g(x).

(b) Compute g(x) as a function of the distance x for x ≤ L/2. Why does g(x) = 0 for x < 1? What is the physical interpretation of the peaks in g(x)? Because the mean pressure can be determined from g(x) at x = 1+ [see (15.47)], determine g(x) at contact. An easy way to extrapolate your results for g(x) to x = 1 is to fit the three values of g(x) closest to x = 1 to a parabola. Use your result for g(x = 1+) to determine the mean pressure.

(c) Compute g(x) at several lower densities by using an equilibrium configuration from a previous run and increasing L. How do the size and the location of the peaks in g(x) change?

Problem 15.24. Monte Carlo simulation of hard disks

(a) The maximum packing density can be found by placing the disks on a triangular lattice with the nearest neighbor distance equal to the disk diameter σ. What is the maximum packing density of hard disks; that is, how many disks can be packed together in a cell of area A?

(b) Write a simple program that adds disks at random into a rectangular box of area A = Lx × Ly with the constraint that no two disks overlap. If a disk overlaps a disk already present, generate another pair of random numbers and try to place the disk again. If the density is low, the probability of adding a disk is high, but if the desired density is high, most of the disks will be rejected. For simplicity, do not worry about periodic boundary conditions and accept a disk if its center lies within the box. Choose Lx = 6 and Ly = √3 Lx/2 and determine the maximum density ρ = N/A that you can attain in a reasonable amount of CPU time. How does this density compare to the maximum packing density? What is the qualitative nature of the density dependence of the acceptance probability?

(c) Modify your Monte Carlo program for hard rods to simulate a system of hard disks. Begin at a density ρ slightly lower than the maximum packing density ρ0. Choose N = 64 with Lx = 8.81 and Ly = √3 Lx/2. Compare the density ρ = N/(LxLy) to the maximum packing density. Choose the initial positions of the particles to be on a triangular lattice. A reasonable first choice for the maximum displacement δ is δ = 0.1. Compute g(r) for ρ/ρ0 = 0.95, 0.92, 0.88, 0.85, 0.80, 0.70, 0.60, and 0.30. Keep the ratio Lx/Ly fixed and save a configuration from the previous run to be the initial configuration of the new run at lower ρ. (See page 266 for how to save and read configurations.) Allow at least 400 Monte Carlo steps per particle for the system to equilibrate and average g(r) for mcs ≥ 400.

(d) What is the qualitative behavior of g(r) at high and low densities? For example, describe the number and height of the peaks of g(r). If the system is crystalline, then g(r) is not spherically symmetric. What would you compute in this case?

(e) Use your results for g(r = 1+) to compute the mean pressure P as a function of ρ [see (15.47b)]. Plot the ratio PV/NkT as a function of ρ, where the volume V is the area of the system. How does the temperature T enter into the Monte Carlo simulation? Is the ratio PV/NkT an increasing or decreasing function of ρ? At low densities we might expect the system to act like an ideal gas with the volume replaced by (V − Nσ). Compare your low density results with this prediction.
(f) Take snapshots of the disks at intervals of ten to twenty Monte Carlo steps per particle. Do you see any evidence of the solid becoming a fluid at lower densities?

(g) Compute an effective diffusion coefficient D by determining the mean square displacement ⟨R²(t)⟩ of the particles after equilibrium is reached. Use the relation (15.42) and identify the time t with the number of Monte Carlo steps per particle. Estimate D for the densities considered in part (b) and plot the product ρD as a function of ρ. What is the dependence of D on ρ for a dilute gas? Can you identify a range of ρ where D drops abruptly? Do you observe any evidence of a phase transition?

(h) The magnitude of the maximum displacement parameter δ is arbitrary. If the density is high and δ is large, then a high proportion of the trial moves will be rejected. On the other hand, if δ is small, the acceptance probability will be close to unity, but the successive configurations will be strongly correlated. Hence, if δ is too large or too small, the simulation will be inefficient. One way to choose δ is to find the value of δ that maximizes the mean square displacement over a fixed time interval. The idea is that the mean square displacement is a measure of the exploration of phase space. Fix the density and determine the value of δ that maximizes ⟨R²(t)⟩. What is the corresponding acceptance probability?

Continuous potentials. Our simulations of hard disks suggest that there is a phase transition from a fluid at low densities to a solid at higher densities. This conclusion is consistent with molecular dynamics and Monte Carlo studies of larger systems. Although the existence of a fluid-solid transition for hard sphere and hard disk systems is now well accepted, the relatively small numbers of particles used in any simulation should remind us that results of this type cannot be taken as evidence independently of any theoretical justification.

The existence of a fluid-solid transition for hard spheres implies that the transition is determined by the repulsive part of the potential. We now consider a system with both a repulsive and an attractive contribution. Our primary goal will be to determine the influence of the attractive part of the potential on the structure of a liquid. We adopt as our model interaction the Lennard–Jones potential:

u(r) = 4ε[(σ/r)¹² − (σ/r)⁶]. (15.49)

The nature of the Lennard–Jones potential and the appropriate choice of units for simulations was discussed in Chapter 8 (see Table 8.1). We consider in Problem 15.25 the application of the Metropolis algorithm to a system of N particles in a cell of fixed volume V (area) interacting via the Lennard–Jones potential. Because the simulation is at fixed T, V, and N, the simulation samples configurations of the system according to the Boltzmann distribution (15.4).

Problem 15.25. Monte Carlo simulation of a Lennard–Jones system

(a) The properties of a two-dimensional Lennard–Jones system have been studied by many workers under a variety of conditions. Write a program to compute the total energy of a system of N particles on a triangular lattice of area Lx × Ly with periodic boundary conditions. Choose N = 64, Lx = 9.2, and Ly = √3 Lx/2. Why does this energy correspond to the energy at temperature T = 0? Does the energy per particle change if you consider bigger systems at the same density?
(b) Write a program to compute the mean energy, pressure, and the radial distribution function using the Metropolis algorithm. One way of computing the change in the potential energy of the system due to a trial move of one of the particles is to use an array pe for the potential energy of interaction of each particle. For simplicity, compute the potential energy of particle i by considering its interaction with the other N − 1 particles. The total potential energy of the system is the sum of the array elements pe(i) over all N particles divided by two to account for double counting. For simplicity, accumulate data after each Monte Carlo step per particle.

(c) Choose the same values of N, Lx, and Ly as in part (a), but give each particle an initial random displacement from its triangular lattice site of magnitude 0.2. Do the Monte Carlo simulation at a very low temperature such as T = 0.1. Choose the maximum trial displacement δ = 0.15 and consider mcs ≥ 400. Does the system retain its symmetry? Does the value of δ affect your results?

(d) Use the same initial conditions as in part (a), but take T = 0.5. Choose δ = 0.15 and run for a number of Monte Carlo steps per particle that is sufficient to yield a reasonable result for the mean energy. Do a similar simulation at T = 1 and T = 2. What is the best choice of the initial configuration in each case? The harmonic theory of solids predicts that the total energy of a system is due to a T = 0 contribution plus a term due to the harmonic oscillation of the atoms. The contribution of the latter part should be proportional to the temperature. Compare your results for E(T) − E(0) with this prediction. Use the values of σ and ε given in Table 8.1 to determine the temperature and energy in SI units for your simulations of solid argon.

(e) Decrease the density by multiplying Lx, Ly, and all the particle coordinates by 1.07. What is the new value of ρ? Estimate the number of Monte Carlo steps per particle needed to compute E and P at T = 0.5 to approximately 10% accuracy. Is the total energy positive or negative? How do E and P compare to their ideal gas values? Follow the method discussed in Problem 15.24 and compute an effective diffusion constant. Is the system a liquid or a solid? Plot g(r) versus r and compare g(r) to your results for hard disks at the same density. What is the qualitative behavior of g(r)? What is the interpretation of the peaks in g(r) in terms of the structure of the liquid? If time permits, consider a larger system at the same density and temperature and compute g(r) for larger r.

(f) Consider the same density as in part (e) at T = 0.6 and T = 1. Look at some typical configurations of the particles. Use your results for E(T), P(T), g(r), and the other data you have collected and discuss whether the system is a gas, liquid, or solid at these temperatures. What criteria can you use to distinguish a gas from a liquid? If time permits, repeat these calculations for ρ = 0.7.

(g) Compute E, P, and g(r) for N = 64, Lx = Ly = 20, and T = 3. These conditions correspond to a dilute gas. How do your results for P compare with the ideal gas equation of state? How does g(r) compare with the results you obtained for the liquid?

(h) The chemical potential can be measured using the Widom insertion method. From thermodynamics we know that

µ = (∂F/∂N)_{V,T} = −kT ln(Z_{N+1}/Z_N) (15.50)

in the limit N → ∞, where F is the Helmholtz free energy and ZN is the partition function for N particles.
The ratio Z_{N+1}/Z_N is the average of e^{−β∆E} over all possible states of the added particle with added energy ∆E. The idea is to compute the change in the energy ∆E that would occur if an imaginary particle were added to the N particle system at random. Average the value of e^{−β∆E} over many configurations generated by the Metropolis algorithm. The chemical potential is then given by

µ = −kT ln ⟨e^{−β∆E}⟩. (15.51)

Note that in the Widom insertion method, no particle is actually added to the system during the simulation. The chemical potential computed in (15.51) is the excess chemical potential and does not include the part of the chemical potential due to the momentum degrees of freedom, which is equal to the chemical potential of an ideal gas. Compute the chemical potential of a dense gas, liquid, and solid. In what sense is the chemical potential a measure of how easy it is to add a particle to the system?

15.11 Optimized Monte Carlo Data Analysis

As we have seen, the important physics near a phase transition occurs on long length scales. For this reason, we might expect that simulations, which for practical reasons are restricted to relatively small systems, might not be useful near a phase transition. Nevertheless, we have found that methods such as finite-size scaling can yield information about how systems behave in the thermodynamic limit. We now explore some additional Monte Carlo techniques that are useful near a phase transition.

The Metropolis algorithm yields mean values of various thermodynamic quantities, for example, the energy at particular values of the temperature T. Near a phase transition many thermodynamic quantities change rapidly, and we need to determine these quantities at many closely spaced values of T. If we were to use standard Monte Carlo methods, we would have to do many simulations to cover the desired range of values of T. To overcome this problem, we introduce the use of histograms which allow us to extract more information from a single Monte Carlo simulation. The idea is to use our knowledge of the equilibrium probability distribution at one value of T (and other external parameters) to estimate the desired thermodynamic averages at neighboring values of the external parameters.

The first step of the single histogram method is to simulate the system at an inverse temperature β0 which is near the values of β of interest and measure the energy of the system after every Monte Carlo step per spin (or other fixed interval). The measured probability that the system has energy E can be expressed as

P(E, β0) = H0(E) / Σ_E H0(E). (15.52)

The histogram H0(E) is the number of configurations with energy E, and the denominator is the total number of measurements of E. Because the probability of a given configuration is given by the Boltzmann distribution, we have

P(E, β) = g(E) e^{−βE} / Σ_E g(E) e^{−βE}, (15.53)

where g(E) is the number of microstates with energy E. (The density of states g(E) should not be confused with the radial distribution function g(r). If the energy is a continuous function, g(E) becomes the number of states per unit energy interval. However, g(E) is usually referred to as the density of states regardless of whether E is a continuous or discrete variable.) If we compare (15.52) and (15.53) and note that g(E) is independent of T, we can write

g(E) = a0 H0(E) e^{β0 E}, (15.54)

where a0 is a proportionality constant that depends on β0.
If we eliminate g(E) from (15.53) by using (15.54), we obtain the desired relation

P(E, β) = H0(E) e^{−(β−β0)E} / Σ_E H0(E) e^{−(β−β0)E}. (15.55)

Note that we have expressed the probability at the inverse temperature β in terms of H0(E), the histogram at the inverse temperature β0. Because β is a continuous variable, we can estimate the β dependence of the mean value of any function A that depends on E, for example, the mean energy and the specific heat. We write the mean of A(E) as

⟨A(β)⟩ = Σ_E A(E) P(E, β). (15.56)

If the quantity A depends on another quantity M, for example, the magnetization, then we can generalize (15.56) to

⟨A(β)⟩ = Σ_{E,M} A(E,M) P(E,M,β) (15.57a)
       = Σ_{E,M} A(E,M) H0(E,M) e^{−(β−β0)E} / Σ_{E,M} H0(E,M) e^{−(β−β0)E}. (15.57b)

The histogram method is useful only when the configurations relevant to the range of temperatures of interest occur with reasonable probability during the simulation at the temperature T0. For example, if we simulate an Ising model at low temperatures at which only ordered configurations occur (most spins aligned in the same direction), we cannot use the histogram method to obtain meaningful thermodynamic averages at high temperatures for which most configurations are disordered.

Problem 15.26. Application of the histogram method

(a) Consider a 4 × 4 Ising lattice in zero magnetic field and use the Metropolis algorithm to compute the mean energy per spin, the mean magnetization per spin, the specific heat, and the susceptibility per spin for T = 1 to T = 3 in steps of ∆T = 0.05. Average over at least 5000 Monte Carlo steps per spin for each value of T after equilibrium has been reached.

(b) What are the minimum and maximum values of the total energy E that might be observed in a simulation of an Ising model on a 4 × 4 lattice? Use these values to set the size of the array needed to accumulate data for the histogram H(E). Accumulate data for H(E) at T = 2.27, a value of T close to Tc, for at least 5000 Monte Carlo steps per spin after equilibration. Compute the energy and specific heat using (15.56). Compare your computed results with the data obtained by simulating the system directly, that is, without using the histogram method, at the same temperatures. At what temperatures does the histogram method break down?

(c) What are the minimum and maximum values of the magnetization M that might be observed in a simulation of an Ising model on a 4 × 4 lattice? Use these values to set the size of the two-dimensional array needed to accumulate data for the histogram H(E,M). Accumulate data for H(E,M) at T = 2.27, a value of T close to Tc, for at least 5000 Monte Carlo steps per spin after equilibration. Compute the same thermodynamic quantities as in part (a) using (15.57b). Compare your computed results with the data obtained by simulating the system directly, that is, without using the histogram method, at the same temperatures. At what temperatures does the histogram method break down?

(d) Repeat part (c) for a simulation centered about T = 1.5 and T = 2.5.

(e) Repeat part (c) for an 8 × 8 and a 16 × 16 lattice at T = 2.27.

The histogram method can be used to do a more sophisticated finite-size scaling analysis to determine the nature of a transition. Suppose that we perform a Monte Carlo simulation and observe a peak in the specific heat as a function of the temperature. What can this observation tell us about a possible phase transition?
In general, we can conclude very little without doing a careful analysis of the behavior of the system at different sizes. For example, a discontinuity in the energy in an infinite system might be manifested in small systems by a broad peak in the specific heat. However, we have seen that the specific heat of a system with a continuous phase transition in the thermodynamic limit may manifest itself in the same way in a small system. Another difficulty is that the peak in the specific heat of a small system occurs at a temperature that differs from the transition temperature in the infinite system (see Project 15.37). Finally, there might be no transition at all, and the peak might simply represent a broad crossover from high to low temperature behavior (see Project 15.38).

We now discuss a method due to Lee and Kosterlitz that uses the histogram data to determine the nature of a phase transition (if it exists). To understand this method, we use the Helmholtz free energy F of a system. At low T, the low energy configurations dominate the contributions to the partition function Z, even though there are relatively few such configurations. At high T, the number of disordered configurations with high E is large, and hence high energy configurations dominate the contribution to Z. These considerations suggest that it is useful to define a restricted free energy F(E) that includes only the configurations at a particular energy E. We define

F(E) = −kT ln [g(E) e^{−βE}]. (15.58)

For systems with a first-order phase transition, a plot of F(E) versus E will show two local minima corresponding to configurations that are characteristic of the high and low temperature phases. At low T the minimum at the lower energy will be the absolute minimum, and at high T the higher energy minimum will be the absolute minimum of F. At the transition, the two minima will have the same value of F(E). For systems with no transition in the thermodynamic limit, there will be only one minimum for all T.

How will F(E) behave for the relatively small lattices that we can simulate? In systems with first-order transitions, the distinction between low and high temperature phases will become more pronounced as the system size is increased. If the transition is continuous, there are domains at all sizes, and we expect that the behavior of F(E) will not change significantly as the system size increases. If there is no transition, there might be a spurious double minimum for small systems, but this spurious behavior should disappear for larger systems.

Lee and Kosterlitz proposed the following method for categorizing phase transitions:

1. Do a simulation at a temperature close to the suspected transition temperature and compute H(E). Usually the temperature at which the peak in the specific heat occurs is chosen as the simulation temperature.

2. Use the histogram method to compute F(E) ∝ −ln H0(E) + (β − β0)E at neighboring values of T (see the sketch following this list). If there are two minima in F(E), vary β until the values of F(E) at the two minima are equal. This temperature is an estimate of the possible transition temperature Tc.

3. Measure the difference ∆F at Tc between F(E) at the minima and F(E) at the maximum between the two minima.

4. Repeat steps (1)–(3) for larger systems. If ∆F increases with size, the transition is first order. If ∆F remains the same, the transition is continuous. If ∆F decreases with size, there is no thermodynamic transition.
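A minimal sketch of step 2, computing F(E) up to an additive constant from the measured histogram, might look like the following. The arrays energies and H0 are assumptions about how the histogram data are stored, not code from the text.

double[] freeEnergy(double[] energies, double[] H0, double beta, double beta0) {
  double[] F = new double[energies.length];
  for (int i = 0; i < energies.length; i++) {
    if (H0[i] > 0) {
      F[i] = -Math.log(H0[i]) + (beta - beta0)*energies[i];  // F(E) up to a constant
    } else {
      F[i] = Double.POSITIVE_INFINITY;                       // energy never visited
    }
  }
  return F;
}

The additive constant is irrelevant here because only differences of F(E), such as ∆F in step 3, enter the analysis.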
The above procedure is applicable when the phase transition occurs by varying the temperature. Transitions can also occur by varying the pressure or the magnetic field. These field-driven transitions can be tested by a similar method. For example, consider the Ising model in a magnetic field at low temperatures below Tc. As we vary the magnetic field from positive to negative, there is a transition from a phase with magnetization M > 0 to a phase with M < 0. Is this transition first order or continuous? To answer this question, we can use the Lee–Kosterlitz method with a histogram H(E,M) generated at zero magnetic field and calculate F(M) instead of F(E). The quantity F(M) is proportional to −ln Σ_E H(E,M) e^{−(β−β0)E}. Because the states with positive and negative magnetization are equally likely to occur for zero magnetic field, we should see a double minimum structure for F(M) with equal minima. As we increase the size of the system, ∆F should increase for a first-order transition and remain the same for a continuous transition.

Problem 15.27. Characterization of a phase transition

(a) Use your modified version of class Ising from Problem 15.26 to determine H(E,M). Read the H(E,M) data from a file and compute and plot F(E) for the range of temperatures of interest. First generate data at T = 2.27 and use the Lee–Kosterlitz method to verify that the Ising model in two dimensions has a continuous phase transition in zero magnetic field. Consider lattices of sizes L = 4, 8, and 16.

(b) Do a Lee–Kosterlitz analysis of the Ising model at T = 2 and zero magnetic field by plotting F(M). Determine if the transition from M > 0 to M < 0 is first order or continuous. This transition is called field driven because the transition occurs if we change the magnetic field. Make sure your simulations sample configurations with both positive and negative magnetization by using small values of L such as L = 4, 6, and 8.

(c) Repeat part (b) at T = 2.5 and determine if there is a field-driven transition at T = 2.5.

∗Problem 15.28. The Potts model

In the q-state Potts model, the total energy or Hamiltonian of the lattice is given by

E = −J Σ_{i, j=nn(i)} δ_{si,sj}, (15.59)

where si at site i can have the values 1, 2, ..., q; the Kronecker delta function δ_{a,b} equals unity if a = b and is zero otherwise. As before, we will measure the temperature in energy units. Convince yourself that the q = 2 Potts model is equivalent to the Ising model (except for a trivial difference in the energy minimum). One of the many applications of the Potts model is to helium adsorbed on the surface of graphite. The graphite-helium interaction gives rise to preferred adsorption sites directly above the centers of the hexagons of the honeycomb graphite surface. As discussed by Plischke and Bergersen, the helium atoms can be described by a three-state Potts model.

(a) The transition in the Potts model is continuous for small q and first order for larger q. Write a Monte Carlo program to simulate the Potts model for a given value of q and store the histogram H(E). (A sketch of a trial move is given after this problem.) Test your program by comparing the output for q = 2 with your Ising model program.

(b) Use the Lee–Kosterlitz method to analyze the nature of the phase transition in the Potts model for q = 3, 4, 5, 6, and 10. First find the location of the specific heat maximum, and then collect data for H(E) at the specific heat maximum. Lattice sizes of order L ≥ 50 are required to obtain convincing results for some values of q.
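To make the trial moves in Problem 15.28(a) concrete, here is a minimal sketch of one Metropolis trial flip for the q-state Potts model of (15.59) on an L × L square lattice with periodic boundary conditions. The names s, L, q, J, and T are assumptions about the surrounding class, with the temperature measured in energy units as in the text.

public void trialFlip() {
  int x = (int) (L*Math.random());
  int y = (int) (L*Math.random());
  int trial = 1 + (int) (q*Math.random());     // trial value in 1..q
  if (trial == s[x][y]) return;
  int[] nnx = {(x+1)%L, (x-1+L)%L, x, x};      // the four nearest neighbors
  int[] nny = {y, y, (y+1)%L, (y-1+L)%L};
  int same = 0, sameTrial = 0;
  for (int k = 0; k < 4; k++) {
    if (s[nnx[k]][nny[k]] == s[x][y]) same++;       // matching neighbors before the flip
    if (s[nnx[k]][nny[k]] == trial) sameTrial++;    // matching neighbors after the flip
  }
  double dE = J*(same - sameTrial);            // change in the energy (15.59)
  if (dE <= 0 || Math.random() < Math.exp(-dE/T)) s[x][y] = trial;
}

For q = 2 this move reduces to the usual single spin flip dynamics of the Ising model, which provides the check suggested in part (a).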
Another way to determine the nature of a phase transition is to use the Binder cumulant method. The cumulant is defined by

UL ≡ 1 − ⟨E⁴⟩ / (3⟨E²⟩²). (15.60)

It can be shown that the minimum value of UL is

UL,min = 2/3 − (1/3) [(E+² − E−²)/(2E+E−)]² + O(L^{−d}), (15.61)

where E+ and E− are the energies of the two phases in a first-order transition. These results are derived by considering the distribution of energy values to be a sum of Gaussians about each phase at the transition, which become sharper and sharper as L → ∞. If UL,min = 2/3 in the infinite size limit, then the transition is continuous.

Problem 15.29. The Binder cumulant and the nature of the transition

(a) Suppose that the energy in a system is given by a Gaussian distribution with a zero mean. What is the corresponding value of UL?

(b) Consider the two-dimensional Ising model in the absence of a magnetic field and consider the cumulant

VL ≡ 1 − ⟨M⁴⟩ / (3⟨M²⟩²). (15.62)

Compute VL for a temperature much higher than Tc. What is the value of VL? What is the value of VL at T = 0?

(c) Compute VL for values of T in the range 2.20 ≤ T ≤ 2.35 for L = 10, 20, and 40. Plot VL as a function of T for these three values of L. Note that the three curves for VL cross at a value of T that is approximately Tc. What is the approximate value of VL at this crossing? Can you conclude that the transition is continuous?

(d) Repeat Problem 15.28 using the Binder cumulant method and determine the nature of the transition.

15.12 ∗Other Ensembles

So far, we have considered the microcanonical ensemble (fixed N, V, and E) and the canonical ensemble (fixed N, V, and T). Monte Carlo methods are very flexible and can be adapted to the calculation of averages in any ensemble. Two other ensembles of particular importance are the constant pressure and the grand canonical ensembles. The main difference in the Monte Carlo method is that there are additional moves corresponding to changing the volume or changing the number of particles. The constant pressure ensemble is particularly important for studying first-order phase transitions because the phase transition occurs at a fixed pressure, unlike a constant volume simulation where the system passes through a two-phase coexistence region before changing phase completely as the volume is changed.

In the NPT ensemble, the probability of a microstate is proportional to e^{−β(E+PV)}. For a classical system, the mean value of a physical quantity A that depends on the positions of the particles can be expressed as

⟨A⟩_NPT = [∫₀^∞ dV e^{−βPV} ∫ dr1 dr2 ··· drN A({r}) e^{−βU({r})}] / [∫₀^∞ dV e^{−βPV} ∫ dr1 dr2 ··· drN e^{−βU({r})}]. (15.63)

The potential energy U({r}) depends on the set of particle coordinates {r}. To simulate the NPT ensemble, we need to sample the coordinates r1, r2, ..., rN of the particles and the volume V of the system. For simplicity, we assume that the central cell is a square or a cube so that V = L^d. It is convenient to use the set of scaled coordinates {s}, where si is defined as

si = ri/L. (15.64)

If we substitute (15.64) into (15.63), we can write ⟨A⟩_NPT as

⟨A⟩_NPT = [∫₀^∞ dV e^{−βPV} V^N ∫ ds1 ds2 ··· dsN A({s}) e^{−βU({s})}] / [∫₀^∞ dV e^{−βPV} V^N ∫ ds1 ds2 ··· dsN e^{−βU({s})}], (15.65)

where the integral over {s} is over the unit square (cube). The factor of V^N arises from the change of variables r → s.
If we let V^N = e^{ln V^N} = e^{N ln V}, we see that the quantity that is analogous to the Boltzmann factor can be written as

e^{−W} = e^{−βPV − βU({s}) + N ln V}. (15.66)

Because the pressure is fixed, a trial configuration is generated from the current configuration by either randomly displacing a particle or making a random change in the volume, for example, V → V + δ(2r − 1), where r is a uniform random number in the unit interval and δ is the maximum change in volume. The trial configuration is accepted if the change ∆W ≤ 0 and with probability e^{−∆W} if ∆W > 0. It is not necessary or efficient to change the volume after every Monte Carlo step per particle.

In the grand canonical or µVT ensemble, the chemical potential µ is fixed and the number of particles fluctuates. The average of any function of the particle positions can be written (in three dimensions) as

⟨A⟩_µVT = [Σ_{N=0}^∞ (1/N!) λ^{−3N} e^{βµN} ∫ dr1 dr2 ··· drN A({r}) e^{−βU_N({r})}] / [Σ_{N=0}^∞ (1/N!) λ^{−3N} e^{βµN} ∫ dr1 dr2 ··· drN e^{−βU_N({r})}], (15.67)

where λ = (h²/2πmkT)^{1/2}. We have made the N-dependence of the potential energy U explicit. If we write 1/N! = e^{−ln N!} and λ^{−3N} = e^{−N ln λ³}, we can write the quantity that is analogous to the Boltzmann factor as

e^{−W} = e^{βµN − N ln λ³ − ln N! + N ln V − βU_N}. (15.68)

If we write the chemical potential as

µ = µ* + kT ln(λ³/V), (15.69)

then W can be expressed as

e^{−W} = e^{βµ*N − ln N! − βU_N}. (15.70)

There are two possible ways of obtaining a trial configuration. The first involves the displacement of a selected particle; such a move is accepted or rejected according to the usual criteria, that is, by the change in the potential energy U_N. In the second possible way, we choose with equal probability whether to attempt to add a particle at a randomly chosen position in the central cell or to remove a particle that is already present. In either case, the trial configuration is accepted if e^{−W} in (15.70) is increased. If e^{−W} is decreased, the change is accepted with a probability equal to

[1/(N + 1)] e^{β[µ* − (U_{N+1} − U_N)]} (insertion), (15.71a)

or

N e^{−β[µ* + (U_{N−1} − U_N)]} (removal). (15.71b)

In this approach µ* is an input parameter, and µ is not determined until the end of the calculation when ⟨N⟩_µVT is obtained.

As we have discussed, the probability that a system at a temperature T has energy E is given by [see (15.53)]

P(E, β) = g(E) e^{−βE} / Z. (15.72)

If the density of states g(E) were known, we could calculate the mean energy (and other thermodynamic quantities) at any temperature from the relation

⟨E⟩ = (1/Z) Σ_E E g(E) e^{−βE}. (15.73)

Hence, the density of states is a quantity of much interest.

Suppose that we were to try to compute g(E) by doing a random walk in energy space by flipping the spins at random and accepting all configurations that are obtained. Then the histogram of the energy, H(E), the number of visits to each possible energy E of the system, would converge to g(E) if the walk visited all possible configurations. In practice, it would be impossible to realize such a long random walk given the extremely large number of configurations. For example, the Ising model on a 10 × 10 square lattice has 2^100 ≈ 1.3 × 10^30 spin configurations. The main difficulty of doing a simple random walk to determine g(E) is that the walk would spend most of its time visiting the same energy values over and over again and would not reach the values of E that are less probable.
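Whatever method is used to estimate the density of states, once ln g(E) is available, averages such as (15.73) follow by direct summation. The following minimal sketch computes the mean energy; the arrays energies and lnG are assumptions about how the data are stored, and the shift by the largest exponent is a standard device (not from the text) to avoid numerical overflow.

double meanEnergy(double[] energies, double[] lnG, double beta) {
  double max = Double.NEGATIVE_INFINITY;
  for (int i = 0; i < lnG.length; i++) {
    max = Math.max(max, lnG[i] - beta*energies[i]);        // largest exponent
  }
  double z = 0, esum = 0;
  for (int i = 0; i < lnG.length; i++) {
    double w = Math.exp(lnG[i] - beta*energies[i] - max);  // relative Boltzmann weight
    z += w;
    esum += energies[i]*w;
  }
  return esum/z;                                           // <E> at inverse temperature beta
}

The same loop with an extra factor of the energy gives ⟨E²⟩ and hence the specific heat, which is what Problem 15.30(c) below asks for.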
The idea of the Wang–Landau algorithm is to do a random walk in energy space by flipping single spins at random and to accept the changes with a probability that is proportional to the reciprocal of the density of states. That is, energy values that would be visited often using a simple random walk would be visited less often because they have a bigger density of states. There is only one problem—we don’t know the density of states. We will see that the Wang–Landau algorithm estimates the density of states at the same time that it does a random walk in phase space. For simplicity, we discuss the algorithm in the context of the Ising model for which E is a discrete variable.

1. Start with an initial arbitrary configuration of spins and a guess for the density of states. The simplest guess is to set g(E) = 1 for all possible energies E.

2. Choose a spin at random and make a trial flip. Compute the energy before the flip, E1, and after, E2, and accept the change with probability

p(E1 → E2) = min[g(E1)/g(E2), 1]. (15.74)

Equation (15.74) implies that if g(E2) ≤ g(E1), the state with energy E2 is always accepted; otherwise, it is accepted with probability g(E1)/g(E2). That is, the state with energy E2 is accepted if a random number r ≤ g(E1)/g(E2).

3. Suppose that after step (2) the energy of the system is E. (E is E2 if the change is accepted or remains at E1 if the change is not accepted.) Then

g(E) → f g(E) (15.75)
H(E) → H(E) + 1. (15.76)

That is, we multiply the current value of g(E) by the modification factor f > 1, and we update the existing entry for H(E) in the energy histogram. Because g(E) becomes very large, in practice we must work with the logarithm of the density of states so that ln(g(E)) will fit into double precision numbers. Therefore, each update of the density of states is implemented as ln(g(E)) → ln(g(E)) + ln(f), and the ratio of the density of states is computed as exp[ln(g(E1)) − ln(g(E2))].

4. A reasonable choice of the initial modification factor is f = f0 = e ≈ 2.71828. If f0 is too small, the random walk will need a very long time to reach all possible energies; however, too large a choice of f0 will lead to large statistical errors.

5. Proceed with the random walk in energy space until a “flat” histogram H(E) is obtained, that is, until all the possible energy values are visited an approximately equal number of times. If the histogram were truly flat, all the possible energies would have been visited an equal number of times. Of course, it is impossible to obtain a perfectly flat histogram, and we will say that H(E) is flat when H(E) for all possible E is not less than p times the average histogram ⟨H(E)⟩; p is chosen according to the size and the complexity of the system and the desired accuracy of the density of states. For the two-dimensional Ising model on small lattices, p can be chosen to be as high as 0.95, but for large systems the criterion for flatness may never be satisfied if p is too close to unity.

6. Once the flatness criterion has been satisfied, reduce the modification factor f using a function such as f1 = √f0, reset the histogram to H(E) = 0 for all values of E, and begin the next iteration of the random walk, during which the density of states is modified by f1 at each step. The density of states is not reset during the simulation. We continue performing the random walk until the histogram H(E) is again flat.
7. Reduce the modification factor f_{i+1} = √f_i, reset the histogram to H(E) = 0 for all values of E, and continue the random walk. Stop the simulation when f is smaller than a predefined value (such as f_final = exp(10^{−8}) ≈ 1.00000001). The modification factor acts as a control parameter for the accuracy of the density of states during the simulation and also determines how many Monte Carlo sweeps are necessary for the entire simulation.

At the end of the simulation, the algorithm provides only a relative density of states. To determine the normalized density of states gn(E), we can either use the fact that the total number of states for the Ising model is Σ_E g(E) = 2^N, or that the number of ground states (for which E = −2N) is 2. The latter normalization guarantees the accuracy of the density of states at low energies, which is important in the calculation of thermodynamic quantities at low temperature. If we apply the former condition, we cannot guarantee the accuracy of g(E) for energies at or near the ground state, because the rescaling factor is dominated by the maximum density of states. We can use one of these two normalization conditions to obtain the absolute density of states and use the other normalization condition to check the accuracy of our result.

Problem 15.30. Sampling the density of states

(a) Implement the Wang–Landau algorithm for the two-dimensional Ising model for L = 4, 8, and 16. For simplicity, choose p = 0.8 as your criterion for flatness. How many Monte Carlo steps per spin are needed for each iteration? Determine the density of states and describe its qualitative dependence on E.

(b) Compute P(E) = g(E) e^{−βE}/Z for different temperatures for the L = 16 system. If T = 0.1, what range of energies will contribute to the specific heat? What is the range of relevant energies for T = 1.0, T = Tc, and T = 4.0?

(c) Use the density of states that you computed in part (a) to compute the mean energy, the specific heat, the free energy, and the entropy as a function of temperature. Compare your results to your results for E and C that you found using the Metropolis algorithm in Problem 15.16.

(d) Use the Wang–Landau algorithm to determine the density of states for the one-dimensional Ising model. In this case you can compare your computed values of g(E) to the exact answer:

g(E) = 2 N! / [i! (N − i)!], (15.77)

where E = 2i − N, i = 0, 2, ..., N, and N is even. How does the accuracy of the computed values of g(E) depend on the choice of p for the flatness criterion? (Exact results are available for g(E) for the two-dimensional Ising model as well, but no explicit combinatorial formula exists. See the article by Beale.)

(e)∗ The results that you have obtained so far have probably convinced you that the Wang–Landau algorithm is ideal for simulating a variety of systems with many degrees of freedom. What about critical slowing down? Does the Wang–Landau algorithm overcome this limitation of other single spin flip algorithms? To gain some insight, we ask, given the exact g(E), how efficiently does the Wang–Landau algorithm sample the different values of E? Use either the exact density of states in two dimensions computed by Beale or the approximate one that you computed in part (a) and set f = 1.
Because the system is doing a random walk in energy space, it is reasonable to compute the diffusion constant of the random walker in energy space:

D_E(t) = ⟨[E(t) − E(0)]²⟩/t, (15.78)

where t is the time difference, and the choice of the time origin is arbitrary. The idea is to find the dependence of D on the energy E of the system at a particular time origin. How long does it take the system to return to this energy? Run for a sufficiently long time so that D_E is independent of t. Plot D_E as a function of E. Where is D a maximum? If time permits, determine D_E at the energy Ec corresponding to the critical temperature. How does D_{Ec} depend on L?

15.13 More Applications

You are probably convinced that Monte Carlo methods are powerful, flexible, and applicable to a wide variety of systems. Extensions to the Monte Carlo methods that we have not discussed include multiparticle moves, biased moves where particles tend to move in the direction of the force on them, bit manipulation for Ising-like models, and the use of multiple processors to update different parts of a large system simultaneously. We also have not described the simulation of systems with long-range potentials such as Coulombic systems and dipole-dipole interactions. For these potentials, it is necessary to include the interactions of the particles in the center cell with the infinite set of periodic images.

We conclude this chapter with a discussion of Monte Carlo methods in a context that might seem to have little in common with the types of problems we have discussed. This context is called multivariate or combinatorial optimization, a fancy way of saying, “How do you find the global minimum of a function that depends on many parameters?” Problems of this type arise in many areas of scheduling and design as well as in physics, biology, and chemistry.

We explain the nature of this type of problem for the traveling salesman problem, although we would prefer to call it the traveling peddler or traveling salesperson problem. Suppose that a salesman wishes to visit N cities and follow a route such that no city is visited more than once and the end of the trip coincides with the beginning. Given these constraints, the problem is to find the optimum route such that the total distance traveled is a minimum. An example of N = 8 cities and a possible route is shown in Figure 15.7.

Figure 15.7: What is the optimum route for this random arrangement of N = 8 cities? The route begins and ends at city W. A possible route is shown.

All known exact methods for determining the optimal route require a computing time that increases as e^N, and hence, in practice, an exact solution can be found only for a small number of cities. (The traveling salesman problem belongs to a large class of problems known as NP-complete, where NP refers to nondeterministic polynomial. No algorithm that runs in a time proportional to a finite polynomial in N is known for such problems on standard computers, although polynomial time algorithms exist for hypothetical nondeterministic machines.) What is a reasonable estimate for the maximum number of cities that you can consider without the use of a computer?

To understand the nature of the different approaches to the traveling salesman problem, consider the plot in Figure 15.8 of the “energy” or “cost” function E(a). We can associate E(a) with the length of the route and interpret a as a parameter that represents the order in which the cities are visited.
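For the traveling salesman problem, E(a) is simply the total length of the closed route. A minimal sketch of this cost function follows; the arrays x, y (city coordinates), and route (order of visits) are assumptions about how the state is stored.

double routeLength(double[] x, double[] y, int[] route) {
  double length = 0;
  int N = route.length;
  for (int i = 0; i < N; i++) {
    int a = route[i];
    int b = route[(i+1)%N];   // the trip ends where it began
    length += Math.sqrt((x[a]-x[b])*(x[a]-x[b]) + (y[a]-y[b])*(y[a]-y[b]));
  }
  return length;
}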
If E(a) has several local minima, what is a good strategy for finding the global (absolute) minimum of E(a)? One way is to vary a systematically and find the value of E everywhere. This way corresponds to an exact enumeration method and would mean knowing the length of each possible route, an impossible task if the number of cities is too large. Another way is to use a heuristic method, that is, an approximate method for finding a route that is close to the absolute minimum.

One strategy is to choose a value of a, generate a small random change δa, and accept this change if E(a + δa) is less than or equal to E(a). This iterative improvement strategy corresponds to a search for steps that lead downhill (see Figure 15.8). Because this strategy usually leads to a local and not a global minimum, it is useful to begin from several initial choices of a and to keep the best result. What would be the application of this type of strategy to the salesman problem?

Figure 15.8: Plot of the function E(a) as a function of the parameter a.

Because we cannot optimize the path exactly when N becomes large, we have to be satisfied with solving the optimization problem approximately and finding a relatively good local minimum. To understand the motivation for the simulated annealing algorithm, consider a seemingly unrelated problem. Suppose we wish to make a perfect single crystal. You might know that we should start with the material at a high temperature at which the material is a liquid melt and then gradually lower the temperature. If we lower the temperature too quickly (a rapid quench), the resulting crystal would have many defects or not become a crystal at all. The gradual lowering of the temperature is known as annealing.

The method of annealing can be used to estimate the minimum of E(a). We choose a value of a, generate a small random change δa, and calculate E(a + δa). If E(a + δa) is less than or equal to E(a), we accept the change. However, if ∆E = E(a + δa) − E(a) > 0, we accept the change with a probability p = e^{−∆E/T}, where T is an effective temperature. This procedure is the familiar Metropolis algorithm with the temperature playing the role of a control parameter. The simulated annealing process consists of first choosing a value of T for which most moves are accepted and then gradually lowering the temperature. At each temperature, the simulation should last long enough for the system to reach quasiequilibrium. The annealing schedule, that is, the rate of temperature decrease, determines the quality of the solution.

The idea is to allow moves that result in solutions of worse quality than the current solution (uphill moves) in order to escape from local minima. The probability of making such a move is decreased during the search. The slower the temperature is lowered, the higher the chance of finding the optimum solution, but the longer the run time. The effective use of simulated annealing depends on finding an annealing schedule that yields good solutions without taking too much time. It has been proven that if the cooling rate is sufficiently slow, the absolute (global) minimum will eventually be reached. The bounds for “sufficiently slow” depend on the properties of the search landscape (the nature of E(a)) and are exceeded for most problems of interest. However, simulated annealing is usually superior to conventional heuristic algorithms.
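A minimal sketch of the annealing loop described above, with a simple geometric cooling schedule, might look like the following. The methods proposeChange, acceptChange, and rejectChange and the constants T0, Tmin, and stepsPerT are assumptions, not a prescription from the text.

public void anneal() {
  for (double T = T0; T > Tmin; T *= 0.99) {    // geometric annealing schedule
    for (int step = 0; step < stepsPerT; step++) {
      double dE = proposeChange();              // trial change in E(a), for example,
                                                // interchange two cities in the route
      if (dE <= 0 || Math.random() < Math.exp(-dE/T)) {
        acceptChange();                         // downhill moves and some uphill moves
      } else {
        rejectChange();                         // otherwise restore the old route
      }
    }
  }
}

The cooling factor 0.99 and the number of steps per temperature control the tradeoff between the quality of the solution and the run time.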
The moral of the simulated annealing method is that sometimes it is necessary to climb a hill to reach a valley. The first application of the method of simulated annealing was to the optimal design of computers. In Problem 15.31 we apply this method to the traveling salesman problem.

Problem 15.31. Simulated annealing and the traveling salesman problem

(a) Generate a random arrangement of N = 8 cities in a square of linear dimension L = √N and calculate the optimum route by hand. Then write a Monte Carlo program and apply the method of simulated annealing to this problem. For example, use two arrays to store the x- and y-coordinates of each city and an array to store the distances between them. The state of the system, that is, the route represented by a sequence of cities, can be stored in another array. The length of this route is associated with the energy of an imaginary thermal system. A reasonable choice for the initial temperature is one that is of the same order as the initial energy. One way to generate a random rearrangement of the route is to choose two cities at random and to interchange the order of visit. Choose this method or one that you devise and find a reasonable annealing schedule. Compare your annealing results to exact results whenever possible. Extend your results to larger N, for example, N = 12, 24, and 48. For a given annealing schedule, determine the probability of finding a route of a given length. More suggestions can be found in the references.

(b) The microcanonical Monte Carlo algorithm (demon) discussed in Section 15.3 can also be used to do simulated annealing. The advantages of the demon algorithm are that it is deterministic and allows large temperature fluctuations. One way to implement the analog of simulated annealing is to impose a maximum value on the energy of the demon, Ed,max, which is gradually decreased. Guo et al. choose Ed,max to be initially equal to √N/4. Their results are comparable to the usual simulated annealing method but require approximately half the CPU time. Apply this method to the same city positions that you considered in part (a) and compare your results.

15.14 Projects

Many of the original applications of Monte Carlo methods were done for systems of approximately one hundred particles and lattices of order 32² spins. It would be instructive to redo many of these applications with much better statistics and with larger system sizes. In the following, we discuss some additional recent developments, but we have omitted other important topics such as Brownian dynamics and umbrella sampling. More ideas for projects can be found in the references.

Project 15.32. Overcoming critical slowing down

The usual limiting factor of most simulations is the speed of the computer. Of course, one way to overcome this problem is to use a faster computer. Near a continuous phase transition, the most important limiting factor on even the fastest available computers is the existence of critical slowing down (see Problem 15.19). In this project we discuss the nature of critical slowing down and ways of overcoming it in the context of the Ising model. As we have mentioned, the existence of critical slowing down is related to the fact that the size of the correlated regions of spins becomes very large near the critical point.
15.14 Projects

Many of the original applications of Monte Carlo methods were done for systems of approximately one hundred particles and lattices of order 32^2 spins. It would be instructive to redo many of these applications with much better statistics and with larger system sizes. In the following, we discuss some additional recent developments, but we have omitted other important topics such as Brownian dynamics and umbrella sampling. More ideas for projects can be found in the references.

Project 15.32. Overcoming critical slowing down

The usual limiting factor of most simulations is the speed of the computer. Of course, one way to overcome this problem is to use a faster computer. Near a continuous phase transition, the most important limiting factor on even the fastest available computers is the existence of critical slowing down (see Problem 15.19). In this project we discuss the nature of critical slowing down and ways of overcoming it in the context of the Ising model. As we have mentioned, the existence of critical slowing down is related to the fact that the size of the correlated regions of spins becomes very large near the critical point.

The large size of the correlated regions and the corresponding divergent behavior of the correlation length ξ near Tc imply that the time τ required for a region to lose its coherence becomes very long if a local dynamics is used. At T = Tc, τ ∼ L^z for L ≫ 1. For single spin flip algorithms, z ≈ 2, and τ becomes very large for L ≫ 1. On a serial computer, the CPU time needed to obtain n configurations increases as L^2, the time needed to visit L^2 spins. This factor of L^2 is expected and not a problem because a larger system contains proportionally more information. However, the time needed to obtain n approximately independent configurations is of order τL^2 ∼ L^{2+z} ≈ L^4 for the Metropolis algorithm. We conclude that an increase of L by a factor of 10 requires 10^4 times more computing time. Hence, the existence of critical slowing down limits the maximum value of L that can be considered.

If we are interested only in the static properties of the Ising model, the choice of dynamics is irrelevant as long as the transition probability satisfies the detailed balance condition (15.18). It is reasonable to look for a global algorithm for which groups or clusters of spins are flipped simultaneously. We are already familiar with cluster properties in the context of percolation (see Chapter 12). A naive definition of a cluster of spins might be a domain of parallel nearest neighbor spins. We can make this definition explicit by introducing a bond between any two nearest neighbor spins that are parallel. The introduction of a bond between parallel spins defines a site-bond percolation problem. More generally, we may assume that such a bond exists with probability p and that this bond probability depends on the temperature T. The dependence of p on T can be determined by requiring that the percolation transition of the clusters occurs at the Ising critical point and by requiring that the critical exponents associated with the clusters be identical to the analogous thermal exponents. For example, we can define a critical exponent νp to characterize the divergence of the connectedness length of the clusters near pc. The analogous thermal exponent ν quantifies the divergence of the thermal correlation length ξ near Tc. We will argue in the following that these (and other) critical exponents are identical if we define the bond probability as

p = 1 − e^{−2J/kT}   (bond probability). (15.79)

The relation (15.79) holds for any spatial dimension. What is the value of p at T = Tc for the two-dimensional Ising model on the square lattice?

Figure 15.9: (a) A cluster of two up spins. (b) A cluster of two down spins. The filled and open circles represent the up and down spins, respectively. Note the bond between the two spins in the cluster. Adapted from Newman and Barkema.

A simple argument for the temperature dependence of p in (15.79) is as follows. Consider the two configurations in Figure 15.9, which differ from one another by the flip of the cluster of two spins. In Figure 15.9(a) the six nearest neighbor spins of the cluster are in the opposite direction and, hence, are not part of the cluster. Thus, the probability of this configuration with a cluster of two spins is p e^{βJ} e^{−6βJ}, where p is the probability of a bond between the two up spins, e^{βJ} is proportional to the probability that these two spins are parallel, and e^{−6βJ} is proportional to the probability that the six nearest neighbors are antiparallel.
In Figure 15.9(b) the cluster spins have been flipped, and the possible bonds between the cluster spins and its nearest neighbors have to be "broken." The probability of this configuration with a cluster of two (down) spins is p(1 − p)^6 e^{βJ} e^{6βJ}, where the factor of (1 − p)^6 is the probability that the six nearest neighbor spins are not part of the cluster. Because we want the probability that a cluster is flipped to be unity, we need to have the probability of the two configurations and their corresponding clusters be the same. Hence, we must have

p e^{βJ} e^{−6βJ} = p (1 − p)^6 e^{βJ} e^{6βJ}, (15.80)

or (1 − p)^6 = e^{−12βJ}. It is straightforward to solve for p and obtain the relation (15.79).

Now that we know how to generate clusters of spins, we can use these clusters to construct a global dynamics instead of flipping only one spin at a time as in the Metropolis algorithm. The idea is to grow a single (site-bond) percolation cluster in a way that is analogous to the single (site) percolation cluster algorithm discussed in Section 13.1. The algorithm can be implemented by the following steps:

(i) Choose a seed spin at random. Its four nearest neighbor sites (on the square lattice) are the perimeter sites. Form an ordered array corresponding to the perimeter spins that are parallel to the seed spin and define a counter for the total number of perimeter spins.

(ii) Choose the first spin in the ordered perimeter array. Remove it from the array and replace it by the last spin in the array. Generate a random number r. If r ≤ p, the bond exists between the two spins, and the perimeter spin is added to the cluster.

(iii) If the spin is added to the cluster, inspect its parallel perimeter spins. If any of these spins are not already a part of the cluster, add them to the end of the array of perimeter spins.

(iv) Repeat steps (ii) and (iii) until no perimeter spins remain.

(v) Flip all the spins in the single cluster.

This algorithm is known as single cluster flip or Wolff dynamics. Note that bonds, rather than sites, are tested so that a spin might have more than one chance to join a cluster. In the following, we consider both the static and dynamical properties of the two-dimensional Ising model using the Wolff algorithm to generate the configurations.
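A sketch of steps (i)–(v) follows. We use a last-in, first-out collection instead of the swap-with-last array bookkeeping described in step (ii), which does not change the statistics; the class and method names are ours, and the spins are assumed to be stored as ±1 in a two-dimensional array with periodic boundary conditions.

import java.util.ArrayDeque;
import java.util.Random;

public class WolffStep {
  int L;
  int[][] spin;                               // spins stored as +1 or -1
  Random rng = new Random();

  void oneClusterFlip(double p) {             // p = 1 - exp(-2J/kT), relation (15.79)
    boolean[][] inCluster = new boolean[L][L];
    ArrayDeque<int[]> perimeter = new ArrayDeque<>();
    int i = rng.nextInt(L), j = rng.nextInt(L);     // step (i): choose a seed spin
    int seed = spin[i][j];
    inCluster[i][j] = true;
    addParallelNeighbors(i, j, seed, perimeter);
    while (!perimeter.isEmpty()) {            // steps (ii)-(iv)
      int[] site = perimeter.pop();
      int x = site[0], y = site[1];
      if (inCluster[x][y]) continue;          // a spin may appear more than once; each entry is one bond test
      if (rng.nextDouble() <= p) {            // the bond exists: add the spin to the cluster
        inCluster[x][y] = true;
        addParallelNeighbors(x, y, seed, perimeter);
      }
    }
    for (int x = 0; x < L; x++)               // step (v): flip every spin in the cluster
      for (int y = 0; y < L; y++)
        if (inCluster[x][y]) spin[x][y] = -spin[x][y];
  }

  void addParallelNeighbors(int x, int y, int seed, ArrayDeque<int[]> perimeter) {
    int[][] nn = {{1,0},{-1,0},{0,1},{0,-1}};
    for (int[] d : nn) {
      int nx = (x + d[0] + L) % L, ny = (y + d[1] + L) % L;   // periodic boundaries
      if (spin[nx][ny] == seed) perimeter.push(new int[]{nx, ny});
    }
  }
}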
(a) Modify your program for the Ising model on a square lattice so that single cluster flip dynamics (the Wolff algorithm) is used. Compute the mean energy and magnetization for L = 16 as a function of T for T = 2.0 to 2.7 in steps of 0.1. Compare your results to those obtained using the Metropolis algorithm. How many cluster flips do you need to obtain comparable accuracy at each temperature? Is the Wolff algorithm more efficient at every temperature near Tc?

(b) Fix T at the critical temperature of the infinite lattice (Tc = 2/ln(1 + √2)) and use finite size scaling to estimate the values of the various static critical exponents, for example, γ and α. Compare your results to those obtained using the Metropolis algorithm.

(c) Because we are generating site-bond percolation clusters, we can study their geometrical properties as we did for site percolation. For example, measure the distribution ns of cluster sizes at p = pc (see Problem 13.3). How does ns depend on s for large s (see Project 13.15)? What is the fractal dimension of the clusters in the Ising model at T = Tc?

(d) The natural unit of time for single cluster flip dynamics is the number of cluster flips tcf. Measure CM(tcf) and/or CE(tcf) and estimate the corresponding correlation time τcf for T = 2.5, 2.4, 2.3, and Tc for L = 16. As discussed in Problem 15.19, τcf can be found from the relation τcf = Σ_{tcf=1} C(tcf). The sum is cut off at the first negative value of C(tcf). Estimate the value of zcf from the relation τcf ∼ L^{zcf}.

(e) To compare our results for the Wolff algorithm to our results for the Metropolis algorithm, we should use the same unit of time. Because only a fraction of the spins are updated at each cluster flip, the time tcf is not equal to the usual unit of time, which corresponds to an update of the entire lattice or one Monte Carlo step per spin. We have that τ measured in Monte Carlo steps per spin is related to τcf by τ = τcf ⟨c⟩/L^2, where ⟨c⟩ is the mean number of spins in the single clusters, and L^2 is the number of spins in the entire lattice. Verify that the mean cluster size scales as ⟨c⟩ ∼ L^{γ/ν} with γ = 7/4 and ν = 1. (The quantity ⟨c⟩ is the same quantity as the mean cluster size S defined in Chapter 12. The exponents characterizing the divergence of the various properties of the clusters are identical to the analogous thermal exponents.)

(f) To obtain the value of z that is directly comparable to the value found for the Metropolis algorithm, we need to rescale the time as in part (e). We have that τ ∼ L^z ∝ L^{zcf} L^{γ/ν} L^{−d}. Hence, z is related to the measured value of zcf by z = zcf − (d − γ/ν). What is your estimated value of z? (It has been estimated that zcf ≈ 0.50 for the d = 2 Ising model, which would imply that z ≈ 0.25.)

(g) One of the limitations of the usual implementation of the Metropolis algorithm is that only one spin is flipped at a time. However, there is no reason why we could not choose f spins at random, compute the change in energy ∆E for flipping these f spins, and accept or reject the trial move in the usual way according to the Boltzmann probability. Explain why this generalization of the Metropolis algorithm would be very inefficient, especially if f ≫ 1. We conclude that the groups of spins to be flipped must be chosen with the physics of the system in mind and not simply at random.

Another cluster algorithm is to assign bonds between all pairs of parallel spins with probability p. As usual, no bonds are included between sites that have different spin orientations. From this configuration of bonds, we can form clusters of spins using one of the cluster identification algorithms we discussed in Chapter 12. The smallest cluster contains a single spin. After the clusters have been identified, all the spins in each cluster are flipped with probability 1/2. This algorithm is known as the Swendsen-Wang algorithm and preceded the Wolff algorithm. Because the Wolff algorithm is easier to program and gives a smaller value of z than the Swendsen-Wang algorithm for the d = 3 and d = 4 Ising models, the Wolff algorithm is more commonly used.

Project 15.33. Invaded cluster algorithm

In Problem 13.7 we found that invasion percolation is an example of a self-organized critical phenomenon. In this cluster growth algorithm, random numbers are independently assigned to the bonds of a lattice. The growth starts from the seed sites of the left-most column. At each step the cluster grows by the occupation of the perimeter bond with the smallest random number. The growth continues until the cluster satisfies a stopping condition.
We found that if we stop adding sites when the cluster is comparable in extent to the linear dimension L, then the fraction of bonds that are occupied approaches the percolation threshold pc as L → ∞. The invasion percolation algorithm automatically finds the percolation threshold!

Machta and co-workers have used this idea to find the critical temperature of a spin system without knowing its value in advance. For simplicity, we will discuss their algorithm in the context of the Ising model, although it can be easily generalized to the q-state Potts model (see the references). Consider a lattice on which there is a spin configuration {si}. The bonds of the lattice are assigned a random order. Bonds (i,j) are tested in this assigned order to see if si is parallel to sj. If so, the bond is occupied, and spins i and j are part of the same cluster. Otherwise, the bond is not occupied and is not considered for the remainder of the current Monte Carlo step. The set of occupied bonds partitions the lattice into clusters of connected sites. The clusters can be found using the Newman–Ziff algorithm (see Section 12.3). The cluster structure evolves until a stopping condition is satisfied. Then a new spin configuration is obtained by flipping each cluster with probability 1/2, thus completing one Monte Carlo step. The fraction f of bonds that were occupied during the growth process and the energy of the system are measured. The bonds are then randomly reordered, and the process begins again. Note that the temperature is not an input parameter. If open boundary conditions are used, the appropriate stopping rule is that a cluster spans the lattice (see Chapter 12, page 450). For periodic boundary conditions, the spanning rule discussed in Project 12.17 is appropriate.

Write a program to simulate the invaded cluster algorithm for the Ising model on the square lattice. Start with all spins up and determine how many Monte Carlo steps are needed for equilibration. How does this number compare to that required by the Metropolis algorithm at the critical temperature for the same value of L? An estimate for the critical temperature can be found from the relation (15.79) with f corresponding to p. After you are satisfied that your program is working properly, determine the dependence of the critical temperature on the concentration c of nonmagnetic impurities; that is, randomly place nonmagnetic impurities on a fraction c of the sites.

Project 15.34. Physical test of random number generators

In Section 7.9 we discussed various statistical tests for the quality of random number generators. In this project we will find that the usual statistical tests might not be sufficient for determining the quality of a random number generator for a particular application. The difficulty is that the quality of a random number generator for a specific application depends in part on how the subtle correlations that are intrinsic to all deterministic random number generators couple to the way that the random number sequences are used. In this project we explore the quality of two random number generators when they are used to implement single spin flip dynamics (the Metropolis algorithm) and single cluster flip dynamics (the Wolff algorithm) for the two-dimensional Ising model.
(a) Write methods to generate sequences of random numbers based on the linear congruential algorithm

xn = 16807 xn−1 mod (2^31 − 1), (15.81)

and the generalized feedback shift register (GFSR) algorithm

xn = xn−103 ⊕ xn−250. (15.82)

In both cases xn is the nth random number. Both algorithms require that xn be divided by the largest possible value of xn to obtain numbers in the range 0 ≤ xn < 1. The GFSR algorithm requires bit manipulation (a sketch of both generators follows this problem). Which random number generator does a better job of passing the various statistical tests discussed in Problem 7.35?

(b) Use the Metropolis algorithm and the linear congruential random number generator to determine the mean energy per spin E/N and the specific heat (per spin) C for the L = 16 Ising model at T = Tc = 2/ln(1 + √2). Make ten independent runs (that is, ten runs that use different random number seeds) and compute the standard deviation of the means σm from the ten values of E/N and C, respectively. Published results by Ferrenberg, Landau, and Wong are for 10^6 Monte Carlo steps per spin for each run. Calculate the differences δe and δc between the average of E/N and C over the ten runs and the exact values (to five decimal places), E/N = −1.45306 and C = 1.49871. If the ratio δ/σm for the two quantities is of order unity, then the random number generator does not appear to be biased. Repeat your runs using the GFSR algorithm to generate the random number sequences. Do you find any evidence of statistical bias?

(c) Repeat part (b) using Wolff dynamics. Do you find any evidence of statistical bias?

(d) Repeat the computations in parts (b) and (c) using the random number generator supplied with your programming language.
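A minimal sketch of the two generators of part (a) is given below; the class, method, and variable names are ours. Note that the GFSR lag table must first be filled with 250 numbers from another generator before (15.82) can be applied.

public class SimpleRNG {
  static long seed = 12345;                   // any value in 1 ... 2^31 - 2

  // Linear congruential generator x_n = 16807 x_{n-1} mod (2^31 - 1), using 64-bit arithmetic.
  static double lcg() {
    seed = (16807L*seed) % 2147483647L;
    return seed/2147483647.0;                 // uniform in [0, 1)
  }

  // Generalized feedback shift register x_n = x_{n-103} XOR x_{n-250}, kept in a circular
  // buffer; lag[idx] currently holds x_{n-250}.
  static int[] lag = new int[250];
  static int idx = 0;
  static { for (int i = 0; i < 250; i++) { lcg(); lag[i] = (int) seed; } }  // warm-up fill

  static double gfsr() {
    int x = lag[(idx + 147) % 250] ^ lag[idx];  // (idx + 250 - 103) mod 250 = idx + 147
    lag[idx] = x;                               // overwrite the oldest entry
    idx = (idx + 1) % 250;
    return (x & 0x7fffffff)/2147483648.0;       // mask the sign bit and scale to [0, 1)
  }
}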
Project 15.35. Nucleation and the Ising model

(a) Equilibrate the two-dimensional Ising model at T = 4Tc/9 and B = 0.3 for a system with L ≥ 50. What is the equilibrium value of m? Then flip the magnetic field so that it points down, that is, B = −0.3. Use the Metropolis algorithm and plot m as a function of the time t (the number of Monte Carlo steps per spin). What is the qualitative behavior of m(t)? Does it fluctuate about a positive value for a time long enough to determine various averages? If so, the system can be considered to have been in a metastable state. Watch the spins evolve for a time before m changes sign. Visually determine a place in the lattice where a "droplet" of the stable phase (down spins) first appears and then grows. Change the random number seed and rerun the simulation. Does the droplet appear in the same spot at the same time? Can the magnitude of the field be increased further, or is there an upper bound above which a metastable state is not well defined?

(b) As discussed in Project 15.32, we can define clusters of spins by placing a bond with probability p between parallel spins. In this case there is an external field, and the proper definition of the clusters is more difficult. For simplicity, assume that there is a bond between all nearest-neighbor down spins and find all the clusters of down spins. One way to identify the droplet that initiates the decay of the metastable state is to monitor the number of spins in the largest cluster as a function of time after the quench. At what time does the number of spins in the largest cluster begin to grow quickly? This time is an estimate of the nucleation time. Another way of estimating the nucleation time is to follow the evolution of the center of mass of the largest cluster. For early times after the quench, the center of mass position has large fluctuations. However, at a certain time these fluctuations decrease considerably, which is another criterion for the nucleation time. What is the order of magnitude of the nucleation time?

(c) While the system is in a metastable state, clusters of down spins grow and shrink randomly until eventually one of the clusters becomes large enough to grow, nucleation occurs, and the system decays to its stable macroscopic state. The cluster that initiates this decay is called the nucleating droplet. This type of nucleation is due to spontaneous thermal fluctuations and is called homogeneous nucleation. Although the criteria for the nucleation time that we used in part (b) are plausible, they are not based on fundamental considerations. From theoretical considerations the nucleating droplet can be thought of as a cluster that just makes it to the top of the saddle point of the free energy that separates the metastable and stable states. We can identify the nucleating droplet by using the fact that a saddle point structure should initiate the decay of the metastable state 50% of the time. The idea is to save the spin configurations at regular intervals at about the time that nucleation is thought to have occurred. We then restart the simulation using a saved configuration at a certain time and use a different random number sequence to flip the spins. If we have intervened at a time such that the largest cluster decays in more than 50% of the trials, then the intervention time (the time at which we changed the random number seed) is before nucleation. Similarly, if less than 50% of the clusters decay, the intervention is after the nucleation time. The nucleating droplet is the cluster that decays in approximately half of the trial interventions. Because we need to do a number of interventions (usually in the range 20–100) at different times, the intervention method is much more CPU intensive than the other criteria. However, it has the advantage that it has a sound theoretical basis. Redo some of the simulations that you did in part (b) and compare the different estimates of the nucleation time. What is the nature and size of the nucleating droplet? If time permits, determine the probability that the system nucleates at time t for a given quench depth. (Measure the time t after the flip of the field.)

(d) Heterogeneous nucleation occurs in nature because of the presence of impurities, defects, or walls. One way of simulating heterogeneous nucleation in the Ising model is to fix a certain number of spins in the direction of the stable phase (down). For simplicity, choose the impurity to be five spins in the shape of a + sign. What is the effect of the impurity on the lifetime of the metastable state? What is the probability of droplet growth on and off the impurity as a function of quench depth B?

(e) The questions raised in parts (b)–(d) become even more interesting when the interaction between the spins extends beyond nearest neighbors. Assume that a given spin interacts with all spins that are within a distance R with an interaction strength of 4J/q, where q is the number of spins within the interaction range R. (Note that q = 4 for nearest neighbor interactions on the square lattice.) A good choice is R = 10, although your preliminary simulations should be for smaller R. How does the value of Tc change as R is increased?
Project 15.36. The n-fold way: Simulations at low temperature

Monte Carlo simulations become very inefficient at low temperatures because almost all trial configurations will be rejected. For example, consider an Ising model for which all spins are up, but a small magnetic field is applied in the negative direction. The equilibrium state will have most spins pointing down. Nevertheless, if the magnetic field is small and the temperature is low enough, equilibration will take a very long time. What we need is a more efficient way of sampling configurations if the acceptance probability is low. The n-fold way algorithm is one such method. The idea is to accept more low probability configurations but to weight them appropriately.

If we use the usual Metropolis rule, then the probability of flipping the ith spin is

pi = min(1, e^{−∆E/kT}). (15.83)

One limitation of the Metropolis algorithm is that it becomes very inefficient if the probabilities pi are very small. If we sum over all the spins, then we can define the total weight

Q = Σ_i pi. (15.84)

The idea is to choose a spin to flip (with probability one) by computing a random number rQ between 0 and Q and finding the spin i that satisfies the condition

Σ_{k=0}^{i−1} pk ≤ rQ < Σ_{k=0}^{i} pk. (15.85)

There are two more ingredients we need to make this algorithm practical. We need to determine how long a configuration would remain unchanged if we had used the Metropolis algorithm. Also, the algorithm would be very inefficient because on average the computation of which spin to flip from (15.85) would take O(N) operations. This second problem can be easily overcome by realizing that there are only a few possible values of pi. For example, for the Ising model on a square lattice in a magnetic field, there are only n = 10 possible values of pi. Thus, instead of (15.85), we have

Σ_{α=0}^{i−1} nα pα ≤ rQ < Σ_{α=0}^{i} nα pα, (15.86)

where α labels one of the n possible values of pi, or classes, and nα is the number of spins in class α. Hence, instead of O(N) calculations, we need to perform only O(n) calculations. Once we know which class we have chosen, we can randomly flip one of the spins in that class.

Next we need to determine the time spent in a configuration. The probability in one Metropolis Monte Carlo step of choosing a given spin at random is 1/N, and the probability of actually flipping that spin is pi, which is given by (15.83). Thus, the probability of flipping any spin is

(1/N) Σ_{i=0}^{N−1} pi = (1/N) Σ_{α=0}^{n−1} nα pα = Q/N. (15.87)

The probability of not flipping any spin is q ≡ 1 − Q/N, and the probability of not flipping after s steps is q^s. Thus, if we generate a random number r between 0 and 1, the time s in Monte Carlo steps per spin to remain in the current configuration will be determined by solving

q^{s−1} ≤ r < q^s. (15.88)

If Q/N ≪ 1, then both sides of (15.88) are approximately equal, and we can approximate s by

s ≈ ln r/ln q = ln r/ln(1 − Q/N) ≈ −(N/Q) ln r. (15.89)

That is, we would have to wait s Monte Carlo steps per spin on the average before we would flip a spin using the Metropolis algorithm. Note that the random number r in (15.88) and (15.89) should not be confused with the random number rQ in (15.86).
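The two nonstandard ingredients, choosing a class according to (15.86) and computing the lifetime (15.89), are illustrated by the following sketch; the array and method names are ours.

import java.util.Random;

public class NFoldWay {
  // p[alpha] is the flip probability of class alpha; n[alpha] is the number of spins in it.
  static int chooseClass(double[] p, int[] n, Random rng) {
    double Q = 0;
    for (int alpha = 0; alpha < p.length; alpha++) Q += n[alpha]*p[alpha];
    double rQ = rng.nextDouble()*Q, sum = 0;
    for (int alpha = 0; alpha < p.length; alpha++) {
      sum += n[alpha]*p[alpha];
      if (rQ < sum) return alpha;             // condition (15.86)
    }
    return p.length - 1;                      // guard against roundoff
  }

  // Time (in Monte Carlo steps per spin) spent in the current configuration, from (15.89).
  static double waitingTime(double Q, int N, Random rng) {
    return -(N/Q)*Math.log(1.0 - rng.nextDouble());  // 1 - r avoids log(0); valid for Q/N << 1
  }
}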
The n-fold way algorithm can be summarized by the following steps:

(i) Start with an initial configuration and determine the class to which each spin belongs. Store all the possible values of pi in an array. Compute Q. Store in an array the number of spins in class α, nα.

(ii) Determine s from (15.89). Accumulate any averages, such as the energy and magnetization, weighted by s. Also accumulate the total time, tTotal += s.

(iii) Choose a class using (15.86) and randomly choose which spin in the chosen class to flip.

(iv) Update the classes of the chosen spin and its four neighbors.

(v) Repeat steps (ii)–(iv).

To conveniently carry out step (iv), set up the following arrays: spinClass[i] returns the class of the ith spin, spinInClass[k][alpha] returns the kth spin in class α, and spinIndex[i] returns the value of k for the ith spin to use in the array spinInClass[k][alpha]. If we define the local field of a spin as the sum of the spins of its four neighbors, then this local field can take on the values {−4, −2, 0, 2, 4}. The ten classes correspond to these five local field values with the center spin equal to −1, plus these five local field values with the center spin equal to +1. If we order these ten classes from 0 to 9, then the class of a spin that is flipped changes by +5 mod 10, and the class of a neighbor changes by ±1 according to the new value of the flipped spin.

(a) Write a program to implement the n-fold way algorithm for the Ising model on a square lattice with an applied magnetic field. Check your program by comparing various averages at a few temperatures with the results from your program using the Metropolis algorithm.

(b) Choose the magnetic field B = −0.5 at the temperature T = 1. Begin with an initial configuration of all spins up and use the n-fold way to estimate how long it takes before the majority of the spins flip. Do the same simulation using the Metropolis algorithm. Which algorithm is more efficient?

(c) Repeat part (b) for other temperature and field values. For what conditions is the n-fold way algorithm more efficient than the standard Metropolis algorithm?

(d) Repeat part (b) for different values of the magnetic field and plot the number of Monte Carlo steps needed to flip the spins as a function of 1/|B| for values of B from 0 to ≈ 3. Average over at least 10 starting configurations for each field value.

Figure 15.10: A typical configuration of the planar model on a 24 × 24 square lattice that has been quenched from T = ∞ to T = 0 and equilibrated for 200 Monte Carlo steps per spin after the quench. Note that there are six vortices. The circle around each vortex is a guide to the eye and is not meant to indicate the size of the vortex.

Project 15.37. The Kosterlitz–Thouless transition

The planar model (also called the x-y model) consists of spins of unit magnitude that can point in any direction in the x-y plane. The energy or Hamiltonian function of the planar model in zero magnetic field can be written as

E = −J Σ_{i,j=nn(i)} [si,x sj,x + si,y sj,y], (15.90)

where si,x represents the x-component of the spin at the ith site, J measures the strength of the interaction, and the sum is over all nearest neighbors. We can rewrite (15.90) in a simpler form by substituting si,x = cos θi and si,y = sin θi. The result is

E = −J Σ_{i,j=nn(i)} cos(θi − θj), (15.91)

where θi is the angle that the ith spin makes with the x-axis.

The most studied case is the two-dimensional model on a square lattice. In this case the mean magnetization ⟨M⟩ = 0 for all temperatures T > 0, but, nevertheless, there is a phase transition at a nonzero temperature TKT, which is known as the Kosterlitz–Thouless (KT) transition.
For T ≤ TKT, the spin-spin correlation function C(r) decreases as a power law; for T > TKT, C(r) decreases exponentially. The power law decay of C(r) for T ≤ TKT implies that every temperature below TKT acts as if it were a critical point. We say that the planar model has a line of critical points. In the following, we explore some of the properties of the planar model and the mechanism that causes the transition.

(a) Write a program that uses the Metropolis algorithm to simulate the planar model on a square lattice using periodic boundary conditions. Because θ, and hence the energy of the system, is a continuous variable, it is not possible to store the previously computed values of the Boltzmann factor for each possible value of ∆E. Instead of computing e^{−β∆E} for each trial change, it is faster to set up an array w such that the array element w(j) = e^{−β∆E}, where j is the integer part of 1000∆E. This procedure leads to an energy resolution of 0.001, which should be sufficient for most purposes.

(b) One way to show that the magnetization ⟨M⟩ vanishes for all T is to compute ⟨θ²⟩, where θ is the angle that a spin makes with the magnetization M for a given configuration. (Although the mean magnetization vanishes, M ≠ 0 at any given time.) Compute ⟨θ²⟩ as a function of the number of spins N at T = 0.1 and show that ⟨θ²⟩ diverges as ln N. Begin with a 4 × 4 lattice and choose the maximum change in θi to be ∆θmax = 1.0. If necessary, change ∆θmax so that the acceptance probability is about 40%. If ⟨θ²⟩ diverges, then the fluctuations in the direction of the spins diverge, which implies that there is no preferred direction for the spins, and hence the mean magnetization vanishes.

(c) Modify your program so that an arrow is drawn at each site to show the orientation of each spin. You can use the Vector2DFrame to draw a lattice of arrows. Look at a typical configuration and analyze it visually. Begin with a 32 × 32 lattice with spins pointing in random directions and do a temperature quench to T = 0.5. (Simply change the value of β in the Boltzmann probability.) Such a quench should lock in some long lived but metastable vortices. A vortex is a region of the lattice where the spins rotate by at least 2π as your eye moves around a closed path (see Figure 15.10). To determine the center of a vortex, choose a group of four spins that are at the corners of a unit square and determine whether the spins rotate by ±2π as your eye goes from one spin to the next in a counterclockwise direction around the square. Assume that the difference between the directions of two neighboring spins, δθ, is in the range −π < δθ < π. A total rotation of +2π indicates the existence of a positive vortex, and a change of −2π indicates a negative vortex. Count the number of positive and negative vortices. Repeat these observations for several configurations. What can you say about the number of vortices of each sign?
(d) Write a method to determine the existence of a vortex for each 1 × 1 square of the lattice (a sketch of such a winding test follows this project). Represent the centers of the vortices using a different symbol to distinguish between a positive and a negative vortex. Do a Monte Carlo simulation to compute the mean energy, the specific heat, and the number of vortices in the range from T = 0.5 to T = 1.5 in steps of 0.1. Use the last configuration at the previous temperature as the first configuration for the next temperature. Begin at T = 0.5 with all θi = 0. Draw the vortex locations for the last configuration at each temperature. Use at least 1000 Monte Carlo steps per spin at each temperature to equilibrate and at least 5000 Monte Carlo steps per spin for computing the averages. Use an 8 × 8 or 16 × 16 lattice if your computer resources are limited and larger lattices if you have sufficient resources. Describe the T-dependence of the energy, the specific heat, and the vorticity (equal to the number of vortices per unit area). Plot the logarithm of the vorticity versus T for T < 1.1. What can you conclude about the T-dependence of the vorticity? Explain why this form is reasonable. Describe the vortex configurations. At what temperature do you find a vortex that appears to be free, that is, a vortex that is not obviously paired with another vortex of opposite sign?

(e) The Kosterlitz–Thouless theory predicts that the susceptibility χ diverges above the transition as

χ ∼ A e^{b/ε^ν}, (15.92)

where ε is the reduced temperature ε = (T − TKT)/TKT, ν = 0.5, and A and b are nonuniversal constants. Compute χ from the relation (15.21) with ⟨M⟩ = 0. Assume the exponential form (15.92) for χ in the range T = 1 to T = 1.2 with ν = 0.7 and find the best values of TKT, A, and b. (Although theory predicts ν = 0.5, simulations for small systems indicate that ν = 0.7 gives a better fit.) One way to determine TKT, A, and b is to assume a value of TKT and then do a least squares fit of ln χ to determine A and b. Choose the set of parameters that minimizes the variance of ln χ. How does your estimated value of TKT compare with the temperature at which free vortices first appear? At what temperature does the specific heat have a peak? The Kosterlitz–Thouless theory predicts that the specific heat peak does not occur at TKT. This prediction has been confirmed by simulations (see Tobochnik and Chester). To obtain quantitative results, you will need lattices larger than 32 × 32.
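The winding test described in parts (c) and (d) can be written compactly. The following sketch, with our own names, computes the vorticity of the unit square whose lower-left corner is at site (i, j) by accumulating the wrapped angle differences counterclockwise around the square.

public class VortexCount {
  // Returns +1 for a positive vortex, -1 for a negative vortex, and 0 otherwise.
  static int vorticity(double[][] theta, int i, int j) {
    int L = theta.length;
    double[] corner = {theta[i][j], theta[(i+1)%L][j],
                       theta[(i+1)%L][(j+1)%L], theta[i][(j+1)%L]};  // counterclockwise loop
    double total = 0;
    for (int k = 0; k < 4; k++) {
      double dTheta = corner[(k+1)%4] - corner[k];
      while (dTheta > Math.PI)   dTheta -= 2*Math.PI;   // wrap the difference into (-pi, pi]
      while (dTheta <= -Math.PI) dTheta += 2*Math.PI;
      total += dTheta;
    }
    return (int) Math.round(total/(2*Math.PI));         // net rotation in units of 2*pi
  }
}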
Project 15.38. The classical Heisenberg model in two dimensions

The energy or Hamiltonian of the classical Heisenberg model is similar to those of the Ising model and the planar model, except that the spins can point in any direction in three dimensions. The energy in zero external magnetic field is

E = −J Σ_{i,j=nn(i)} si · sj = −J Σ_{i,j=nn(i)} [si,x sj,x + si,y sj,y + si,z sj,z], (15.93)

where s is a classical vector of unit length. The spins have three components, in contrast to the spins in the Ising model, which have only one component, and the spins in the planar model, which have two components. We will consider the two-dimensional Heisenberg model for which the spins are located on a two-dimensional lattice.

Early simulations and approximate theories led researchers to believe that there was a continuous phase transition, similar to that found in the Ising model. The Heisenberg model received more interest after it was related to quark confinement. Lattice models of the interaction between quarks, called lattice gauge theories, predict that the confinement of quarks could be explained if there are no phase transitions in these models. (The lack of a phase transition in these models implies that the attraction between quarks grows with distance.) The two-dimensional Heisenberg model is an analog of the four-dimensional models used to model quark-quark interactions. Shenker and Tobochnik used a combination of Monte Carlo and renormalization group methods to show that this model does not have a phase transition. Subsequent work on lattice gauge theories showed similar behavior.

(a) Modify your Ising model program to simulate the Heisenberg model in two dimensions. One way to do so is to define three arrays, one for each of the three components of the unit spin vectors. A trial Monte Carlo move consists of randomly changing the direction of a spin si. First compute a small vector ∆s = ∆smax(q1, q2, q3), where −1 ≤ qn ≤ 1 is a uniform random number, and ∆smax is the maximum change of any spin component. If |∆s| > ∆smax, compute another ∆s. This latter step is necessary to ensure that the change in a spin direction is symmetrically distributed around the current spin direction. Then let the trial spin equal si + ∆s normalized to a unit vector. The standard Metropolis algorithm can now be used to determine if the trial spin is accepted. Compute the mean energy, the specific heat, and the susceptibility as a function of T. Choose lattice sizes of L = 8, 16, 32, and larger, if possible, and average over at least 2000 Monte Carlo steps per spin at each temperature. Is there any evidence of a phase transition? Does the susceptibility appear to diverge at a nonzero temperature? Plot the logarithm of the susceptibility versus the inverse temperature and determine the temperature dependence of the susceptibility in the limit of low temperatures.

(b) Use the Lee–Kosterlitz analysis at the specific heat peak to determine if there is a phase transition.

Project 15.39. Domain growth kinetics

When the Ising model is quenched from a high temperature to a very low temperature, domains of the ordered low temperature phase typically grow with time as a power law, R ∼ t^α, where R is a measure of the average linear dimension of the domains. A simple measure of the domain size is the perimeter length of a domain, which can be computed from the energy per spin ε and is given by

R = 2/(2 + ε). (15.94)

Equation (15.94) can be motivated by the following argument. Imagine a region of N spins made up of a domain of up spins with a perimeter size R embedded in a sea of down spins. The total energy of this region is −2N + 2R (in units of J), where for each spin on the perimeter, the energy is increased by 2 because one of the neighbors of a perimeter spin will be of opposite sign. The energy per spin is ε = −2 + 2R/N. Because N is of order R², we arrive at the result given in (15.94), which is implemented in the sketch following this project.

(a) Modify your Ising model program so that the initial configuration is random, that is, a typical high temperature configuration. Write a target class to simulate a quench of the system. The input parameters should be the lattice size, the quench temperature (use 0.5 initially), the maximum time (measured in Monte Carlo steps per spin) for each quench, and the number of Monte Carlo steps between drawing the lattice. Plot ln R versus ln t after each quench is finished, where t is measured from the time of the quench.

(b) Choose L = 64 and a maximum time of 128 mcs. Averages over 10 quenches will give acceptable results. What value do you obtain for α? Repeat for other temperatures and system sizes. Does the exponent change? Run for a longer maximum time to check your results.

(c) Modify your program to simulate the q-state Potts model. Consider various values of q. Do your results change? Results for large q and large system sizes are given in Grest et al.

(d)∗ Modify your program to simulate a three-dimensional system. How should you modify (15.94)? Are your results similar?
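Relation (15.94) turns directly into a one-line helper for measuring the domain size from the mean energy per spin; the method name is ours.

public class DomainSize {
  // epsilon is the mean energy per spin in units of J (between -2 and 0 after a quench).
  static double size(double epsilon) {
    return 2.0/(2.0 + epsilon);               // R = 2/(2 + epsilon), relation (15.94)
  }
}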
Project 15.40. Ground state energy of the Ising spin glass

A spin glass is a magnetic system with frozen-in disorder. An example of such a system is the Ising model with the exchange constant Jij between nearest neighbor spins randomly chosen to be ±1. The disorder is said to be "frozen-in" because the set of interactions {Jij} does not change with time. Because the spins cannot arrange themselves so that every pair of spins is in its lowest energy state, the system exhibits frustration, similar to the antiferromagnetic Ising model on a triangular lattice (see Problem 15.22).

Is there a phase transition in the spin glass model, and if so, what is its nature? The answers to these questions are very difficult to obtain by doing simulations. One of the difficulties is that we need to average not only over the possible configurations of spins for a given set of {Jij}, but also over different realizations of the interactions. Another difficulty is that there are many local minima in the energy (free energy at finite temperature) as a function of the configurations of spins, and it is very difficult to find the global minimum. As a result, Monte Carlo simulations typically become stuck in these local minima or metastable states. Detailed finite-size scaling analyses of simulations indicate that there might be a transition in three dimensions. It is generally accepted that the transition in two dimensions is at zero temperature. In the following, we will look at some of the properties of an Ising spin glass on a square lattice at low temperatures.

(a) Write a program to apply simulated annealing to an Ising spin glass using the Metropolis algorithm with the temperature fixed at each stage of the annealing schedule (see Problem 15.31a). Search for the lowest energy configuration for a fixed set of {Jij}. Use at least one other annealing schedule for the same {Jij} and compare your results. Then find the ground state energy for at least ten other sets of {Jij}. Use lattice sizes of L = 5 and L = 10. Discuss the nature of the ground states you are able to find. Is there much variation in the ground state energy E0 from one set of {Jij} to another? Theoretical calculations give an average over realizations of E0/N ≈ −1.4. If you have sufficient computer resources, repeat your computations for the three-dimensional spin glass.

(b) Modify your program to do simulated annealing using the demon algorithm (see Problem 15.31b). How do your results compare to those that you found in part (a)?

Project 15.41. Zero temperature dynamics of the Ising model

We have seen that various kinetic growth models (Section 13.3) and reaction-diffusion models (Section 7.8) lead to interesting and nontrivial behavior. Similar behavior can be seen in the zero temperature dynamics of the Ising model. Consider the one-dimensional Ising model with J > 0 and periodic boundary conditions. The initial orientation of the spins is chosen at random. We update the configurations by choosing a spin at random and computing the change in energy ∆E. If ∆E < 0, then flip the spin; else if ∆E = 0, flip the spin with 50% probability. The spin is not flipped if ∆E > 0. This type of Monte Carlo update is known as Glauber dynamics. How does this algorithm differ from the Metropolis algorithm at T = 0?
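One zero-temperature Glauber update of the chain can be written as follows; the method and array names are ours, and we take J = 1.

import java.util.Random;

public class GlauberStep {
  static void update(int[] s, Random rng) {
    int N = s.length;
    int i = rng.nextInt(N);
    int sum = s[(i + 1) % N] + s[(i - 1 + N) % N];   // periodic boundary conditions
    int dE = 2*s[i]*sum;                             // energy change if s[i] is flipped (J = 1)
    if (dE < 0) s[i] = -s[i];                        // downhill moves are always accepted
    else if (dE == 0 && rng.nextDouble() < 0.5) s[i] = -s[i];  // flat moves half the time
    // dE > 0: the spin is never flipped at T = 0
  }
}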
(a) A quantity of interest is f(t), the fraction of spins that have not yet flipped at time t. As usual, the time is measured in terms of Monte Carlo steps per spin. Published results (Derrida et al.) for N = 10^5 indicate that f(t) behaves as

f(t) ∼ t^{−θ} (15.95)

for t ≈ 3 to t ≈ 10,000. The exact value of θ is 0.375. Verify this result and extend your results to the one-dimensional q-state Potts model. In the latter model each site is initially given a random integer between 1 and q. A site is chosen at random and set equal to either of its two neighbors with equal probability.

(b) Another interesting quantity is the probability distribution Pn(t) that n sites have not yet flipped as a function of the time t (see Das and Sen). Plot Pn versus n for two times on the same graph. Discuss the shape of the curves and their differences. Choose L ≥ 100 and t = 50 and 100. Try to fit the curves to a Gaussian distribution. Because the possible values of n are bounded, fit each side of the maximum of Pn to a Gaussian with a different width. There are a number of scaling properties that can be investigated. Show that Pn=0(t) scales approximately as t/L². Thus, if you compute Pn=0(t) for a number of different times and lengths such that t/L² has the same value, you should obtain the same value of Pn=0.

Project 15.42. The inverse power law potential

Consider the inverse power law potential

V(r) = V0 (σ/r)^n, (15.96)

with V0 > 0. One reason for the interest in potentials of this form is that thermodynamic quantities such as the mean energy E do not depend on V0 and σ separately but depend on a single dimensionless parameter, which is defined as (see Project 8.25)

Γ = (V0/kT)(σ/a)^n, (15.97)

where a is defined in three and two dimensions by 4πa³ρ/3 = 1 and πa²ρ = 1, respectively. The length a is proportional to the mean distance between particles. A Coulomb interaction corresponds to n = 1, and a hard sphere system corresponds to n → ∞. What phases do you expect to occur for arbitrary n?

(a) Compare the qualitative features of g(r) for a "soft" potential with n = 4 to those of a system of hard disks at the same density.

(b) Let n = 12 and compute the mean energy E as a function of Γ for a three-dimensional system with N = 16, 32, 64, and 128. Does E depend on N? Can you extrapolate your results for the N-dependence of E to N → ∞? Do you see any evidence of a fluid-solid phase transition? If so, estimate the value of Γ at which it occurs. What is the nature of the transition if it exists? What is the symmetry of the ground state?

(c) Let n = 4 and determine the symmetry of the ground state. For this value of n, there is a solid-to-solid phase transition at which the solid changes symmetry. To determine the value of Γ at which this phase transition exists and the symmetry of the smaller-Γ solid phase (see Dubin and Dewitt), it is necessary to use a Monte Carlo method in which the shape of the simulation cell changes to accommodate the different symmetry (the Rahman–Parrinello method), an interesting project. An alternative is to prepare a bcc lattice at Γ ≈ 105 (for example, T = 0.06 and ρ = 0.95). Then instantaneously change the potential from n = 4 to n = 12; the new value of Γ is ≈ 4180, and the new stable phase is fcc. The transition can be observed by watching the evolution of g(r).

Project 15.43. Rare gas clusters

There has been much recent interest in structures that contain many particles but that are not macroscopic. An example is the unusual structure of sixty carbon atoms known as a "buckyball." A less unusual structure is a cluster of argon atoms.
Questions of interest include the structure of the clusters, the existence of "magic" numbers of particles for which the cluster is particularly stable, the temperature dependence of various quantities, and the possibility of different phases. This latter question has been subject to some controversy because transitions between different kinds of behavior in finite systems are not well defined, as they are for infinite systems.

(a) Write a Monte Carlo program to simulate a three-dimensional system of particles interacting via the Lennard–Jones potential. Use open boundary conditions; that is, do not enclose the system in a box. The number of particles N and the temperature T should be input parameters.

(b) Find the ground state energy E0 as a function of N. For each value of N begin with a random initial configuration and accept any trial displacement that lowers the energy (a sketch of such a zero-temperature move follows this project). Repeat for at least ten different initial configurations. Plot E0/N versus N for N = 2 to 20 and describe the qualitative dependence of E0/N on N. Is there any evidence of magic numbers, that is, value(s) of N for which E0/N is a minimum? For each value of N save the final configuration. Plot the positions of the atoms. Does the cluster look like a part of a crystalline solid?

(c) Repeat part (b) using simulated annealing. The initial temperature should be sufficiently low so that the particles do not move far away from each other. Slowly lower the temperature according to some annealing schedule. Are your results for E0/N lower than those you obtained in part (b)?

(d) To gain more insight into the structure of the clusters, compute the mean number of neighbors per particle for each value of N. What is a reasonable criterion for two particles to be neighbors? Also compute the mean distance between each pair of particles. Plot both quantities as a function of N and compare their dependence on N with your plot of E0/N.

(e) Do you find any evidence for a "melting" transition? Begin with the configuration that has the minimum value of E0/N and slowly increase the temperature T. Compute the energy per particle and the mean square displacement of the particles from their initial positions. Plot your results for these quantities versus T.
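For parts (a) and (b), the following sketch shows the total energy of an open (unboxed) cluster and a zero-temperature trial move that accepts only downhill displacements; the names are ours, and the energy is recomputed in full for clarity rather than efficiency.

import java.util.Random;

public class LJCluster {
  // Total Lennard-Jones energy with sigma = epsilon = 1 and no periodic boundaries.
  static double energy(double[] x, double[] y, double[] z) {
    double e = 0;
    for (int i = 0; i < x.length; i++)
      for (int j = i + 1; j < x.length; j++) {
        double r2 = sq(x[i]-x[j]) + sq(y[i]-y[j]) + sq(z[i]-z[j]);
        double r6 = 1.0/(r2*r2*r2);           // (sigma/r)^6
        e += 4*(r6*r6 - r6);                  // Lennard-Jones pair energy
      }
    return e;
  }
  static double sq(double u) { return u*u; }

  // Accept a single-particle displacement only if it lowers the energy (zero-temperature quench).
  static void trialMove(double[] x, double[] y, double[] z, double step, Random rng) {
    int i = rng.nextInt(x.length);
    double oldE = energy(x, y, z);
    double ox = x[i], oy = y[i], oz = z[i];
    x[i] += step*(2*rng.nextDouble() - 1);
    y[i] += step*(2*rng.nextDouble() - 1);
    z[i] += step*(2*rng.nextDouble() - 1);
    if (energy(x, y, z) > oldE) { x[i] = ox; y[i] = oy; z[i] = oz; }  // reject uphill moves
  }
}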
Project 15.44. The hard disks fluid-solid transition

Although we have mentioned (see Section 15.10) that there is much evidence for a fluid-solid transition in a hard disk system, the nature of the transition is still a problem of current research. In this project we follow the work of Lee and Strandburg and apply the constant pressure Monte Carlo method (see Section 15.12) and the Lee–Kosterlitz method (see Section 15.11) to investigate the nature of the transition. Consider N = L² hard disks of diameter σ = 1 in a two-dimensional box of volume V = √3 L²v/2 with periodic boundary conditions. The quantity v ≥ 1 is the reduced volume and is related to the density ρ by ρ = N/V = 2/(√3 v); v = 1 corresponds to maximum packing. The aspect ratio of 2/√3 is used to match the perfect triangular lattice. Do a constant pressure (actually constant p* = P/kT) Monte Carlo simulation. The trial displacement of each disk is implemented as discussed in Section 15.10. Lee and Strandburg find that a maximum displacement of 0.09 gives a 45% acceptance probability. The other type of move is a random isotropic change of the volume of the system. If the change of the volume leads to an overlap of the disks, the change is rejected. Otherwise, if the trial volume Ṽ is less than the current volume V, the change is accepted. A larger trial volume is accepted with probability

e^{−p*(Ṽ − V) + N ln(Ṽ/V)}. (15.98)

(A sketch of this acceptance test follows this project.) Volume changes are attempted 40–200 times for each set of individual disk moves. The quantity of interest is N(v), the distribution of the reduced volume v. Because we need to store information about N(v) in an array, it is convenient to discretize the volume in advance and choose the mesh size so that the acceptance probability for changing the volume by one unit is 40–50%. Do a Monte Carlo simulation of the hard disk system for L = 10 (N = 100) and p* = 7.30. Published results are for 10^7 Monte Carlo steps. To apply the Lee–Kosterlitz method, smooth ln N(v) by fitting it to an eighth-order polynomial. Then extrapolate ln N(v) using the histogram method to determine p*_c(L = 10), the pressure at which the two peaks of N(v) are of equal height. What is the value of the free energy barrier ∆F? If sufficient computer resources are available, compute ∆F for larger L (published results are for L = 10, 12, 14, 16, and 20) and determine if ∆F depends on L. Can you reach any conclusions about the nature of the transition?
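The volume-change acceptance test (15.98) might be implemented as in the following sketch; the method and variable names are ours, and the overlap check is assumed to have been done separately.

import java.util.Random;

public class VolumeMove {
  // Returns true if a trial volume vTrial should replace the current volume v
  // at reduced pressure pStar = P/kT for N disks, assuming no overlaps were found.
  static boolean acceptVolume(double v, double vTrial, double pStar, int N, Random rng) {
    if (vTrial <= v) return true;             // a smaller trial volume is always accepted
    double arg = -pStar*(vTrial - v) + N*Math.log(vTrial/v);
    return rng.nextDouble() < Math.exp(arg);  // expansion accepted with probability (15.98)
  }
}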
Project 15.45. Vacancy mediated dynamics in binary alloys

When a binary alloy is rapidly quenched from a high temperature to a low temperature unstable state, a pattern of domain formation called spinodal decomposition takes place as the two metals in the alloy separate. This process is of much interest experimentally. Lifshitz and Slyozov have predicted that at long times, the linear domain size increases with time as R ∼ t^{1/3}. This result is independent of the dimension for d ≥ 2 and has been verified experimentally and in computer simulations. The behavior is modified for binary fluids due to hydrodynamic effects.

Most of the computer simulations of this growth process have been based on the Ising model with spin exchange dynamics. In this model there is an A or B atom (spin up or spin down) at each site, where A and B represent different metals. The energy of interaction between atoms on two neighboring sites is −J if the two atoms are the same type and +J if they are different. Monte Carlo moves are made by exchanging unlike atoms. (The numbers of A and B atoms must be conserved.) A typical simulation begins with an equilibrated system at high temperature. Then the temperature is changed instantaneously to a low temperature below the critical temperature Tc. If there are equal numbers of A and B atoms on the lattice, then spinodal decomposition occurs. If you watch a visualization of the evolution of the system, you will see wavy-like domains of each type of atom thickening with time.

The growth of the domains is very slow if we use spin exchange dynamics. We will see that if simulations are performed with vacancy mediated dynamics, the scaling behavior begins at much earlier times. Because of the large energy barriers that prevent real metallic atoms from exchanging position, it is likely that spinodal decomposition in real alloys also occurs with vacancy mediated dynamics. We can do a realistic simulation by including just one vacancy because the number of vacancies in a real alloy is also very small. In this case the only possible Monte Carlo move on a square lattice is to exchange the vacancy with one of its four neighboring atoms. To implement this algorithm, you will need an array to keep track of which type of atom is at each lattice site and variables to keep track of the location of the single vacancy. The simulation will run very fast because there is little bookkeeping and all the possible trial moves are potentially good ones. In contrast, in standard spin exchange dynamics, it is necessary either to waste computer time checking for unlike nearest neighbor atoms or to keep track of where they are.

The major quantity of interest is the growth of the domain size R. One way to determine R is to measure the pair correlation function C(r) = ⟨si sj⟩, where r = |ri − rj|, and si = 1 for an A atom and si = −1 for a B atom. The first zero in C(r) is a measure of the domain size. An alternative measure of the domain size is the quantity R = 2/(⟨E⟩/N + 2), where ⟨E⟩/N is the average energy per site and N is the number of sites (see Project 15.39). The quantity R is a rough measure of the length of the perimeter of a domain and is proportional to the domain size.

(a) Write a program to simulate vacancy mediated dynamics. The initial state consists of the random placement of A and B atoms (half of the sites have A atoms and half B atoms); one vacancy replaces one of the atoms. Explain why this configuration corresponds to infinite temperature. Choose a square lattice with L ≥ 50.

(b) Instantaneously quench the system by running the Metropolis algorithm at a temperature of T = Tc/2 ≈ 1.13. You should first look at the lattice after every attempted move of the vacancy to see the effect of vacancy dynamics. After you are satisfied that your program is working correctly and that you understand the algorithm, speed up the simulation by collecting data and showing the lattice only at times equal to t = 2^n, where n = 1, 2, 3 .... Measure the domain size, using either the energy or C(r), as a function of time averaged over many different initial configurations.

(c) At what time does the log R versus log t plot become linear? Do both measures of the domain size give the same results? Does the behavior change for different quench temperatures? Try 0.2Tc and 0.7Tc. A log-log plot of the domain size versus time should give the exponent 1/3.

(d) Repeat the measurements in three dimensions. Do you obtain the same exponent?

Project 15.46. Heat flow using the demon algorithm

In our applications of the demon algorithm one demon shared its energy equally with all the spins. As a result the spins all attained the same mean energy of interaction. Many interesting questions arise when the system is not spatially uniform and is in a nonequilibrium but time-independent (steady) state. Let us consider heat flow in a one-dimensional Ising model. Suppose that instead of all the sites sharing energy with one demon, each site has its own demon. We can study the flow of heat by requiring the demons at the boundary spins to satisfy different conditions than the demons at the other spins. The demon at spin 0 adds energy to the system by flipping this spin so that it is in its highest energy state, that is, in the opposite direction of spin 1. The demon at spin N − 1 removes energy from the system by flipping spin N − 1 so that it is in its lowest energy state, that is, in the same direction as spin N − 2. As a result, energy flows from site 0 to site N − 1 via the demons associated with the intermediate sites. In order that energy not build up at the "hot" end of the Ising chain, we require that spin 0 can add energy to the system only if spin N − 1 simultaneously removes energy from the system.
Because the demons at the two ends of the lattice satisfy different conditions than the other demons, we do not use periodic boundary conditions. The temperature is determined by the generalization of the relation (15.10); that is, the temperature at site i is related to the mean energy of the demon at site i. To control the temperature gradient, we can update the end spins at a different rate than the other spins. The maximum temperature gradient occurs if we update the end spins after every update of an internal spin. A smaller temperature gradient occurs if we update the end spins less frequently. The temperature gradient between any two spins can be determined from the temperature profile, the spatial dependence of the temperature. The energy flow can be determined by computing the magnitude of the energy per unit time that enters the lattice at site 0.

To implement this procedure we modify IsingDemon by converting the variables demonEnergy and demonEnergyAccumulator to arrays. We do the usual updating procedure for spins 1 through N − 2 and visit spins 0 and N − 1 at regular intervals denoted by timeToAddEnergy. The class ManyDemons can be downloaded from the ch15 directory.

(a) Write a target class that inputs the number of spins N and the initial energy of the system, outputs the number of Monte Carlo steps per spin and the energy added to the system at the high temperature boundary, and plots the temperature as a function of position.

(b) As a check on ManyDemons, modify the class so that all the demons are equivalent; that is, impose periodic boundary conditions and do not use method boundarySpins. Compute the mean energy of the demon at each site and use (15.10) to define a local site temperature. Use N ≥ 52 and run for about 10,000 mcs. Is the local temperature approximately uniform? How do your results compare with the single demon case?

(c) In ManyDemons the energy is added to the system at site 0 and is removed at site N − 1. Determine the mean demon energy for each site and obtain the corresponding local temperature and the mean energy of the system. Draw the temperature profile by plotting the temperature as a function of site number. The temperature gradient is the difference in temperature from site N − 2 to site 1 divided by the distance between them. (The distance between neighboring sites is unity.) Because of local temperature fluctuations and edge effects, the temperature gradient should be estimated by fitting the temperature profile in the middle of the lattice to a straight line. Reasonable choices for the parameters are N = 52 and timeToAddEnergy = 1. Run for at least 10,000 mcs.

(d) The heat flux Q is the energy flow per unit length per unit time. The energy flow is the amount of energy that demon 0 adds to the system at site 0. The time is conveniently measured in terms of Monte Carlo steps per spin. Determine Q for the parameters used in part (c).

(e) If the temperature gradient ∂T/∂x is not too large, the heat flux Q is proportional to ∂T/∂x. We can determine the thermal conductivity κ by the relation

Q = −κ ∂T/∂x. (15.99)

Use your results for ∂T/∂x and Q to estimate κ.

(f) Determine Q, the temperature profile, and the mean temperature for different values of timeToAddEnergy. Is the temperature profile linear for all values of timeToAddEnergy? If the temperature profile is linear, estimate ∂T/∂x and determine κ. Does κ depend on the mean temperature?
Note that by using many demons we were able to compute a temperature profile by using an algorithm that manipulates only integer numbers. The conventional approach is to solve a heat equation similar in form to the diffusion equation. Now we use the same idea to compute the magnetization profile when the end spins of the lattice are fixed.

(g) Modify ManyDemons so that it does not call the method boundarySpins. Also, constrain spins 0 and N − 1 to be +1 and −1, respectively. Estimate the magnetization profile by plotting the mean value of the spin at each site versus the site number. Choose N = 22 and mcs ≥ 1000. How do your results vary as you increase N?

(h) Compute the mean demon energy and, hence, the local temperature at each site. Does the system have a uniform temperature even though the magnetization is not uniform? Is the system in thermal equilibrium?

(i) The effect of the constraint on the end spins is easier to observe in two and three dimensions than in one dimension. Write a program for a two-dimensional Ising model on an L × L square lattice. Constrain the spins at site (i, j) to be +1 and −1 for i = 0 and i = L − 1, respectively. Use periodic boundary conditions in the y direction. How do your results compare with the one-dimensional case?

(j) Remove the periodic boundary condition in the y direction and constrain all the boundary spins from i = 0 to (L/2) − 1 to be +1 and the other boundary spins to be −1. Choose an initial configuration where all the spins on the left half of the system are +1 and the others are −1. Do the simulation and draw a configuration of the spins once the system has reached equilibrium. Draw a line between each pair of spins of opposite sign. Describe the curve separating the +1 spins from the −1 spins. Begin with L = 20 and determine what happens as L is increased.

Appendix 15A: Relation of the Mean Demon Energy to the Temperature

We know that the energy of the demon Ed is constrained to be nonnegative and that the probability for the demon to have energy Ed is proportional to $e^{-E_d/kT}$. Hence, in general, $\langle E_d \rangle$ is given by

\[ \langle E_d \rangle = \frac{\sum_{E_d} E_d\,e^{-E_d/kT}}{\sum_{E_d} e^{-E_d/kT}}, \tag{15.100} \]

where the summations in (15.100) are over the possible values of Ed. If an Ising spin is flipped in zero magnetic field, the minimum nonzero decrease in energy of the system is 4J (see Figure 15.11). Hence, the possible energies of the demon are 0, 4J, 8J, 12J, .... We write x = 4J/kT and perform the summations in (15.100). The result is

\[ \langle E_d \rangle/kT = \frac{0 + x e^{-x} + 2x e^{-2x} + \cdots}{1 + e^{-x} + e^{-2x} + \cdots} = \frac{x}{e^x - 1}. \tag{15.101} \]

The form (15.10) can be obtained by solving (15.101) for T in terms of $\langle E_d \rangle$. Convince yourself that the relation (15.101) is independent of dimension for lattices with an even number of nearest neighbors.

Figure 15.11: The five possible transitions of the Ising model on the square lattice with spin flip dynamics, with ∆E = −8J, −4J, 0, 4J, and 8J.

If the magnetic field is nonzero, the possible values of the demon energy are 0, 2H, 4J − 2H, 4J + 2H, .... If J is a multiple of H, then the result is the same as before with 4J replaced by 2H, because the possible energy values for the demon are multiples of 2H. If the ratio 4J/2H is irrational, then the demon can take on a continuum of values, and thus $\langle E_d \rangle = kT$. The other possibility is that 4J/2H = m/n, where m and n are positive integers with no common factors (other than 1). In this case it can be shown that (see Mak)

\[ kT/J = \frac{4/m}{\ln\!\left(1 + 4J/(m\langle E_d \rangle)\right)}. \tag{15.102} \]

Surprisingly, (15.102) does not depend on n. Test these relations for H ≠ 0 by choosing values of J and H and computing the sums in (15.100) directly.
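The suggested test is easy to carry out. The following minimal sketch evaluates the sums in (15.100) directly for H = 0 and compares the result with the closed form (15.101); extending the loop over the energies 0, 2H, 4J − 2H, 4J + 2H, ... tests the H ≠ 0 relations in the same way (k = 1 here):

public class DemonEnergyCheck {
  public static void main(String[] args) {
    double J = 1.0, T = 2.0;
    double num = 0, den = 0;
    for(int n = 0; n<1000; n++) {      // demon energies 0, 4J, 8J, ...
      double E = 4.0*J*n;
      double w = Math.exp(-E/T);
      num += E*w;
      den += w;
    }
    double direct = num/den;           // the sums in (15.100)
    double x = 4.0*J/T;
    double closedForm = T*x/(Math.exp(x)-1.0); // (15.101)
    System.out.println("direct = "+direct+", closed form = "+closedForm);
  }
}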
Appendix 15B: Fluctuations in the Canonical Ensemble

We first obtain the relation of the constant volume heat capacity CV to the energy fluctuations in the canonical ensemble. We write CV as

\[ C_V = \frac{\partial \langle E \rangle}{\partial T} = -\frac{1}{kT^2}\frac{\partial \langle E \rangle}{\partial \beta}. \tag{15.103} \]

From (15.11) we have

\[ \langle E \rangle = -\frac{\partial}{\partial \beta}\ln Z, \tag{15.104} \]

and

\[ \frac{\partial \langle E \rangle}{\partial \beta} = -\frac{1}{Z^2}\frac{\partial Z}{\partial \beta}\sum_s E_s e^{-\beta E_s} - \frac{1}{Z}\sum_s E_s^2 e^{-\beta E_s} \tag{15.105} \]
\[ = \langle E \rangle^2 - \langle E^2 \rangle. \tag{15.106} \]

The relation (15.19) follows from (15.103) and (15.106). Note that the heat capacity is at constant volume because the partial derivatives were performed with the energy levels Es kept constant. The corresponding quantity for a magnetic system is the heat capacity at constant external magnetic field.

The relation of the magnetic susceptibility χ to the fluctuations of the magnetization M can be obtained in a similar way. We assume that the energy can be written as

\[ E_s = E_{0,s} - H M_s, \tag{15.107} \]

where E0,s is the energy of interaction of the spins in the absence of a magnetic field, H is the external applied field, and Ms is the magnetization in the s state. The mean magnetization is given by

\[ \langle M \rangle = \frac{1}{Z}\sum_s M_s e^{-\beta E_s}. \tag{15.108} \]

Because ∂Es/∂H = −Ms, we have

\[ \frac{\partial Z}{\partial H} = \sum_s \beta M_s e^{-\beta E_s}. \tag{15.109} \]

Hence, we obtain

\[ \langle M \rangle = \frac{1}{\beta}\frac{\partial}{\partial H}\ln Z. \tag{15.110} \]

If we use (15.108) and (15.110), we find

\[ \frac{\partial \langle M \rangle}{\partial H} = -\frac{1}{Z^2}\frac{\partial Z}{\partial H}\sum_s M_s e^{-\beta E_s} + \frac{1}{Z}\sum_s \beta M_s^2 e^{-\beta E_s} \tag{15.111} \]
\[ = -\beta\langle M \rangle^2 + \beta\langle M^2 \rangle. \tag{15.112} \]

The relation (15.21) for the zero-field susceptibility follows from (15.112) and the definition (15.20).

Appendix 15C: Exact Enumeration of the 2 × 2 Ising Model

Because the number of possible states or configurations of the Ising model increases as $2^N$, we can enumerate the possible configurations only for small N. As an example, we calculate the various quantities of interest for a 2 × 2 Ising model on the square lattice with periodic boundary conditions. In Table 15.2 we group the sixteen states according to their total energy and magnetization.

  # Spins Up   g(E,M)   Energy   Magnetization
      4           1       -8           4
      3           4        0           2
      2           4        0           0
      2           2        8           0
      1           4        0          -2
      0           1       -8          -4

Table 15.2: The energy and magnetization of the $2^4$ states of the zero-field Ising model on the 2 × 2 square lattice. The energy is in units of J, and g(E,M) is the number of microstates with the same energy and magnetization.

We can compute all the quantities of interest using Table 15.2. The partition function is given by

\[ Z = 2e^{8\beta J} + 12 + 2e^{-8\beta J}. \tag{15.113} \]

If we use (15.104) and (15.113), we find

\[ \langle E \rangle = -\frac{\partial}{\partial \beta}\ln Z = -\frac{1}{Z}\left[2(8)e^{8\beta J} + 2(-8)e^{-8\beta J}\right]. \tag{15.114} \]

Because the other quantities of interest can be found in a similar manner, we only give the results:

\[ \langle E^2 \rangle = \frac{1}{Z}\left[(2\times 64)e^{8\beta J} + (2\times 64)e^{-8\beta J}\right] \tag{15.115} \]
\[ \langle M \rangle = \frac{1}{Z}(0) = 0 \tag{15.116} \]
\[ \langle |M| \rangle = \frac{1}{Z}\left[(2\times 4)e^{8\beta J} + 8\times 2\right] \tag{15.117} \]
\[ \langle M^2 \rangle = \frac{1}{Z}\left[(2\times 16)e^{8\beta J} + 8\times 4\right]. \tag{15.118} \]

The dependence of C and χ on βJ can be found by using (15.114) and (15.115), and (15.116) and (15.118), respectively.
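The entries in Table 15.2 and the thermodynamic averages can be checked by direct enumeration. The following sketch (with J = k = 1) sums over the $2^4$ configurations, computes Z and ⟨E⟩, and obtains the heat capacity from the fluctuation relation (15.19):

public class TwoByTwoIsing {
  public static void main(String[] args) {
    double beta = 0.4;
    double Z = 0, Esum = 0, E2sum = 0;
    for(int config = 0; config<16; config++) {      // the 2^4 states
      int[] s = new int[4];                          // sites 0,1 in row 0 and 2,3 in row 1
      for(int i = 0; i<4; i++) {
        s[i] = (((config>>i)&1)==1) ? 1 : -1;
      }
      // with periodic boundary conditions every bond on the 2x2 lattice is doubled
      int E = -2*(s[0]*s[1]+s[2]*s[3]+s[0]*s[2]+s[1]*s[3]);
      double w = Math.exp(-beta*E);
      Z += w;
      Esum += E*w;
      E2sum += (double) E*E*w;
    }
    double meanE = Esum/Z, meanE2 = E2sum/Z;
    double C = beta*beta*(meanE2-meanE*meanE);      // heat capacity from (15.19)
    System.out.println("Z = "+Z+"  (compare with (15.113): "
      +(2*Math.exp(8*beta)+12+2*Math.exp(-8*beta))+")");
    System.out.println("<E> = "+meanE+", C = "+C);
  }
}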
References and Suggestions for Further Reading

M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids (Clarendon Press, 1987). See Chapter 4 for a discussion of Monte Carlo methods.

Paul D. Beale, “Exact distribution of energies in the two-dimensional Ising model,” Phys. Rev. Lett. 76, 78 (1996). The author discusses a Mathematica program that can compute the exact density of states for the two-dimensional Ising model.

K. Binder, editor, Monte Carlo Methods in Statistical Physics, 2nd ed. (Springer–Verlag, 1986). Also see K. Binder, editor, Applications of the Monte Carlo Method in Statistical Physics (Springer–Verlag, 1984) and K. Binder, editor, The Monte Carlo Method in Condensed Matter Physics (Springer–Verlag, 1992). The latter book discusses the Binder cumulant method in the introductory chapter.

Marvin Bishop and C. Bruin, “The pair correlation function: A probe of molecular order,” Am. J. Phys. 52, 1106–1108 (1984). The authors compute the pair correlation function for a two-dimensional Lennard–Jones model.

A. B. Bortz, M. H. Kalos, and J. L. Lebowitz, “A new algorithm for Monte Carlo simulation of Ising spin systems,” J. Comput. Phys. 17, 10–18 (1975). This paper first introduced the n-fold way algorithm, which was rediscovered independently by many workers in the 1970s and 80s.

S. G. Brush, “History of the Lenz–Ising model,” Rev. Mod. Phys. 39, 883–893 (1967).

James B. Cole, “The statistical mechanics of image recovery and pattern recognition,” Am. J. Phys. 59, 839–842 (1991). A discussion of the application of simulated annealing to the recovery of images from noisy data.

R. Cordery, S. Sarker, and J. Tobochnik, “Physics of the dynamical critical exponent in one dimension,” Phys. Rev. B 24, 5402–5403 (1981).

Michael Creutz, “Microcanonical Monte Carlo simulation,” Phys. Rev. Lett. 50, 1411 (1983). See also Gyan Bhanot, Michael Creutz, and Herbert Neuberger, “Microcanonical simulation of Ising systems,” Nuc. Phys. B 235, 417–434 (1984).

Pratap Kumar Das and Parongama Sen, “Probability distributions of persistent spins in an Ising chain,” J. Phys. A 37, 7179–7184 (2004).

B. Derrida, A. J. Bray, and C. Godrèche, “Non-trivial exponents in the zero temperature dynamics of the 1D Ising and Potts models,” J. Phys. A 27, L357–L361 (1994); B. Derrida, V. Hakim, and V. Pasquier, “Exact first passage exponents in 1d domain growth: Relation to a reaction-diffusion model,” Phys. Rev. Lett. 75, 751 (1995).

Daniel H. E. Dubin and Hugh Dewitt, “Polymorphic phase transition for inverse-power-potential crystals keeping the first-order anharmonic correction to the free energy,” Phys. Rev. B 49, 3043–3048 (1994).

Jerome J. Erpenbeck and Marshall Luban, “Equation of state for the classical hard-disk fluid,” Phys. Rev. A 32, 2920–2922 (1985). These workers use a combined molecular dynamics/Monte Carlo method and consider 1512 and 5822 disks.

Alan M. Ferrenberg, D. P. Landau, and Y. Joanna Wong, “Monte Carlo simulations: Hidden errors from ‘good’ random number generators,” Phys. Rev. Lett. 69, 3382 (1992).

Alan M. Ferrenberg and Robert H. Swendsen, “New Monte Carlo technique for studying phase transitions,” Phys. Rev. Lett. 61, 2635 (1988); “Optimized Monte Carlo data analysis,” Phys. Rev. Lett. 63, 1195 (1989); “Optimized Monte Carlo data analysis,” Computers in Physics 3 (5), 101 (1989). The second and third papers discuss using the multiple histogram method with data from simulations at more than one temperature.

P. Fratzl and O. Penrose, “Kinetics of spinodal decomposition in the Ising model with vacancy diffusion,” Phys. Rev. B 50, 3477–3480 (1994).

Daan Frenkel and Berend Smit, Understanding Molecular Simulation, 2nd ed. (Academic Press, 2002).

Harvey Gould and W. Klein, “Spinodal effects in systems with long-range interactions,” Physica D 66, 61–70 (1993). This paper discusses nucleation in the Ising model and Lennard–Jones systems.
Harvey Gould and Jan Tobochnik, “Overcoming critical slowing down,” Computers in Physics 3 (4), 82 (1989).

James E. Gubernatis, The Monte Carlo Method in the Physical Sciences (AIP Press, 2004). June 2003 was the 50th anniversary of the Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller publication of what is now called the Metropolis algorithm. This algorithm established the Monte Carlo method in physics and other fields and led to the development of other Monte Carlo algorithms. Six of the papers in the proceedings of the conference give historical perspectives.

Hong Guo, Martin Zuckermann, R. Harris, and Martin Grant, “A fast algorithm for simulated annealing,” Physica Scripta T38, 40–44 (1991).

Gary S. Grest, Michael P. Anderson, and David J. Srolovitz, “Domain-growth kinetics for the Q-state Potts model in two and three dimensions,” Phys. Rev. B 38, 4752–4760 (1988).

R. Harris, “Demons at work,” Computers in Physics 4 (3), 314 (1990).

S. Istrail, “Statistical mechanics, three-dimensionality and NP-completeness: I. Universality of intractability of the partition functions of the Ising model across non-planar lattices,” Proceedings of the 32nd ACM Symposium on the Theory of Computing, ACM Press, pp. 87–96, Portland, Oregon, May 21–23, 2000. This paper shows that it is impossible to obtain an analytic solution for the three-dimensional Ising model.

J. Kertész, J. Cserti, and J. Szép, “Monte Carlo simulation programs for microcomputer,” Eur. J. Phys. 6, 232–237 (1985).

S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science 220, 671–680 (1983). See also S. Kirkpatrick and G. Toulouse, “Configuration space analysis of traveling salesman problems,” J. Physique 46, 1277–1292 (1985).

J. M. Kosterlitz and D. J. Thouless, “Ordering, metastability and phase transitions in two-dimensional systems,” J. Phys. C 6, 1181–1203 (1973); J. M. Kosterlitz, “The critical properties of the two-dimensional xy model,” J. Phys. C 7, 1046–1060 (1974).

D. P. Landau, Shan-Ho Tsai, and M. Exler, “A new approach to Monte Carlo simulations in statistical physics: Wang–Landau sampling,” Am. J. Phys. 72, 1294–1302 (2004).

D. P. Landau, “Finite-size behavior of the Ising square lattice,” Phys. Rev. B 13, 2997–3011 (1976). A clearly written paper on a finite-size scaling analysis of Monte Carlo data. See also D. P. Landau, “Finite-size behavior of the simple-cubic Ising lattice,” Phys. Rev. B 14, 255–262 (1976).

D. P. Landau and R. Alben, “Monte Carlo calculations as an aid in teaching statistical mechanics,” Am. J. Phys. 41, 394–400 (1973).

David Landau and Kurt Binder, A Guide to Monte Carlo Simulations in Statistical Physics, 2nd ed. (Cambridge University Press, 2005).

Jooyoung Lee and J. M. Kosterlitz, “New numerical method to study phase transitions,” Phys. Rev. Lett. 65, 137 (1990); ibid., “Finite-size scaling and Monte Carlo simulations of first-order phase transitions,” Phys. Rev. B 43, 3265–3277 (1991).

Jooyoung Lee and Katherine J. Strandburg, “First-order melting transition of the hard-disk system,” Phys. Rev. B 46, 11190–11193 (1992).

Jiwen Liu and Erik Luijten, “Rejection-free geometric cluster algorithm for complex fluids,” Phys. Rev. Lett. 92, 035504 (2004) and ibid., Phys. Rev. E 71, 066701-1–12 (2005).

J. Machta, Y. S. Choi, A. Lucke, T. Schweizer, and L. Chayes, “Invaded cluster algorithm for Potts models,” Phys. Rev. E 54, 1332–1345 (1996).
S. S. Mak, “The analytical demon of the Ising model,” Phys. Lett. A 196, 318 (1995).

J. Marro and R. Toral, “Microscopic observations on a kinetic Ising model,” Am. J. Phys. 54, 1114–1121 (1986).

N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, “Equation of state calculations by fast computing machines,” J. Chem. Phys. 21, 1087–1092 (1953).

A. Alan Middleton, “Improved extremal optimization for the Ising spin glass,” Phys. Rev. E 69, 055701-1–4 (2004). The extremal optimization algorithm, which was inspired by the Bak–Sneppen algorithm for evolution (see Problem 14.12), preferentially flips spins that are “unfit.” The adaptive algorithm proposed in this paper is an example of a heuristic that finds exact ground states efficiently for systems with frozen-in disorder.

M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Oxford University Press, 1999).

M. A. Novotny, “A new approach to an old algorithm for the simulation of Ising-like systems,” Computers in Physics 9 (1), 46 (1995). The n-fold way algorithm is discussed. Also, see M. A. Novotny, “A tutorial on advanced dynamic Monte Carlo methods for systems with discrete state spaces,” in Annual Reviews of Computational Physics IX, edited by Dietrich Stauffer (World Scientific, 2001), pp. 153–210.

Ole G. Mouritsen, Computer Studies of Phase Transitions and Critical Phenomena (Springer–Verlag, 1984).

E. P. Münger and M. A. Novotny, “Reweighting in Monte Carlo and Monte Carlo renormalization-group studies,” Phys. Rev. B 43, 5773–5783 (1991). The authors discuss the histogram method and combine it with renormalization group calculations.

Michael Plischke and Birger Bergersen, Equilibrium Statistical Physics, 3rd ed. (Prentice Hall, 2005). A graduate level text that discusses some contemporary topics in statistical physics, many of which have been influenced by computer simulations.

William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, 2nd ed. (Cambridge University Press, 1992). A Fortran program for the traveling salesman problem is given in Section 10.9.

Stephen H. Shenker and Jan Tobochnik, “Monte Carlo renormalization-group analysis of the classical Heisenberg model in two dimensions,” Phys. Rev. B 22, 4462–4472 (1980).

Amihai Silverman and Joan Adler, “Animated simulated annealing,” Computers in Physics 6, 277 (1992). The authors describe a simulation of the annealing process to obtain a defect-free single crystal of a model material.

H. Eugene Stanley, Introduction to Phase Transitions and Critical Phenomena (Oxford University Press, 1971). See Appendix B for the exact solution of the zero-field Ising model for a two-dimensional lattice.

Jan Tobochnik and G. V. Chester, “Monte Carlo study of the planar model,” Phys. Rev. B 20, 3761–3769 (1979).

Jan Tobochnik, Harvey Gould, and Jon Machta, “Understanding the temperature and the chemical potential through computer simulations,” Am. J. Phys. 73 (8), 708–716 (2005). This paper extends the demon algorithm to compute the chemical potential.

Simon Trebst, David A. Huse, and Matthias Troyer, “Optimizing the ensemble for equilibration in broad-histogram Monte Carlo simulations,” Phys. Rev. E 70, 046701-1–5 (2004). The adaptive algorithm presented in this paper overcomes critical slowing down, improves on the Wang–Landau algorithm, and is another example of the flexibility of Monte Carlo algorithms.

I. Vattulainen, T. Ala–Nissila, and K. Kankaala, “Physical tests for random numbers in simulations,” Phys. Rev. Lett. 73, 2513 (1994).
B. Widom, “Some topics in the theory of fluids,” J. Chem. Phys. 39, 2808–2812 (1963). This paper discusses the insertion method for calculating the chemical potential.

Chapter 16
Quantum Systems

We discuss numerical solutions of the time-independent and time-dependent Schrödinger equation and describe several Monte Carlo methods for estimating the ground state of quantum systems.

16.1 Introduction

So far we have simulated the microscopic behavior of physical systems using Monte Carlo methods and molecular dynamics. In the latter method, the classical trajectory (the position and momentum) of each particle is calculated as a function of time. However, in quantum systems the position and momentum of a particle cannot be specified simultaneously. Because the description of microscopic particles is intrinsically quantum mechanical, we cannot directly simulate their trajectories on a computer (see Feynman).

Quantum mechanics does allow us to analyze probabilities, although there are difficulties associated with such an analysis. Consider a simple probabilistic system described by the one-dimensional diffusion equation (see Section 7.2)

\[ \frac{\partial P(x,t)}{\partial t} = D\frac{\partial^2 P(x,t)}{\partial x^2}, \tag{16.1} \]

where P(x,t) is the probability density of a particle being at position x at time t. One way to convert (16.1) to a difference equation and obtain a numerical solution for P(x,t) is to make x and t discrete variables. Suppose we choose a mesh size for x such that the probability is given at p values of x. If we choose p to be of order $10^3$, a straightforward calculation of P(x,t) would require approximately $10^3$ data points for each value of t. In contrast, the corresponding calculation of the dynamics of a single particle based on Newton's second law would require one data point.

The limitations of the direct computational approach become even more apparent if there are many degrees of freedom. For example, for N particles in one dimension, we would have to calculate the probability P(x1, x2, ..., xN, t), where xi is the position of particle i. Because we need to choose a mesh of p points for each xi, we need to specify $p^N$ values at each time t. For the same level of precision, p will be proportional to the length of the system (for particles confined to one dimension). Consequently, the calculation time and memory requirements grow exponentially with the number of particles. For example, for 10 particles on a mesh of 100 points, we would need to store $100^{10} = 10^{20}$ numbers to represent P, which is already much more than any computer today can store. In two and three dimensions the growth is even faster.

Although the direct computational approach is limited to systems with only a few degrees of freedom, the simplicity of this approach will aid our understanding of the behavior of quantum systems. After a summary of the general features of quantum mechanical systems in Section 16.2, we consider this approach to solving the time-independent Schrödinger equation in Sections 16.3 and 16.4. In Section 16.5, we use a half-step algorithm to generate wave packet solutions to the time-dependent Schrödinger equation. Because we have already learned that the diffusion equation (16.1) can be formulated as a random walk problem, it might not surprise you that Schrödinger's equation can be analyzed in a similar way. Monte Carlo methods are introduced in Section 16.7 to obtain variational solutions of the ground state.
We introduce quantum Monte Carlo methods in Section 16.8 and discuss more sophisticated quantum Monte Carlo methods in Sections 16.9 and 16.10.

16.2 Review of Quantum Theory

For simplicity, we consider a one-dimensional, nonrelativistic quantum system consisting of one particle. The state of the system is completely characterized by the position space wave function Ψ(x,t), which is interpreted as a probability amplitude. The probability P(x,t)∆x of the particle being in a “volume” element ∆x centered about the position x at time t is equal to

\[ P(x,t)\,\Delta x = |\Psi(x,t)|^2\,\Delta x, \tag{16.2} \]

where $|\Psi(x,t)|^2 = \Psi(x,t)\Psi^*(x,t)$, and $\Psi^*(x,t)$ is the complex conjugate of Ψ(x,t). This interpretation of Ψ(x,t) requires the use of normalized wave functions such that

\[ \int_{-\infty}^{\infty}\Psi^*(x,t)\Psi(x,t)\,dx = 1. \tag{16.3} \]

If the particle is subjected to the influence of a potential energy function V(x,t), the evolution of Ψ(x,t) is given by the time-dependent Schrödinger equation

\[ i\hbar\frac{\partial\Psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t), \tag{16.4} \]

where m is the mass of the particle, and $\hbar$ is Planck's constant divided by 2π.

Physically measurable quantities, such as the momentum, have corresponding operators. The expectation or average value of an observable A is given by

\[ \langle A \rangle = \int\Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx, \tag{16.5} \]

where $\hat{A}$ is the operator corresponding to the measurable quantity A. For example, the momentum operator corresponding to the linear momentum p is $\hat{p} = -i\hbar\,\partial/\partial x$ in position space.

If the potential energy function is independent of time, we can obtain solutions of (16.4) of the form

\[ \Psi(x,t) = \phi(x)e^{-iEt/\hbar}. \tag{16.6} \]

A particle in the state (16.6) has a well-defined energy E. If we substitute (16.6) into (16.4), we obtain the time-independent Schrödinger equation

\[ -\frac{\hbar^2}{2m}\frac{d^2\phi(x)}{dx^2} + V(x)\phi(x) = E\,\phi(x). \tag{16.7} \]

Note that φ(x) is an eigenstate of the Hamiltonian operator

\[ \hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x) \tag{16.8} \]

with the eigenvalue E. That is,

\[ \hat{H}\phi(x) = E\,\phi(x). \tag{16.9} \]

In general, there are many eigenstates φn, each with eigenvalue En, that satisfy (16.9) and the boundary conditions imposed on the eigenstates by physical considerations.

The general form of Ψ(x,t) can be expressed as a superposition of the eigenstates of the operator corresponding to any physical observable. For example, if $\hat{H}$ is independent of time, we can write

\[ \Psi(x,t) = \sum_n c_n\,\phi_n(x)e^{-iE_n t/\hbar}, \tag{16.10} \]

where Σ represents a sum over the discrete states and an integral over the continuum states. The coefficients cn in (16.10) can be determined from the value of Ψ(x,t) at any time t. For example, if we know Ψ(x, t = 0), we can use the orthonormality property of the eigenstates of any physical operator to obtain

\[ c_n = \int\phi_n^*(x)\Psi(x,0)\,dx. \tag{16.11} \]

The coefficient cn can be interpreted as the probability amplitude of a measurement of the total energy yielding a particular value En.

There are three steps needed to solve (16.7) numerically. The first is to integrate (16.7) for any given value of the energy E in a way similar to the approach we have used for numerically solving other ordinary differential equations. This approach will usually not satisfy the boundary conditions. The second step is to find the particular values of E that lead to solutions that satisfy the boundary conditions. Finally, we need to normalize the eigenstate wave function using (16.3) so that we can interpret the eigenstate as a probability amplitude.
We first discuss the solution of (16.7) without imposing any boundary conditions by treating the solution to (16.7) as an initial value problem for the wave function and its derivative at some value of x for a given value of E. We will use these solutions to develop our intuition about the behavior of one-dimensional solutions to the Schrödinger equation. To use an ODE solver, we express the rate of change of the wave function in terms of the independent variable x:

\[ \frac{d\phi}{dx} = \phi' \tag{16.12a} \]
\[ \frac{d\phi'}{dx} = -\frac{2m}{\hbar^2}[E - V(x)]\phi \tag{16.12b} \]
\[ \frac{dx}{dx} = 1. \tag{16.12c} \]

Because the time-independent Schrödinger equation is a second-order differential equation, two initial conditions must be specified to obtain a solution. For simplicity, we first assume that the wave function is zero at the starting point, xmin, and the derivative is nonzero. We also assume that the range of values of x is finite and divide this range into intervals of width ∆x. We initially consider potential energy functions V(x) such that V(x) = 0 for x < 0; V(x) changes abruptly at x = 0 to V0, the value of the stepHeight parameter. An implementation of the numerical solution of (16.12) is shown in Listing 16.1.

Listing 16.1: The Schroedinger class models the one-dimensional time-independent Schrödinger equation.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.numerics.*;

public class Schroedinger implements ODE {
  double energy = 0;
  double[] phi;
  double[] x;
  double xmin, xmax;                   // range of values of x
  double[] state = new double[3];      // state = phi, dphi/dx, x
  ODESolver solver = new RK45MultiStep(this);
  double stepHeight = 0;
  int numberOfPoints;

  public void initialize() {
    phi = new double[numberOfPoints];
    x = new double[numberOfPoints];
    double dx = (xmax-xmin)/(numberOfPoints-1);
    solver.setStepSize(dx);
  }

  void solve() {
    state[0] = 0;                      // initial phi
    state[1] = 1.0;                    // arbitrary nonzero initial dphi/dx
    state[2] = xmin;                   // starting value of x
    for(int i = 0; i<numberOfPoints; i++) {
      phi[i] = state[0];
      x[i] = state[2];
      solver.step();
      if(Math.abs(state[0])>1.0e9) {   // checks for diverging solution
        break;                         // leave the loop
      }
    }
  }

  public double[] getState() {
    return state;
  }

  public void getRate(double[] state, double[] rate) {
    rate[0] = state[1];
    rate[1] = 2.0*(-energy+evaluatePotential(state[2]))*state[0];
    rate[2] = 1.0;
  }

  public double evaluatePotential(double x) { // potential is nonzero for x > 0
    if(x<0) {
      return 0;
    } else {
      return stepHeight;
    }
  }
}

The solve method initializes the wave function and position arrays and sets the initial value of dφ/dx to an arbitrary nonzero value of unity. A loop is then used to compute values of φ until the solution diverges or until x ≥ xmax. SchroedingerApp in Listing 16.2 produces a graphical view of φ(x). We will use this program in Problem 16.1 to study the behavior of the solution as we vary the height of the potential step.

Listing 16.2: SchroedingerApp solves the one-dimensional time-independent Schrödinger equation for a given energy.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class SchroedingerApp extends AbstractCalculation {
  PlotFrame frame = new PlotFrame("x", "phi", "Wave function");
  Schroedinger schroedinger = new Schroedinger();

  public SchroedingerApp() {
    frame.setConnected(0, true);
    frame.setMarkerShape(0, Dataset.NO_MARKER);
  }

  public void calculate() {
    schroedinger.xmin = control.getDouble("xmin");
    schroedinger.xmax = control.getDouble("xmax");
    schroedinger.stepHeight = control.getDouble("step height at x = 0");
    schroedinger.numberOfPoints = control.getInt("number of points");
    schroedinger.energy = control.getDouble("energy");
    schroedinger.initialize();
    schroedinger.solve();
    frame.append(0, schroedinger.x, schroedinger.phi);
  }

  public void reset() {
    control.setValue("xmin", -5);
    control.setValue("xmax", 5);
    control.setValue("step height at x = 0", 1);
    control.setValue("number of points", 500);
    control.setValue("energy", 1);
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new SchroedingerApp(), args);
  }
}
Problem 16.1. Numerical solution of the time-independent Schrödinger equation

(a) Sketch your guess for φ(x) for a potential step height of V0 = 3 and energies E = 1, 2, 3, 4, and 5.

(b) Choose xmin = −10 and xmax = 10, and run SchroedingerApp with the parameters given in part (a). How well do your predictions match the numerical solution? Is there any discontinuity in φ or in the derivative dφ/dx at x = 0? Describe the wave function for both x < 0 and x > 0. Why does the wave function have a larger oscillatory amplitude when x > 0 than when x < 0 if the energy is greater than the potential step height?

(c) Describe the behavior of the wave function as the energy approaches the potential step height. Consider E in the range 2.5 to 3.5 in steps of 0.1.

(d) Repeat part (b) with the initial condition φ = 1 and dφ/dx = 0. Describe the differences, if any, in φ(x).

Problem 16.1 demonstrates that the nature of the solution of (16.7) changes dramatically depending on the relative values of the energy E and the potential energy. If E is greater than V0, the wave function is oscillatory whereas, if E is less than or equal to V0, the wave function grows exponentially. The differential equation solver may fail if the difference between the potential energy and E is too large. There is also an exponentially decaying solution in the region where E < V0, but this solution is difficult to detect.

Problem 16.2. Analytic solutions of the time-independent Schrödinger equation

(a) Find the analytic solution to (16.7) for the step potential for the cases E > V0, E < V0, and E = V0. We will use units such that $m = \hbar = 1$ in all the problems in this chapter.

(b) Run SchroedingerApp for the three cases to obtain the numerical solution of (16.7). When the numerical solution shows spatial oscillations in a region of space, estimate the wavelength of the oscillations and compare your numerical solution to the analytic results. When the numerical solution shows exponential decay as a function of position, estimate the decay rate and compare your numerical solution with the analytic solution.

The solutions that we have obtained so far do not satisfy any condition other than that they solve (16.12). We have plotted only a portion of the wave function, and the solutions can be extended by increasing the number of points and the range of x over which the computation is performed. Physically, these solutions are unrealistic because they cannot be normalized over all of space. The normalization problem can be solved by using a linear combination of energy eigenstates (16.10) with different values of E. This combination is called a wavepacket.
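The qualitative behavior described after Problem 16.1 can be summarized analytically. For the step potential in the region x > 0, with $m = \hbar = 1$ (a sketch; A, B, C, and δ are constants fixed by the initial conditions):

\[
\phi(x) =
\begin{cases}
A\sin(k'x + \delta), & k' = \sqrt{2(E - V_0)}, & E > V_0,\\[4pt]
B\,e^{-\kappa x} + C\,e^{+\kappa x}, & \kappa = \sqrt{2(V_0 - E)}, & E < V_0.
\end{cases}
\]

Because round-off error seeds a small nonzero C, the growing exponential eventually dominates any numerical solution in the classically forbidden region, which is why the purely decaying solution is difficult to detect.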
Although we used a fourth-order algorithm in Listing 16.1, simpler algorithms can be used. Recall that the solution of (16.7) with V(x) = 0 can be expressed as a linear combination of sine and cosine functions. The oscillatory nature of this solution leads us to expect that the Euler–Cromer algorithm introduced in Chapter 3 will yield satisfactory results.

16.3 Bound State Solutions

We first consider potentials for which a particle is confined to a specific region of space. Such a potential is known as the infinite square well and is described by

\[ V(x) = \begin{cases} 0 & \text{for } |x| \le a\\ \infty & \text{for } |x| > a. \end{cases} \tag{16.13} \]

For this potential, an acceptable solution of (16.7) must vanish at the boundaries of the well. We will find that the eigenstates φn(x) can satisfy these boundary conditions only for specific values of the energy En.

Problem 16.3. The infinite square well

(a) Show analytically that the energy eigenvalues of the infinite square well are given by $E_n = n^2\pi^2\hbar^2/8ma^2$, where n is a positive integer. Also show that the normalized eigenstates have the form

\[ \phi_n(x) = \frac{1}{\sqrt{a}}\cos\frac{n\pi x}{2a} \qquad n = 1, 3, \ldots \ \text{(even parity)} \tag{16.14a} \]
\[ \phi_n(x) = \frac{1}{\sqrt{a}}\sin\frac{n\pi x}{2a} \qquad n = 2, 4, \ldots \ \text{(odd parity)}. \tag{16.14b} \]

What is the parity of the ground state solution?

(b) We can solve (16.7) numerically for the infinite square well by setting stepHeight = 0, xmin = −a, and xmax = +a in SchroedingerApp and requiring that φ(x = +a) = 0. What is the condition for φ(x = −a) in the program? Choose a = 1 and calculate the first four energy eigenvalues using SchroedingerApp. Do the numerical and analytic solutions match? Do the solutions satisfy the boundary conditions exactly? Are your numerical solutions normalized?

Problem 16.4. Bound state solutions of the time-independent Schrödinger equation

(a) Consider the potential energy function defined by

\[ V(x) = \begin{cases} 0 & \text{for } -a \le x \le 0\\ V_0 & \text{for } 0 < x \le a\\ \infty & \text{for } |x| > a. \end{cases} \tag{16.15} \]

As for the infinite square well, the eigenfunction is confined between infinite potential barriers at x = ±a. In addition, there is a step potential at x = 0. Choose a = 5 and V0 = 1 and run SchroedingerApp with an energy of E = 0.15. Repeat with an energy of E = 0.16. Why can you conclude that an energy eigenvalue is bracketed by these two values?

(b) Choose a strategy for determining the value of E such that the boundary conditions at x = +a are satisfied. Determine the energy eigenvalue to four decimal places. Does your answer depend on the number of points at which the wave function is computed?

(c) Repeat the above procedure starting with energy values of 0.58 and 0.59 and find the energy eigenvalue of the second bound state.

If you were persistent in doing all of Problem 16.4, you would have discovered two energy eigenvalues, 0.1505 and 0.5857. The procedure we used is known as the shooting algorithm. The allowed eigenvalues are imposed by the requirement that φn(x) → 0 at the boundaries. Although the shooting algorithm usually yields an eigenvalue solution, we often wish to find specific eigenvalues, such as the eigenvalue E = 1.1195 corresponding to the third excited state for the potential in (16.15). Because the energy of a wave function increases as the wavelength decreases, we can order the energy eigenvalues by counting the number of times the corresponding eigenstate crosses the x-axis, that is, by the number of nodes. The ground state eigenstate has no nodes. Why? Why can we order the eigenvalues by the number of nodes?
The number of nodes can be used to narrow the energy bracket in the shooting algorithm. For example, if we are searching for the third energy eigenvalue and we observe 5 nodes, then the energy is too large. To find a specific quantum state, we automate the shooting method as follows:

1. Choose a value of the energy E and count the number of nodes.

2. Increase E and repeat step 1 until the number of nodes is equal to the desired number.

3. Decrease E and repeat step 1 until the number of nodes is one less than the desired number.

The desired value of the energy eigenvalue is now bracketed. We can further narrow the energy by doing the following:

4. Set the energy to the bracket midpoint.

5. Initialize φ(x) at the left boundary and iterate φ(x) toward increasing x until φ diverges or until the right boundary is reached.

6. If the quantum number is even (odd) and the last value of φ(x) in step 5 is negative (positive), then the trial value of E is too large.

7. If the quantum number is even (odd) and the last value of φ(x) in step 5 is positive (negative), then the trial value of E is too small.

8. Repeat steps 4–7 until the wave function satisfies the right-hand boundary condition to an acceptable tolerance.

This procedure is known as a binary search because every repetition decreases the energy bracket by a factor of two. Problem 16.5 asks you to write a program that finds specific eigenvalues using this procedure; a sketch of the bisection stage follows.
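The sketch below implements steps 4–8, assuming the Schroedinger class of Listing 16.1 (with xmin, xmax, numberOfPoints, and the potential already set) and an energy bracket [eLow, eHigh] obtained from the node count; it is illustrative rather than the book's code:

package org.opensourcephysics.sip.ch16;

public class ShootingSketch {
  public static double findEigenvalue(Schroedinger s, int quantumNumber,
      double eLow, double eHigh, double tolerance) {
    while(eHigh-eLow>tolerance) {
      s.energy = 0.5*(eLow+eHigh);     // step 4: midpoint of the bracket
      s.initialize();
      s.solve();                       // step 5: integrate from the left boundary
      double phiEnd = 0;               // last computed value of phi (solve may stop early)
      for(int i = s.phi.length-1; i>=0; i--) {
        if(s.phi[i]!=0) {
          phiEnd = s.phi[i];
          break;
        }
      }
      boolean even = quantumNumber%2==0;
      if((even&&phiEnd<0)||(!even&&phiEnd>0)) {
        eHigh = s.energy;              // step 6: trial energy too large
      } else {
        eLow = s.energy;               // step 7: trial energy too small
      }
    }
    return 0.5*(eLow+eHigh);
  }
}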
Problem 16.5. Shooting algorithm

(a) Modify SchroedingerApp to find the eigenvalue associated with a given number of nodes. How is the number of nodes related to the quantum number? Test your program for the infinite square well. What is the value of ∆x needed to determine E1 to two decimal places? three decimal places?

(b) Add a method to normalize φ. Normalize and display the first five eigenstates.

(c) Find the first five eigenstates and eigenvalues for the potential in (16.15) with a = 1 and V0 = 1.

(d) Does your result for E1 depend on the starting value of dφ/dx?

Figure 16.1: An infinite square well with a potential bump of height Vb and half-width b in the middle.

Problem 16.6. Perturbation of the infinite square well

(a) Determine the effect of a small perturbation on the eigenstates and eigenvalues of the infinite square well. Place a small rectangular bump of half-width b and height Vb symmetrically about x = 0 (see Fig. 16.1). Choose b ≪ a and determine how the ground state energy and eigenstate change with Vb and b. What is the relative change in the ground state energy for Vb = 10, b = 0.1 and Vb = 20, b = 0.1 with a = 1? Let φ0 denote the ground state eigenstate for b = 0 and let φb denote the ground state eigenstate for b ≠ 0. Compute the value of the overlap integral

\[ \int_0^a\phi_b(x)\phi_0(x)\,dx. \tag{16.16} \]

This integral would be unity if the perturbation were not present (and the eigenstate were properly normalized). How is the change in the overlap integral related to the relative change in the energy eigenvalue?

(b) Compute the ground state energy for Vb = 20 and b = 0.05. How does the value of E1 compare to that found in part (a) for Vb = 10 and b = 0.1?

Because numerical solutions to the Schrödinger equation grow exponentially if V(x) − E > 0, it may not be possible to obtain a numerical solution for φ(x) that satisfies the boundary conditions if V(x) − E is large over an extended region of space. The reason is that the energy can be specified and φ can be computed only to finite accuracy. Problem 16.7 shows that we can sometimes avoid this difficulty by using simpler boundary conditions if the potential is symmetric. In this case,

\[ V(x) = V(-x), \tag{16.17} \]

and φ(x) can be chosen to have definite parity. Even parity solutions satisfy φ(−x) = φ(x); odd parity solutions satisfy φ(−x) = −φ(x). The definite parity of φ(x) allows us to specify either φ or φ′ at x = 0. Hence, the parity of φ determines one of the boundary conditions. For simplicity, choose φ(0) = 1 and φ′(0) = 0 for even parity solutions, and φ(0) = 0 and φ′(0) = 1 for odd parity solutions.

Problem 16.7. Symmetric potentials

(a) Modify Schroedinger to make use of symmetric potential boundary conditions for the harmonic oscillator:

\[ V(x) = \frac{1}{2}x^2. \tag{16.18} \]

Start the solution at x = 0 using the appropriate conditions for even and odd quantum numbers, and find the first four energy eigenvalues such that the wave function approaches zero for large values of x. Because the computed φ(x) will diverge for sufficiently large x, we seek values of the energy such that a small decrease in E causes the wave function to diverge in one direction, and a small increase causes the wave function to diverge in the opposite direction. Initially choose xmax = 5 so that the classically forbidden region is large enough that φ(x) can decay to zero for the first few eigenstates. Increase xmax if necessary for the higher energy eigenvalues. Is there any pattern in the values of the energy eigenvalues you found?

(b) Repeat part (a) for the linear potential V(x) = |x|. Describe the differences between your results for this potential and for the harmonic oscillator potential. The quantum mechanical treatment of the linear potential can be used to model the energy spectrum of a bound quark-antiquark system known as quarkonium.

(c) Obtain a numerical solution of the anharmonic oscillator $V(x) = \frac{1}{2}x^2 + bx^4$. In this case there are no analytic solutions, and numerical solutions are necessary for large values of b. How do the ground state energy and eigenstate depend on b for small b?

Problem 16.8. Finite square well

The finite square well potential is given by

\[ V(x) = \begin{cases} 0 & \text{for } |x| \le a\\ V_0 & \text{for } |x| > a. \end{cases} \tag{16.19} \]

The input parameters are the well depth, V0, and the half-width of the well, a.

(a) Choose V0 = 10 and a = 1. How do you expect the value of the ground state energy to compare to its corresponding value for the infinite square well? Compute the ground state eigenvalue and eigenstate by determining a value of E such that φ(x) has no nodes and is approximately zero for large x. (See Problem 16.7(a) for the procedure for finding the eigenvalues.)

(b) Because the well depth is finite, φ(x) is nonzero in the classically forbidden region for which E < V0 and |x| > a. Define the penetration distance as the distance from x = a to a point where φ is ∼ 1/e ≈ 0.37 of its value at x = a. Determine the qualitative dependence of the penetration distance on the magnitude of V0.

(c) What is the total number of bound excited states? Why is the total number of bound states finite?

As we have found, it is difficult to find bound state solutions of the time-independent Schrödinger equation because the exponential growth of the solutions allows numerical errors to dominate when V(x) − E > 0 is large. Because we want to easily generate eigenstates in subsequent sections, we have written a general-purpose eigenstate solver that examines the maxima and minima of the solution as well as the nodes to determine the eigenstate's quantum number.
The code for the Eigenstate class is in the ch16 package. The EigenstateApp target class shows how the Eigenstate class is used.

Listing 16.3: The EigenstateApp program tests the Eigenstate class.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.frames.PlotFrame;
import org.opensourcephysics.numerics.Function;

public class EigenstateApp {
  public static void main(String[] args) {
    PlotFrame drawingFrame = new PlotFrame("x", "|phi|", "eigenstate");
    int numberOfPoints = 300;
    double xmin = -5, xmax = +5;
    Eigenstate eigenstate = new Eigenstate(new Potential(), numberOfPoints, xmin, xmax);
    int n = 3; // quantum number
    double[] phi = eigenstate.getEigenstate(n);
    double[] x = eigenstate.getXCoordinates();
    if(eigenstate.getErrorCode()==Eigenstate.NO_ERROR) {
      drawingFrame.setMessage("energy = "+eigenstate.energy);
    } else {
      drawingFrame.setMessage("eigenvalue did not converge");
    }
    drawingFrame.append(0, x, phi);
    drawingFrame.setVisible(true);
    drawingFrame.setDefaultCloseOperation(javax.swing.JFrame.EXIT_ON_CLOSE);
  }
}

class Potential implements Function {
  public double evaluate(double x) {
    return (x*x)/2;
  }
}

The getEigenstate method in the Eigenstate class computes the eigenstate for the specified quantum number and returns a zeroed wave function if the algorithm does not converge. We test the validity of the Eigenstate class in Problem 16.9.

Problem 16.9. The Eigenstate class

(a) Examine the code of the Eigenstate class. What “trick” is used to handle the divergence in the forbidden region of deep wells?

(b) Write a class that displays the eigenstates of the simple harmonic oscillator using the Calculation interface. Include input parameters that allow the user to vary the principal quantum number and the number of points.

(c) Use a spatial grid of 300 points with −5 < x < 5 and compare the known analytic solution for the simple harmonic oscillator eigenstates to the numerical solution for the lowest three energy eigenstates. What is the largest energy eigenvalue that can be computed to an accuracy of 1%? What causes the decreasing accuracy for larger quantum numbers? What if the domain is increased to −50 < x < 50?

(d) Describe the conditions under which the Eigenstate class fails and demonstrate this failure. Improve the Eigenstate class to handle at least one failure mode.

16.4 Time Development of Eigenstate Superpositions

If the Hamiltonian is independent of time, the time development of the wave function Ψ(x,t) can be expressed as a linear superposition of energy eigenstates φn(x) with eigenvalues En:

\[ \Psi(x,t) = \sum_n c_n\,\phi_n(x)e^{-iE_n t/\hbar}. \tag{16.20} \]

To understand the time dependence of Ψ(x,t), we begin by studying superpositions of analytic solutions. The static getEigenstate method in the BoxEigenstate class generates these solutions for the infinite square well.

Listing 16.4: The BoxEigenstate class generates analytic stationary state solutions for the infinite square well.

package org.opensourcephysics.sip.ch16;

public class BoxEigenstate {
  static double a = 1; // length of box

  private BoxEigenstate() {
    // prohibit instantiation because all methods are static
  }

  static double[] getEigenstate(int n, int numberOfPoints) {
    double[] phi = new double[numberOfPoints];
    n++; // quantum number
    double norm = Math.sqrt(2/a);
    for(int i = 0; i<numberOfPoints; i++) {
      phi[i] = norm*Math.sin(n*Math.PI*i/(double) (numberOfPoints-1)); // sin(n pi x/a)
    }
    return phi;
  }
}
Problem 16.14. Coherent states

Because the energy eigenvalues of the simple harmonic oscillator are equally spaced, there exist wave functions known as coherent states whose probability density propagates quasi-classically.

(a) Include a sufficient number of expansion coefficients for V(x) = 10x² to model an initial Gaussian wave function centered at the origin:

\[ \Psi(x,0) = e^{-16x^2}. \tag{16.25} \]

Describe the evolution.

(b) Repeat part (a) with

\[ \Psi(x,0) = e^{-16(x-2)^2}. \tag{16.26} \]

(c) Show that the wave functions in parts (a) and (b) change their width but not their Gaussian envelope. Construct a wave function with the following expansion coefficients and observe its behavior:

\[ c_n^2 = \frac{\langle n \rangle^n}{n!}e^{-\langle n \rangle}. \tag{16.27} \]

The expectation of the number of quanta $\langle n \rangle$ is given by

\[ \langle n \rangle = \frac{\langle E \rangle - \frac{1}{2}\hbar\omega}{\hbar\omega}, \tag{16.28} \]

where $\langle E \rangle$ is the energy expectation value of the coherent state.

The expansion of an arbitrary wave function in terms of a set of eigenstates is closely related to Fourier analysis. Because the eigenstates of a particle in a box are sinusoidal functions, we could have used the fast Fourier transform algorithm (FFT) to compute the projection coefficients. Because these coefficients are calculated only once in Problem 16.14, evaluating (16.21) directly is reasonable. We will use the FFT to study wave functions in momentum space and to implement the operator splitting method for time evolution in Section 16.6.
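The direct evaluation of (16.21) amounts to discretizing the projection integral (16.11). The following sketch uses the BoxEigenstate class of Listing 16.4 (the driver and the initial Gaussian are illustrative, not the book's code; the class is assumed to sit in the same package):

package org.opensourcephysics.sip.ch16;

public class ProjectionSketch {
  public static double[] computeCoefficients(double[] psi0, int nmax) {
    int numberOfPoints = psi0.length;
    double dx = BoxEigenstate.a/(numberOfPoints-1);
    double[] c = new double[nmax];
    for(int n = 0; n<nmax; n++) {
      double[] phi = BoxEigenstate.getEigenstate(n, numberOfPoints);
      double sum = 0;
      for(int i = 0; i<numberOfPoints; i++) {
        sum += phi[i]*psi0[i];           // phi is real for the box eigenstates
      }
      c[n] = sum*dx;                     // approximates the integral of phi_n psi dx
    }
    return c;
  }

  public static void main(String[] args) {
    int N = 300;
    double[] psi0 = new double[N];
    for(int i = 0; i<N; i++) {           // a Gaussian centered in the box
      double x = BoxEigenstate.a*i/(N-1.0);
      double d = x-0.5*BoxEigenstate.a;
      psi0[i] = Math.exp(-100*d*d);
    }
    double[] c = computeCoefficients(psi0, 10);
    for(int n = 0; n<10; n++) {
      System.out.println("c["+n+"] = "+c[n]);
    }
  }
}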
16.5 The Time-Dependent Schrödinger Equation

Although the numerical solution of the time-independent Schrödinger equation (16.7) is straightforward for one particle, the numerical solution of the time-dependent Schrödinger equation (16.4) is not as simple. A naive approach to its numerical solution can be formulated by introducing a grid for the time coordinate and a grid for the spatial coordinate. We use the notation $t_n = t_0 + n\Delta t$, $x_s = x_0 + s\Delta x$, and $\Psi(x_s, t_n)$. The idea is to relate $\Psi(x_s, t_{n+1})$ to the value of $\Psi(x_s, t_n)$ for each value of xs. An example of an algorithm that solves the Schrödinger-like equation $\partial\Psi/\partial t = \partial^2\Psi/\partial x^2$ to first order in ∆t is given by

\[ \frac{1}{\Delta t}\left[\Psi(x_s,t_{n+1}) - \Psi(x_s,t_n)\right] = \frac{1}{(\Delta x)^2}\left[\Psi(x_{s+1},t_n) - 2\Psi(x_s,t_n) + \Psi(x_{s-1},t_n)\right]. \tag{16.29} \]

The right-hand side of (16.29) represents a finite difference approximation to the second derivative of Ψ with respect to x. Equation (16.29) is an example of an explicit scheme because, given Ψ at time tn, we can compute Ψ at time tn+1. Unfortunately, this explicit approach leads to unstable solutions; that is, the numerical value of Ψ diverges from the exact solution as Ψ evolves in time.

One way to avoid the instability is to retain the same form as (16.29) but to evaluate the spatial derivative on the right side of (16.29) at time tn+1 rather than time tn:

\[ \frac{1}{\Delta t}\left[\Psi(x_s,t_{n+1}) - \Psi(x_s,t_n)\right] = \frac{1}{(\Delta x)^2}\left[\Psi(x_{s+1},t_{n+1}) - 2\Psi(x_s,t_{n+1}) + \Psi(x_{s-1},t_{n+1})\right]. \tag{16.30} \]

Equation (16.30) is an implicit method because the unknown function $\Psi(x_s,t_{n+1})$ appears on both sides. To obtain $\Psi(x_s,t_{n+1})$, it is necessary to solve a set of linear equations at each time step. More details of this approach and the demonstration that (16.30) leads to stable solutions can be found in the references.

Visscher and others have suggested an alternative approach in which the real and imaginary parts of Ψ are treated separately and defined at different times. The algorithm ensures that the total probability remains constant. If we let

\[ \Psi(x,t) = R(x,t) + i\,I(x,t), \tag{16.31} \]

then Schrödinger's equation $i\,\partial\Psi(x,t)/\partial t = \hat{H}\Psi(x,t)$ becomes ($\hbar = 1$ as usual)

\[ \frac{\partial R(x,t)}{\partial t} = \hat{H}I(x,t) \tag{16.32a} \]
\[ \frac{\partial I(x,t)}{\partial t} = -\hat{H}R(x,t). \tag{16.32b} \]

A stable method of numerically solving (16.32) is to use a form of the half-step method (see Appendix 3A). The resulting difference equations are

\[ R(x,t+\Delta t) = R(x,t) + \hat{H}I\!\left(x,t+\tfrac{1}{2}\Delta t\right)\Delta t \tag{16.33a} \]
\[ I\!\left(x,t+\tfrac{3}{2}\Delta t\right) = I\!\left(x,t+\tfrac{1}{2}\Delta t\right) - \hat{H}R(x,t+\Delta t)\,\Delta t, \tag{16.33b} \]

where the initial values are given by R(x,0) and I(x, ½∆t). Visscher has shown that this algorithm is stable if

\[ -\frac{2\hbar}{\Delta t} \le V \le \frac{2\hbar}{\Delta t} - \frac{2\hbar^2}{m(\Delta x)^2}, \tag{16.34} \]

where the inequality (16.34) holds for all values of the potential V.

The appropriate definition of the probability density $P(x,t) = R(x,t)^2 + I(x,t)^2$ is not obvious because R and I are not defined at the same time. The following choice conserves the total probability:

\[ P(x,t) = R(x,t)^2 + I\!\left(x,t+\tfrac{1}{2}\Delta t\right)I\!\left(x,t-\tfrac{1}{2}\Delta t\right) \tag{16.35a} \]
\[ P\!\left(x,t+\tfrac{1}{2}\Delta t\right) = R(x,t+\Delta t)R(x,t) + I\!\left(x,t+\tfrac{1}{2}\Delta t\right)^2. \tag{16.35b} \]

An implementation of (16.33) is given in the TDHalfStep class in Listing 16.7. The real part of the wave function is first updated for all positions, and then the imaginary part is updated using the new values of the real part.

Listing 16.7: The TDHalfStep class solves the one-dimensional time-dependent Schrödinger equation.

package org.opensourcephysics.sip.ch16;

public class TDHalfStep {
  double[] x, realPsi, imagPsi, potential;
  double dx, dx2;
  double dt = 0.001;

  public TDHalfStep(GaussianPacket packet, int numberOfPoints, double xmin, double xmax) {
    realPsi = new double[numberOfPoints];
    imagPsi = new double[numberOfPoints];
    potential = new double[numberOfPoints];
    x = new double[numberOfPoints];
    dx = (xmax-xmin)/(numberOfPoints-1);
    dx2 = dx*dx;
    double x0 = xmin;
    for(int i = 0, n = realPsi.length; i<n; i++) {
      x[i] = x0;
      realPsi[i] = packet.getReal(x0);      // R at t = 0 from the initial packet
      imagPsi[i] = packet.getImaginary(x0); // I at t = dt/2
      potential[i] = getPotential(x0);
      x0 += dx;
    }
    dt = getMaxDt();
  }

  double getPotential(double x) {
    return 0; // free particle; replace with the potential of interest
  }

  double getMaxDt() { // largest dt allowed by the stability condition (16.34)
    double dt = this.dt;
    for(int i = 0, n = potential.length; i<n; i++) {
      double a = 2/dx2+potential[i];
      if(a>0) {
        dt = Math.min(dt, 2/a);
      }
    }
    return dt;
  }

  double step() {
    for(int i = 1, n = realPsi.length-1; i<n; i++) { // update R using I at the half step
      realPsi[i] += dt*(potential[i]*imagPsi[i]
          -0.5*(imagPsi[i+1]-2*imagPsi[i]+imagPsi[i-1])/dx2);
    }
    for(int i = 1, n = imagPsi.length-1; i<n; i++) { // update I using the new R
      imagPsi[i] -= dt*(potential[i]*realPsi[i]
          -0.5*(realPsi[i+1]-2*realPsi[i]+realPsi[i-1])/dx2);
    }
    return dt;
  }
}

(a) Consider a Gaussian wave packet incident on the potential barrier

\[ V(x) = \begin{cases} V_0 & \text{for } |x| \le a\\ 0 & \text{for } |x| > a. \end{cases} \tag{16.38} \]

Generate a series of snapshots that show the wave packet approaching the barrier and then interacting with it to generate reflected and transmitted packets. Choose V0 = 2 and a = 1 and consider the behavior of the wave packet for k0 = 1, 1.5, 2, and 3. Does the width of the packet increase with time? How does the width depend on k0? For what values of k0 is the motion of the packet in qualitative agreement with the motion of a corresponding classical particle?

(b) Consider a square well with V0 = −2 and consider the same questions as in part (a).

Problem 16.18. Evolution of two wave packets

Modify GaussianPacket in Listing 16.8 to include two wave packets with identical widths and speeds, with the sign of k0 chosen so that the two wave packets approach each other. Choose their respective values of x0 so that the two packets are initially well separated. Let V = 0 and describe what happens when you determine their time dependence. Do the packets influence each other? What do your results imply about the existence of a superposition principle?
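The essentials of the half-step algorithm can also be isolated in a few lines. The following self-contained sketch (V = 0 and ℏ = m = 1; the imaginary part is initialized at t = 0 rather than t = ∆t/2, an adequate approximation for this check) evolves a free Gaussian packet with (16.33) and prints the total probability, which should remain essentially constant:

public class HalfStepSketch {
  public static void main(String[] args) {
    int N = 500;
    double xmin = -20, xmax = 20;
    double dx = (xmax-xmin)/(N-1), dx2 = dx*dx;
    double dt = 0.5*dx2;               // comfortably inside the bound (16.34) for V = 0
    double[] re = new double[N], im = new double[N];
    double x0 = -5, k0 = 2, width = 1;
    for(int i = 0; i<N; i++) {         // Gaussian packet moving to the right
      double x = xmin+i*dx;
      double g = Math.exp(-(x-x0)*(x-x0)/(4*width*width));
      re[i] = g*Math.cos(k0*x);
      im[i] = g*Math.sin(k0*x);
    }
    for(int step = 0; step<2000; step++) {
      for(int i = 1; i<N-1; i++) {     // (16.33a): update R for all positions
        re[i] += -0.5*dt*(im[i+1]-2*im[i]+im[i-1])/dx2;
      }
      for(int i = 1; i<N-1; i++) {     // (16.33b): update I using the new R
        im[i] += 0.5*dt*(re[i+1]-2*re[i]+re[i-1])/dx2;
      }
    }
    double sum = 0;
    for(int i = 0; i<N; i++) {
      sum += re[i]*re[i]+im[i]*im[i];
    }
    System.out.println("total probability ~ "+sum*dx); // conserved by the algorithm
  }
}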
16.6 Fourier Transformations and Momentum Space

The position space wave function Ψ(x,t) is only one of many possible representations of a quantum mechanical state. A quantum system also is completely characterized by the momentum space wave function Φ(p,t). The probability P(p,t)∆p of the particle being in a “volume” element ∆p centered about the momentum p at time t is equal to

\[ P(p,t)\,\Delta p = |\Phi(p,t)|^2\,\Delta p. \tag{16.39} \]

Because either a position space or a momentum space representation provides a complete description of the system, it is possible to transform the wave function from one space to another as:

\[ \Phi(p,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\Psi(x,t)e^{-ipx/\hbar}\,dx \tag{16.40} \]
\[ \Psi(x,t) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\Phi(p,t)e^{ipx/\hbar}\,dp. \tag{16.41} \]

The momentum and position space transformations, (16.40) and (16.41), are Fourier integrals. Because a computer stores a wave function on a finite grid, these transformations simplify to the familiar Fourier series (see Section 9.3):

\[ \Phi_m = \sum_{n=-N/2}^{N/2}\Psi_n e^{-ip_m x_n/\hbar} \tag{16.42} \]
\[ \Psi_n = \frac{1}{N}\sum_{m=-N/2}^{N/2}\Phi_m e^{ip_m x_n/\hbar}, \tag{16.43} \]

where $\Phi_m = \Phi(p_m)$ and $\Psi_n = \Psi(x_n)$. We have not explicitly shown the time dependence in (16.42) and (16.43).

We now use the FFTApp program introduced in Section 9.3 to transform a wave function between position and momentum space. Note that the wavenumber 2π/λ (or 2π/T in the time domain) in classical physics has the same numerical value as the momentum in quantum mechanics, $p = h/\lambda = 2\pi\hbar/\lambda$, in units such that $\hbar = 1$. Consequently, we can use the getWrappedOmega and getNaturalOmega methods in the FFT class to generate arrays containing momentum values for a transformed position space wave function.

The FFTApp program in Listing 9.7 transforms N complex data points using an input array that has length 2N. The real part of the jth data point is stored in array element 2j and the imaginary part is stored in element 2j + 1. The FFT class transforms this array and maintains the same ordering of real and imaginary parts. However, the momenta (wavenumbers) are in wrap-around order, starting with the zero momentum coefficients in the first two elements and switching to negative momenta halfway through the array. The toNaturalOrder method sorts the array in order of increasing momentum. We use the FFTApp class in Problem 16.19.

Problem 16.19. Transforming to momentum space

(a) The FFTApp class initializes the wave function grid using the following complex exponential:

\[ \Psi_n = \Psi(n\Delta x) = e^{in\Delta x} = \cos n\Delta x + i\sin n\Delta x. \tag{16.44} \]

Use FFTApp to show that a complex exponential has a definite momentum if the grid contains an integer number of wavelengths. In other words, show that there is only one nonzero Fourier component.

(b) How small a wavelength (or how large a momentum) can be modeled if the spatial grid has N points and extends over a distance L?

(c) Where do the maximum, zero, and minimum values of the momentum occur in wrap-around order?

After the transformation, the momentum space wave function is stored in an array. The array elements can be assigned a momentum value using the de Broglie relation p = h/λ. The longest wavelength that can exist on the grid is equal to the grid dimension L = (N − 1)∆x, and this wave has a momentum of

\[ p_0 = \frac{h}{L}. \tag{16.45} \]

Points on the momentum grid have momentum values that are integer multiples of p0.

Problem 16.20. Momentum visualization

Add a ComplexPlotFrame to the FFTApp program to show the momentum space wave function of a position space Gaussian wave packet. Add a user interface to control the width of the Gaussian wave packet and verify the Heisenberg uncertainty relation $\Delta x\,\Delta p \ge \hbar/2$. Shift the center of the position space wave packet and explain the change in the resulting momentum space wave function.

Problem 16.21. Momentum time evolution

Modify TDHalfStepApp so that it displays the momentum space wave function in addition to the position space wave function.
Describe the momentum space evolution of a Gaussian packet for the infinite square well and a simple harmonic oscillator potential. What evidence of classical-like behavior do you observe?

The FFT can be used to implement a fast and accurate method for solving Schrödinger's equation. We start by writing (16.4) in operator notation as

\[ i\hbar\frac{\partial\Psi(x,t)}{\partial t} = \hat{H}\Psi(x,t) = (\hat{T} + \hat{V})\Psi(x,t), \tag{16.46} \]

where $\hat{H}$, $\hat{T}$, and $\hat{V}$ are the Hamiltonian, kinetic energy, and potential energy operators, respectively. The formal solution to (16.46) is

\[ \Psi(x,t) = e^{-i\hat{H}(t-t_0)/\hbar}\Psi(x,t_0) = e^{-i(\hat{T}+\hat{V})(t-t_0)/\hbar}\Psi(x,t_0). \tag{16.47} \]

The time evolution operator $\hat{U}$ is defined as

\[ \hat{U} = e^{-i\hat{H}(t-t_0)/\hbar} = e^{-i(\hat{T}+\hat{V})(t-t_0)/\hbar}. \tag{16.48} \]

It might be tempting to express the time evolution operator as

\[ \hat{U} = e^{-i\hat{T}\Delta t/\hbar}e^{-i\hat{V}\Delta t/\hbar}, \tag{16.49} \]

but (16.49) is valid only for $\Delta t \equiv t - t_0 \ll 1$ because $\hat{T}$ and $\hat{V}$ do not commute. A more accurate approximation (accurate to second order in ∆t) is obtained by using the following symmetric decomposition:

\[ \hat{U} = e^{-i\hat{V}\Delta t/2\hbar}\,e^{-i\hat{T}\Delta t/\hbar}\,e^{-i\hat{V}\Delta t/2\hbar}. \tag{16.50} \]

The key to using (16.50) to solve (16.46) is to use the position space wave function when applying $e^{-i\hat{V}\Delta t/2\hbar}$ and to use the momentum space wave function when applying $e^{-i\hat{T}\Delta t/\hbar}$. In position space, the potential energy operator is equivalent to simply multiplying by the potential energy function. That is, the effect of the first and last terms in (16.50) is to multiply points on the position grid by a phase factor that is proportional to the potential energy:

\[ \tilde{\Psi}_j = e^{-iV(x_j)\Delta t/2\hbar}\,\Psi_j. \tag{16.51} \]

Because the kinetic energy operator in position space involves partial derivatives, it is convenient to transform both the operator and the wave function to momentum space. In momentum space the kinetic energy operator is equivalent to multiplying by the kinetic energy $p^2/2m$. The middle term in (16.50) operates by multiplying points on the momentum grid by a phase factor that is proportional to the kinetic energy:

\[ \tilde{\Phi}_j = e^{-ip_j^2\Delta t/2m\hbar}\,\Phi_j. \tag{16.52} \]

The split-operator algorithm jumps back and forth between position and momentum space to propagate the wave function. The algorithm starts in position space, where each grid value $\Psi_j = \Psi(x_j,t)$ is multiplied by (16.51). The wave function is then transformed to momentum space, where every momentum value $\Phi_j$ is multiplied by (16.52). It is then transformed back to position space, where (16.51) is applied a second time. A single time step can therefore be written as

\[ \Psi(x,t+\Delta t) = e^{-iV(x)\Delta t/2\hbar}\,F^{-1}\!\left[e^{-ip^2\Delta t/2m\hbar}\,F\!\left[e^{-iV(x)\Delta t/2\hbar}\,\Psi(x,t)\right]\right], \tag{16.53} \]

where F is the Fourier transform to momentum space and $F^{-1}$ is its inverse.
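A sketch of a single split-operator step (16.53) is shown below. It uses the OSP FFT class; the transform and backtransform method names and the 1/N normalization of the inverse transform are assumptions about that class, and the momenta are built directly in wrap-around order (ℏ = m = 1):

import org.opensourcephysics.numerics.FFT;

public class SplitOperatorSketch {
  static final int N = 256;
  static final double XMIN = -10, XMAX = 10;
  static final double DX = (XMAX-XMIN)/N;
  static double dt = 0.005;
  static FFT fft = new FFT(N);
  static double[] psi = new double[2*N];   // Re in even, Im in odd elements
  static double[] p = new double[N];       // momenta in wrap-around order

  static double potential(double x) {
    return 0.5*x*x;                        // harmonic oscillator as an example
  }

  static void phase(int j, double theta) { // multiply grid point j by e^{-i theta}
    double c = Math.cos(theta), s = Math.sin(theta);
    double re = psi[2*j], im = psi[2*j+1];
    psi[2*j] = re*c+im*s;
    psi[2*j+1] = im*c-re*s;
  }

  static void step() {                     // one application of (16.53)
    for(int j = 0; j<N; j++) {
      phase(j, 0.5*dt*potential(XMIN+j*DX));   // half kick, (16.51)
    }
    fft.transform(psi);                        // to momentum space
    for(int j = 0; j<N; j++) {
      phase(j, 0.5*dt*p[j]*p[j]);              // kinetic phase, (16.52)
    }
    fft.backtransform(psi);                    // back to position space
    for(int j = 0; j<N; j++) {
      psi[2*j] /= N;                           // assumed unnormalized inverse transform
      psi[2*j+1] /= N;
      phase(j, 0.5*dt*potential(XMIN+j*DX));   // second half kick
    }
  }

  public static void main(String[] args) {
    double dp = 2*Math.PI/(N*DX);
    for(int j = 0; j<N; j++) {
      p[j] = (j<N/2) ? j*dp : (j-N)*dp;        // wrap-around ordering
    }
    for(int j = 0; j<N; j++) {                 // Gaussian packet at rest at x = 1
      double x = XMIN+j*DX;
      psi[2*j] = Math.exp(-(x-1)*(x-1));
    }
    for(int n = 0; n<1000; n++) {
      step();
    }
    double norm = 0;
    for(int j = 0; j<N; j++) {
      norm += psi[2*j]*psi[2*j]+psi[2*j+1]*psi[2*j+1];
    }
    System.out.println("norm ~ "+norm*DX);     // unitary evolution preserves the norm
  }
}

Because each factor in (16.50) is unitary, the norm of the wave function is preserved exactly (up to round-off), which makes the printed norm a convenient correctness check.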
Problem 16.22. Split-operator algorithm

(a) Write a program to implement the split-operator algorithm. It is necessary to evaluate the exponential phase factors only once when implementing the split-operator algorithm. Store the complex exponentials in arrays that match the x-values on the spatial grid and the p-values on the momentum grid. Use wrap-around order when storing the momentum phase factors because the FFT class inverse transformation assumes that the data are in wrap-around order. You can use the getWrappedOmega method in the FFT class to obtain the momenta in this ordering.

(b) Compare the evolution of a Gaussian wave packet using the split-operator and half-step algorithms on identical grids. How does the finite grid size affect each algorithm?

(c) Compare the computation speed of the split-operator and half-step algorithms using a Gaussian wave packet in a square well. Disable plotting and other nonessential computation when comparing the speeds.

Problem 16.23. Split-operator accuracy

The split-operator and half-step algorithms fail if the time step is too large. Use both algorithms to evolve a simple harmonic oscillator coherent state (see Problem 16.14). Describe the error that occurs if the time step becomes too large.

16.7 Variational Methods

One way of obtaining a good approximation of the ground state energy is to use a variational method. This approach has numerous applications in chemistry, atomic and molecular physics, nuclear physics, and condensed matter physics. Consider a system whose Hamiltonian operator \(\hat H\) is given by (16.8). According to the variational principle, the expectation value of the Hamiltonian for an arbitrary trial wave function Ψ is greater than or equal to the ground state energy E0. That is,
\[
\langle H\rangle = E[\Psi] = \frac{\int\Psi^*(x)\,\hat H\,\Psi(x)\,dx}{\int\Psi^*(x)\,\Psi(x)\,dx} \ge E_0, \tag{16.54}
\]
where E0 is the exact ground state energy of the system. We assume that the wave function is continuous and bounded. The inequality (16.54) reduces to an equality only if Ψ is an eigenstate of \(\hat H\) with eigenvalue E0. For bound states, Ψ may be assumed to be real without loss of generality, so that Ψ* = Ψ and thus |Ψ(x)|² = Ψ(x)². This assumption implies that we do not need to store two values representing the real and imaginary parts of Ψ.

The inequality (16.54) is the basis of the variational method. The procedure is to choose a physically reasonable form for the trial wave function Ψ(x) that depends on one or more parameters. The expectation value E[Ψ] is computed, and the parameters are varied until a minimum of E[Ψ] is obtained. This value of E[Ψ] is an upper bound to the true ground state energy. Often the form of Ψ is chosen so that the integrals in (16.54) can be done analytically. To avoid this restriction, we can use numerical integration methods. In most applications of the variational method the integrals in (16.54) are multidimensional, and Monte Carlo integration methods are essential. For this reason we will use Monte Carlo integration in the following, even though we will consider only one- and two-body problems.

Because it is inefficient to simply choose points at random to compute E[Ψ], we rewrite (16.54) in a form that allows us to use importance sampling. We write
\[
E[\Psi] = \frac{\int\Psi(x)^2\,E_L(x)\,dx}{\int\Psi(x)^2\,dx}, \tag{16.55}
\]
where E_L is the local energy
\[
E_L(x) = \frac{\hat H\Psi(x)}{\Psi(x)}, \tag{16.56}
\]
which can be calculated analytically using the trial wave function. The form of (16.55) is that of a weighted average with weight equal to the normalized probability density \(\Psi(x)^2/\!\int\Psi(x)^2\,dx\). As discussed in Section 11.6, we can sample values of x using the distribution Ψ(x)², so that the Monte Carlo estimate of E[Ψ] is given by the sum
\[
E[\Psi] = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}E_L(x_i), \tag{16.57}
\]
where n is the number of times that x is sampled from Ψ². How can we sample from Ψ²? In general, it is not possible to use the inverse transform method (see Section 11.5) to generate an arbitrary nonuniform distribution. A convenient alternative is the Metropolis method, which has the advantage that only an unnormalized Ψ² is needed for the proposed move.
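To make the procedure concrete, the following minimal sketch (ours, not the text's program) estimates (16.57) for the harmonic oscillator V(x) = x²/2 with the trial function Ψ ∝ e^(−λx²), for which the local energy (16.56) works out analytically to E_L(x) = λ + (1/2 − 2λ²)x², in units hbar = m = 1.

public class VMCSketch {
  public static void main(String[] args) {
    double lambda = 0.4, delta = 1.0, x = 0;
    int nEquil = 10000, nSample = 200000;
    double sum = 0;
    for(int i = 0; i<nEquil+nSample; i++) {
      double xTrial = x+(2*Math.random()-1)*delta;        // trial move
      double w = Math.exp(-2*lambda*(xTrial*xTrial-x*x)); // ratio of Psi^2 values
      if(w>=1 || Math.random()<=w) x = xTrial;            // Metropolis acceptance
      if(i>=nEquil) {                                     // sample after equilibration
        sum += lambda+(0.5-2*lambda*lambda)*x*x;          // local energy E_L(x)
      }
    }
    System.out.println("lambda = "+lambda+"  <E_L> = "+sum/nSample);
  }
}

Running the sketch for several values of λ near 1/2 shows that ⟨E_L⟩ is minimized (with value 0.5 and zero variance) at λ = 1/2, the behavior examined in Problem 16.24.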
Problem 16.24. Ground state energy of several one-dimensional systems

(a) It is useful to test the variational method on an exactly solvable problem. Consider the one-dimensional harmonic oscillator with V(x) = x²/2. Choose the trial wave function to be Ψ(x) ∝ e^(−λx²), with λ the variational parameter. Generate values of x chosen from the normalized distribution Ψ²(x) using the inverse transform method and verify that λ = 1/2 yields the smallest upper bound by considering λ = 1/2 and four other values of λ near 1/2. Another way to generate a Gaussian distribution is to use the Box–Muller method discussed in Section 11.5.

(b) Repeat part (a) using the Metropolis method to generate x distributed according to Ψ(x)² ∝ e^(−2λx²) and evaluate (16.57). As discussed in Section 11.7, the Metropolis method can be summarized by the following steps:

(i) Choose a trial position x_trial = x_n + δ_n, where δ_n is a uniform random number in the interval [−δ, δ].

(ii) Compute w = p(x_trial)/p(x_n), where in this case p(x) = e^(−2λx²).

(iii) If w ≥ 1, accept the change and let x_(n+1) = x_trial.

(iv) If w < 1, generate a random number r and let x_(n+1) = x_trial if r ≤ w.

(v) If the trial change is not accepted, let x_(n+1) = x_n.

Remember that it is necessary to wait for equilibrium (convergence to the distribution Ψ²) before computing the average value of E_L. Look for a systematic trend in ⟨E_L⟩ over the course of the random walk. Choose a step size δ that gives a reasonable value for the acceptance ratio. How many trials are necessary to obtain ⟨E_L⟩ to within 1% of the exact analytic result?

(c) Instead of finding the minimum of ⟨E_L⟩ as a function of the various variational parameters, minimize the quantity
\[
\sigma_L^2 = \langle E_L^2\rangle - \langle E_L\rangle^2. \tag{16.58}
\]
Verify that the exact minimum value of \(\sigma_L^2[\Psi]\) is zero, whereas the exact minimum value of \(E_L[\Psi]\) is unknown in general.

(d) Consider the anharmonic potential V(x) = x²/2 + bx⁴. Plot V(x) as a function of x for b = 1/8. Use first-order perturbation theory to calculate the lowest order change in the ground state energy due to the x⁴ term. Then choose a reasonable form for your trial wave function and use your Monte Carlo program to estimate the ground state energy. How does your result compare with first-order perturbation theory?

(e) Consider the anharmonic potential of part (d) with b = −1/8. Plot V(x). Use first-order perturbation theory to calculate the lowest order change in the ground state energy due to the x⁴ term, and then use your program to estimate E0. Do your Monte Carlo estimates for the ground state energy have a lower bound? Why or why not?

(f) Modify your program so that it can be applied to the ground state of the hydrogen atom. In this case we have V(r) = −e²/r, where e is the magnitude of the charge on the electron. The element of integration dx in (16.55) is replaced by 4πr² dr. Choose Ψ ∝ e^(−r/a), where a is the variational parameter. Measure lengths in terms of the Bohr radius \(\hbar^2/me^2\) and energies in terms of the Rydberg \(me^4/2\hbar^2\). In these units \(\mu = e^2 = \hbar = 1\). Find the optimal value of a. What is the corresponding energy?

(g) Consider the Yukawa or screened Coulomb potential, for which V(r) = −e²e^(−αr)/r, with α > 0. In this case the ground state energy and wave function can be obtained only numerically. For α = 0.5 and α = 1.0 the most accurate numerical estimates of E0 are −0.14808 and −0.01016, respectively. What is a good choice for the form of the trial wave function? How close can you come to these estimates?

Problem 16.25. Variational estimate of the ground state of helium

Helium has long served as a testing ground for atomic trial wave functions. Consider the ground state of the helium atom with the interaction
\[
V(r_1,r_2) = -2e^2\Big(\frac{1}{r_1}+\frac{1}{r_2}\Big) + \frac{e^2}{r_{12}}, \tag{16.59}
\]
where r12 is the separation between the two electrons.
Assume that the nucleus is fixed and ignore relativistic effects. Choose \(\Psi(r_1,r_2) = A\,e^{-Z_{\rm eff}(r_1+r_2)/a_0}\), where Z_eff is a variational parameter. Estimate the upper bound to the ground state energy based on this functional form of Ψ.

Our discussion of variational Monte Carlo methods has been only introductory in nature. One important application of variational Monte Carlo methods is to optimize a given trial wave function, which is then used to "guide" the Monte Carlo methods discussed in Sections 16.8 and 16.9.

16.8 Random Walk Solutions of the Schrödinger Equation

We now introduce a Monte Carlo approach based on expressing the Schrödinger equation in imaginary time. This approach follows that of Anderson (see references). We will then discuss several other quantum Monte Carlo methods. We will see that although the systems of interest are quantum mechanical, we can convert them to systems for which we can use classical Monte Carlo methods.

To understand how we can interpret the Schrödinger equation in terms of a random walk in imaginary time, we substitute \(\tau = it/\hbar\) into the time-dependent Schrödinger equation for a free particle and write (in one dimension)
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2}. \tag{16.60}
\]
Note that (16.60) is identical in form to the diffusion equation (16.1). Hence, we can interpret the wave function Ψ as a probability density with a diffusion constant \(D = \hbar^2/2m\). From our discussion in Chapter 7, we know that we can use the formal similarity between the diffusion equation and the imaginary-time free particle Schrödinger equation to solve the latter by replacing it by an equivalent random walk problem.

To understand the role of the potential energy term in the context of random walks, we write Schrödinger's equation in imaginary time as
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2} - V(x)\,\Psi(x,\tau). \tag{16.61}
\]
If we were to ignore the first term (the diffusion term) on the right side of (16.61), the result would be a first-order differential equation corresponding to a decay or growth process, depending on the sign of V. We can obtain the solution to this first-order equation by replacing it by a random decay or growth process, for example, radioactive decay. These considerations suggest that we can interpret (16.61) as a combination of diffusion and branching processes. In the latter, the number of walkers increases or decreases at a point x depending on the sign of V(x). The walkers do not interact with each other because the Schrödinger equation (16.61) is linear in Ψ. Note that it is Ψ∆x, and not Ψ²∆x, that corresponds to the probability distribution of the random walkers. This probabilistic interpretation requires that Ψ be nonnegative and real.

We now use this probabilistic interpretation of (16.61) to develop an algorithm for determining the ground state wave function and energy. The general solution of Schrödinger's equation can be written for imaginary time τ as [see (16.10)]
\[
\Psi(x,\tau) = \sum_n c_n\,\phi_n(x)\,e^{-E_n\tau}. \tag{16.62}
\]
For sufficiently large τ, the dominant term in the sum in (16.62) comes from the term representing the eigenvalue of lowest energy. Hence, we have
\[
\Psi(x,\tau\to\infty) = c_0\,\phi_0(x)\,e^{-E_0\tau}. \tag{16.63}
\]
From (16.63) we see that the spatial dependence of Ψ(x,τ→∞) is proportional to the ground state eigenstate φ0(x). We also see that if E0 > 0, then Ψ(x,τ), and hence the population of walkers, will eventually decay to zero.
This problem can be avoided by measuring E0 from an arbitrary reference energy Vref, which is adjusted so that an approximately steady-state distribution of random walkers is obtained. Although we could attempt to fit the τ-dependence of the computed probability distribution of the random walkers to (16.63) and thereby extract E0, it is more convenient to compute E0 directly from the relation
\[
E_0 = \langle V\rangle = \frac{\sum_i n_i\,V(x_i)}{\sum_i n_i}, \tag{16.64}
\]
where n_i is the number of walkers at x_i at time τ. An estimate for E0 can be found by averaging the sum in (16.64) for several values of τ once a steady-state distribution of random walkers has been reached.

To derive (16.64), we rewrite (16.61) and (16.63) by explicitly introducing the reference potential Vref:
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = \frac{\hbar^2}{2m}\,\frac{\partial^2\Psi(x,\tau)}{\partial x^2} - \big[V(x)-V_{\rm ref}\big]\,\Psi(x,\tau), \tag{16.65}
\]
and
\[
\Psi(x,\tau) \approx c_0\,\phi_0(x)\,e^{-(E_0-V_{\rm ref})\tau}. \tag{16.66}
\]
We first integrate (16.65) with respect to x. Because \(\partial\Psi(x,\tau)/\partial x\) vanishes in the limit \(|x|\to\infty\), \(\int(\partial^2\Psi/\partial x^2)\,dx = 0\), and hence
\[
\int\frac{\partial\Psi(x,\tau)}{\partial\tau}\,dx = -\int V(x)\,\Psi(x,\tau)\,dx + V_{\rm ref}\int\Psi(x,\tau)\,dx. \tag{16.67}
\]
If we differentiate (16.66) with respect to τ, we obtain the relation
\[
\frac{\partial\Psi(x,\tau)}{\partial\tau} = (V_{\rm ref}-E_0)\,\Psi(x,\tau). \tag{16.68}
\]
We then substitute (16.68) for \(\partial\Psi/\partial\tau\) into (16.67) and find
\[
(V_{\rm ref}-E_0)\int\Psi(x,\tau)\,dx = -\int V(x)\,\Psi(x,\tau)\,dx + V_{\rm ref}\int\Psi(x,\tau)\,dx. \tag{16.69}
\]
If we cancel the terms proportional to Vref in (16.69), we find that
\[
E_0\int\Psi(x,\tau)\,dx = \int V(x)\,\Psi(x,\tau)\,dx, \tag{16.70}
\]
or
\[
E_0 = \frac{\int V(x)\,\Psi(x,\tau)\,dx}{\int\Psi(x,\tau)\,dx}. \tag{16.71}
\]
The desired result (16.64) follows by making the connection between Ψ(x)∆x and the density of walkers between x and x + ∆x.

Although the derivation of (16.64) is somewhat involved, the random walk algorithm is straightforward. A simple implementation of the algorithm is as follows:

1. Place a total of N0 walkers at the initial set of positions x_i, where the x_i need not be on a grid.

2. Compute the reference energy \(V_{\rm ref} = \sum_i V_i/N_0\).

3. Randomly move the first walker to the right or left by a fixed step length ∆s. The step length ∆s is related to the time step ∆τ by (∆s)² = 2D∆τ. (D = 1/2 in units such that \(\hbar = m = 1\).)

4. Compute ∆V = V(x) − Vref and a random number r in the unit interval. If ∆V > 0 and r < ∆V∆τ, remove the walker. If ∆V < 0 and r < −∆V∆τ, add another walker at x. Otherwise, leave the walker at x. This procedure is accurate only in the limit ∆τ ≪ 1. A more accurate procedure consists of computing \(P_b = e^{-\Delta V\Delta\tau} - 1 = n + f\), where n is the integer part of P_b and f is its fractional part. We then make n copies of the walker, and if f > r, we make one more copy.

5. Repeat steps 3 and 4 for each of the N0 walkers and compute the mean potential energy (16.71) and the actual number of random walkers. The new reference potential is given by
\[
V_{\rm ref} = \langle V\rangle - \frac{a}{N_0\,\Delta\tau}\,(N-N_0), \tag{16.72}
\]
where N is the new number of random walkers and ⟨V⟩ is their mean potential energy. The average of ⟨V⟩ is an estimate of the ground state energy. The parameter a is adjusted so that the number of random walkers N remains approximately constant.

6. Repeat steps 3–5 until the estimates of the ground state energy ⟨V⟩ have reached a steady-state value with only random fluctuations. Average ⟨V⟩ over many Monte Carlo steps to compute the ground state energy. Do a similar calculation to estimate the distribution of random walkers.

The QMWalk class implements this algorithm for the harmonic oscillator potential. Initially, the walkers are randomly distributed within a distance initialWidth of the origin.
The program also estimates the ground state wave function by accumulating the spatial distribution of the walkers at discrete intervals of position. The input parameters are the desired number of walkers N0, the number of position intervals numberOfBins used to accumulate the ground state wave function, and the step size ds. We also use ds for the interval size in the wave function computation. The program computes the current number of walkers, the estimate of the ground state energy, and the value of Vref. The unnormalized ground state wave function is also plotted.

Listing 16.10: The QMWalk class calculates the ground state of the simple harmonic oscillator using the random walk Monte Carlo algorithm.

package org.opensourcephysics.sip.ch16;

public class QMWalk {
  int numberOfBins = 1000; // number of intervals for the wave function
  double[] x;              // positions of walkers
  double[] phi0;           // estimate of ground state wave function
  double[] xv;             // x values for computing phi0
  int N0;                  // desired number of walkers
  int N;                   // actual number of walkers
  double ds;               // step size
  double dt;               // time interval
  double vave = 0;         // mean potential energy of the walkers
  double vref = 0;         // reference potential
  double eAccum = 0;       // accumulation of energy values
  double xmin;             // minimum x
  int mcs;

  public void initialize() {
    N0 = N;
    x = new double[2*numberOfBins]; // extra room for walkers that are added
    phi0 = new double[numberOfBins];
    xv = new double[numberOfBins];
    xmin = -ds*numberOfBins/2.0;    // minimum location for computing phi0
    double binEdge = xmin;
    for(int i = 0; i<numberOfBins; i++) {
      xv[i] = binEdge;
      binEdge += ds;
    }
    dt = ds*ds;                     // (ds)^2 = 2 D dt with D = 1/2
    mcs = 0;
  }

  void walk() {
    double vsum = 0;
    for(int i = N-1; i>=0; i--) {   // loop backward so walkers added or relabeled
                                    // during this sweep are not processed twice
      if(Math.random()<0.5) {       // move walker
        x[i] += ds;
      } else {
        x[i] -= ds;
      }
      double pot = potential(x[i]);
      double dv = pot-vref;
      vsum += pot;
      if(dv<0) {                    // decide whether to add a walker at x[i]
        if((N<x.length)&&(Math.random()<-dv*dt)) {
          x[N] = x[i];              // copy walker into the next unused slot
          vsum += pot;              // add energy of new walker
          N++;
        }
      } else {                      // decide whether to delete this walker
        if(Math.random()<dv*dt) {
          N--;
          x[i] = x[N];              // relabel last walker to deleted walker index
          vsum -= pot;              // subtract energy of deleted walker
        }
      }
    }
    vave = (N==0) ? 0 : vsum/N;     // if there are no walkers, potential = 0
    vref = vave-(N-N0)/(N0*dt);     // adjust reference energy, (16.72) with a = 1
    mcs++;
  }

  void doMCS() {
    walk();
    eAccum += vave;                 // accumulate the energy estimate
    for(int i = 0; i<N; i++) {      // bin the walker positions to estimate phi0
      int bin = (int) Math.floor((x[i]-xmin)/ds);
      if((bin>=0)&&(bin<numberOfBins)) {
        phi0[bin]++;
      }
    }
  }

  double potential(double x) {
    return 0.5*x*x;                 // simple harmonic oscillator
  }
}

Consider the two-dimensional potential given in (16.74), where r² = x² + y². Modify QMWalkApp by using Cartesian coordinates in two dimensions. For example, add an array to store the y-coordinates of the walkers. What happens if you begin with an initial distribution of walkers that is not cylindrically symmetric?

16.9 Diffusion Quantum Monte Carlo

We now discuss an improvement of the random walk algorithm known as diffusion quantum Monte Carlo. Although some parts of the discussion might be difficult to follow initially, the algorithm is straightforward. Your understanding of the method will be enhanced by writing a program to implement the algorithm and then reading the following derivation again.

To provide some background, we introduce the concept of a Green's function or propagator defined by
\[
\Psi(x,\tau) = \int G(x,x',\tau)\,\Psi(x',0)\,dx'. \tag{16.75}
\]
From the form of (16.75) we see that G(x,x′,τ) "propagates" the wave function from time zero to time τ. If we operate on both sides of (16.75) first with ∂/∂τ and then with (\(\hat H - V_{\rm ref}\)), we can verify that G satisfies the equation
\[
\frac{\partial G}{\partial\tau} = -(\hat H - V_{\rm ref})\,G, \tag{16.76}
\]
which has the same form as the imaginary-time Schrödinger equation (16.65). It is easy to verify that G(x,x′,τ) = G(x′,x,τ).
A formal solution of (16.76) is
\[
G(\tau) = e^{-(\hat H - V_{\rm ref})\tau}, \tag{16.77}
\]
where the meaning of the exponential of an operator is given by its Taylor series expansion. The difficulty with (16.77) is that the kinetic and potential energy operators \(\hat T\) and \(\hat V\) in \(\hat H\) do not commute. For this reason, if we want to write the exponential in (16.77) as a product of two exponentials, we can only approximate the exponential for short times ∆τ. To first order in ∆τ (higher-order terms involve the commutator of \(\hat V\) and \(\hat H\)), we have
\[
G(\Delta\tau) \approx G_{\rm branch}\,G_{\rm diffusion} \tag{16.78}
\]
\[
= e^{-(\hat V - V_{\rm ref})\Delta\tau}\,e^{-\hat T\Delta\tau}, \tag{16.79}
\]
where \(G_{\rm diffusion} \equiv e^{-\hat T\Delta\tau}\) and \(G_{\rm branch} \equiv e^{-(\hat V - V_{\rm ref})\Delta\tau}\) correspond to the two random processes, diffusion and branching. From (16.76) we see that G_diffusion and G_branch satisfy the differential equations
\[
\frac{\partial G_{\rm diffusion}}{\partial\tau} = -\hat T\,G_{\rm diffusion} = \frac{\hbar^2}{2m}\,\frac{\partial^2 G_{\rm diffusion}}{\partial x^2} \tag{16.80}
\]
\[
\frac{\partial G_{\rm branch}}{\partial\tau} = (V_{\rm ref} - \hat V)\,G_{\rm branch}. \tag{16.81}
\]
The solutions to (16.80) and (16.81) that are symmetric in x and x′ are
\[
G_{\rm diffusion}(x,x',\Delta\tau) = (4\pi D\Delta\tau)^{-1/2}\,e^{-(x-x')^2/4D\Delta\tau}, \tag{16.82}
\]
with \(D \equiv \hbar^2/2m\), and
\[
G_{\rm branch}(x,x',\Delta\tau) = e^{-\left(\frac{1}{2}[V(x)+V(x')] - V_{\rm ref}\right)\Delta\tau}. \tag{16.83}
\]
From the form of (16.82) and (16.83), we can see that the diffusion quantum Monte Carlo method is similar to the random walk algorithm discussed in Section 16.8. An implementation of the diffusion quantum Monte Carlo method in one dimension can be summarized as follows:

1. Begin with a set of N0 random walkers. There is no lattice, so the positions of the walkers are continuous. It is advantageous to choose the walkers so that they are in regions of space where the wave function is known to be large.

2. Choose one of the walkers and displace it from x to x′. The new position is chosen from a Gaussian distribution with variance 2D∆τ and zero mean. This change corresponds to the diffusion process given by (16.82).

3. Weight the configuration x′ by
\[
w(x\to x',\Delta\tau) = e^{-\left(\frac{1}{2}[V(x)+V(x')] - V_{\rm ref}\right)\Delta\tau}. \tag{16.84}
\]
One way to do this weighting is to generate duplicate random walkers at x′. For example, if w ≈ 2, we would have two walkers at x′ where previously there had been one. To implement this weighting (branching) correctly, we must make an integer number of copies that is equal on average to the number w. A simple way to do so is to take the integer part of w + r, where r is a uniform random number in the unit interval. The number of copies can be any nonnegative integer, including zero. The latter value corresponds to the removal of a walker.

4. Repeat steps 2 and 3 for all members of the ensemble, thereby creating a new ensemble at a later time ∆τ. One iteration of the ensemble is equivalent to performing the integration
\[
\Psi(x,\tau) = \int G(x,x',\Delta\tau)\,\Psi(x',\tau-\Delta\tau)\,dx'. \tag{16.85}
\]

5. The quantity of interest, Ψ(x,τ), will be independent of the original ensemble Ψ(x,0) if a sufficient number of Monte Carlo steps are taken. As before, we must ensure that N(τ), the number of walkers at time τ, is kept close to the desired number N0.
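The following minimal sketch (ours, not the text's program) implements one iteration of steps 2 and 3 for the entire ensemble, in units hbar = m = 1 so that D = 1/2, with the harmonic oscillator potential as an example.

import java.util.ArrayList;
import java.util.Random;

public class DQMCSweep {
  static double D = 0.5; // D = hbar^2/2m with hbar = m = 1

  static double potential(double x) {
    return 0.5*x*x; // harmonic oscillator as an example
  }

  // One sweep over the ensemble: diffusion (step 2) plus branching (step 3).
  static ArrayList<Double> sweep(ArrayList<Double> walkers, double dtau,
                                 double vref, Random rng) {
    ArrayList<Double> next = new ArrayList<Double>();
    for(double xOld : walkers) {
      double xNew = xOld+Math.sqrt(2*D*dtau)*rng.nextGaussian(); // Gaussian move
      double w = Math.exp(-(0.5*(potential(xOld)+potential(xNew))-vref)*dtau);
      int copies = (int) (w+rng.nextDouble()); // on average w copies; 0 deletes
      for(int c = 0; c<copies; c++) {
        next.add(xNew);
      }
    }
    return next;
  }
}

After each sweep the reference energy vref would be adjusted, as in (16.72), to keep the ensemble size near N0.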
Now we can understand how the simple random walk algorithm discussed in Section 16.8 is an approximation to the diffusion quantum Monte Carlo algorithm. First, the Gaussian distribution gives the exact distribution for the displacement of a random walker in a time ∆τ, in contrast to the fixed step size of the simple random walk algorithm, which gives only the average displacement of a walker. Hence, there are no systematic errors due to the finite step size. Second, if we expand the exponential in (16.83) to first order in ∆τ and set V(x) = V(x′), we obtain the branching rule used previously. (We use the fact that the uniform distribution of r is the same as the distribution of 1 − r.) However, the diffusion quantum Monte Carlo algorithm is not exact because the branching is independent of the position reached by diffusion, which is true only in the limit ∆τ → 0. This limitation is remedied in the Green's function Monte Carlo method, where a short time approximation is not made (see the articles on Green's function Monte Carlo in the references).

One limitation of the two random walk methods we have discussed is that they can become very inefficient. This inefficiency is due in part to the branching process. If the potential becomes large and negative (as it is for the Coulomb potential when an electron approaches a nucleus), the number of copies of a walker can become very large. It is possible to improve the efficiency of these algorithms by introducing an importance sampling method. The idea is to use an initial guess ΨT(x) for the wave function to guide the walkers toward the more important regions of V(x). To implement this idea, we introduce the function f(x,τ) = Ψ(x,τ)ΨT(x). If we calculate the quantity \(\partial f/\partial\tau - D\,\partial^2 f/\partial x^2\) and use (16.65), we can show that f(x,τ) satisfies the differential equation
\[
\frac{\partial f}{\partial\tau} = D\,\frac{\partial^2 f}{\partial x^2} - D\,\frac{\partial\,[f\,F(x)]}{\partial x} - \big[E_L(x) - V_{\rm ref}\big]\,f, \tag{16.86}
\]
where
\[
F(x) = \frac{2}{\Psi_T}\,\frac{\partial\Psi_T}{\partial x}, \tag{16.87}
\]
and the local energy E_L(x) is given by
\[
E_L(x) = \frac{\hat H\Psi_T}{\Psi_T} = V(x) - \frac{D}{\Psi_T}\,\frac{\partial^2\Psi_T}{\partial x^2}. \tag{16.88}
\]
The term in (16.86) containing F corresponds to a drift of the walkers away from regions where |ΨT|² is small (see Problem 7.43). To incorporate the drift term into G_diffusion, we replace (x − x′)² in (16.82) by the term \([x - x' - D\Delta\tau F(x')]^2\), so that the diffusion propagator becomes
\[
G_{\rm diffusion}(x,x',\Delta\tau) = (4\pi D\Delta\tau)^{-1/2}\,e^{-\left(x-x'-D\Delta\tau F(x')\right)^2/4D\Delta\tau}. \tag{16.89}
\]
However, this replacement destroys the symmetry between x and x′. To restore it, we use the Metropolis algorithm for accepting the new position of a walker. The acceptance probability p is given by
\[
p = \frac{|\Psi_T(x')|^2\,G_{\rm diffusion}(x,x',\Delta\tau)}{|\Psi_T(x)|^2\,G_{\rm diffusion}(x',x,\Delta\tau)}. \tag{16.90}
\]
If p > 1, we accept the move; otherwise, we accept the move if r ≤ p. The branching step is achieved by using (16.83) with V(x) + V(x′) replaced by E_L(x) + E_L(x′) and ∆τ replaced by an effective time step. The reason for the use of an effective time step in (16.83) is that some diffusion steps are rejected. The effective time step to be used in (16.83) is found by multiplying ∆τ by the average acceptance probability. It can be shown (see Hammond et al.) that the mean value of the local energy is an unbiased estimator of the ground state energy.

Another possible improvement is to periodically replace branching (which changes the number of walkers) with a weighting of the walkers. At each weighting step, each walker is weighted by G_branch, and the total number of walkers remains constant. After n steps, the kth walker receives a weight
\[
W_k = \prod_{i=1}^{n} G_{\rm branch}^{(i,k)},
\]
where \(G_{\rm branch}^{(i,k)}\) is the branching factor of the kth walker at the ith time step. The contribution of the kth walker to any average quantity is weighted by W_k.
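As an illustration (ours, not from the text), the following sketch implements the drifted proposal (16.89) and the acceptance probability (16.90) for the trial function ΨT = e^(−λx²), for which (16.87) gives F(x) = −4λx. Units are hbar = m = 1.

public class DriftedMove {
  static double D = 0.5, lambda = 0.5, dtau = 0.01;
  static java.util.Random rng = new java.util.Random();

  static double F(double x) {
    return -4*lambda*x;                  // quantum force, eq. (16.87)
  }

  static double psiT2(double x) {
    return Math.exp(-2*lambda*x*x);      // |Psi_T|^2
  }

  // Log of the drifted Gaussian kernel (16.89), up to the common normalization.
  static double logG(double x, double xp) {
    double d = x-xp-D*dtau*F(xp);
    return -d*d/(4*D*dtau);
  }

  // Propose a drifted move from x and accept or reject it via (16.90).
  static double step(double x) {
    double xp = x+D*dtau*F(x)+Math.sqrt(2*D*dtau)*rng.nextGaussian();
    double p = psiT2(xp)/psiT2(x)*Math.exp(logG(x, xp)-logG(xp, x));
    return (p>=1 || rng.nextDouble()<=p) ? xp : x;
  }
}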
Problem 16.29. Diffusion quantum Monte Carlo

(a) Modify QMWalkApp to implement the diffusion quantum Monte Carlo method for the systems considered in Problems 16.26 and 16.27. Begin with N0 = 100 walkers and ∆τ = 0.01. Use at least three values of ∆τ and extrapolate your results to ∆τ → 0. Reasonable results can be obtained by adjusting the reference energy every 20 Monte Carlo steps with a = 0.1.

(b) Write a program to apply the diffusion quantum Monte Carlo method to the hydrogen atom. In this case a configuration is represented by three coordinates.

(c)* Modify your program to include weights in addition to changing walker populations. Redo part (a) and compare your results.

*Problem 16.30. Importance sampling

(a) Derive the partial differential equation (16.86) for f(x,τ).

(b) Modify QMWalkApp to implement the diffusion quantum Monte Carlo method with importance sampling. Consider the harmonic oscillator problem with the trial wave function ΨT = e^(−λx²). Compute the statistical error associated with the ground state energy as a function of λ. How much variance reduction can you achieve relative to the naive diffusion quantum Monte Carlo method? Then consider a form of ΨT that is not identical to the exact ground state. Try the hydrogen atom with ΨT = e^(−λr).

16.10 Path Integral Quantum Monte Carlo

The Monte Carlo methods we have discussed so far are primarily useful for estimating the ground state energy and wave function, although it is also possible to find the first few excited states with some effort. In this section we discuss a Monte Carlo method that is of particular interest for computing the thermal properties of quantum systems.

We recall (see Section 7.10) that classical mechanics can be formulated in terms of the principle of least action. That is, given two points in space-time, a classical particle chooses the path that minimizes the action given by
\[
S = \int_{x_0,0}^{x,t} L\,dt. \tag{16.91}
\]
The Lagrangian L is given by L = T − V. Quantum mechanics also can be formulated in terms of the action (cf. Feynman and Hibbs). The result of this path integral formalism is that the real-time propagator G can be expressed as
\[
G(x,x_0,t) = A\sum_{\rm paths} e^{iS/\hbar}, \tag{16.92}
\]
where A is a normalization factor. The sum in (16.92) is over all paths between (x0,0) and (x,t), not just the path that minimizes the classical action. The presence of the imaginary number i in (16.92) leads to interference effects. As before, the propagator G(x,x0,t) can be interpreted as the probability amplitude for a particle to be at x at time t given that it was at x0 at time zero. G satisfies the equation [see (16.75)]
\[
\Psi(x,t) = \int G(x,x_0,t)\,\Psi(x_0,0)\,dx_0 \qquad (t>0). \tag{16.93}
\]
Because G satisfies the same differential equation as Ψ in both x and x0, G can be expressed as
\[
G(x,x_0,t) = \sum_n \phi_n(x)\,\phi_n(x_0)\,e^{-iE_nt/\hbar}, \tag{16.94}
\]
where the φn are the eigenstates of H. For simplicity, we set \(\hbar = 1\) in the following. As before, we substitute τ = it into (16.94) and obtain
\[
G(x,x_0,\tau) = \sum_n \phi_n(x)\,\phi_n(x_0)\,e^{-\tau E_n}. \tag{16.95}
\]
We first consider the ground state. In the limit τ → ∞, we have
\[
G(x,x,\tau) \to \phi_0(x)^2\,e^{-\tau E_0} \qquad (\tau\to\infty). \tag{16.96}
\]
From the form of (16.96) and (16.92), we see that we need to compute G, and hence S, to estimate the properties of the ground state. To compute S, we convert the integral in (16.91) to a sum. The Lagrangian for a single particle of unit mass in terms of τ becomes
\[
L = -\frac{1}{2}\Big(\frac{dx}{d\tau}\Big)^2 - V(x) = -E. \tag{16.97}
\]
We divide the imaginary time interval τ into N equal steps of size ∆τ and write E as
\[
E(x_j,\tau_j) = \frac{1}{2}\,\frac{(x_{j+1}-x_j)^2}{(\Delta\tau)^2} + V(x_j), \tag{16.98}
\]
where τj = j∆τ, and xj is the corresponding displacement. Because dt = −i dτ and L = −E, the action becomes
\[
S = i\,\Delta\tau\sum_{j=0}^{N-1}E(x_j,\tau_j) = i\,\Delta\tau\sum_{j=0}^{N-1}\Big[\frac{1}{2}\,\frac{(x_{j+1}-x_j)^2}{(\Delta\tau)^2} + V(x_j)\Big], \tag{16.99}
\]
and the probability amplitude for the path becomes
\[
e^{iS} = e^{-\Delta\tau\sum_{j=0}^{N-1}\left[\frac{1}{2}(x_{j+1}-x_j)^2/(\Delta\tau)^2 + V(x_j)\right]}. \tag{16.100}
\]
Hence, the propagator G(x,x0,N∆τ) can be expressed as
\[
G(x,x_0,N\Delta\tau) = A\int dx_1\cdots dx_{N-1}\;e^{-\Delta\tau\sum_{j=0}^{N-1}\left[\frac{1}{2}(x_{j+1}-x_j)^2/(\Delta\tau)^2 + V(x_j)\right]}, \tag{16.101}
\]
where x ≡ xN and A is an unimportant constant. From (16.101) we see that G(x,x0,N∆τ) has been expressed as a multidimensional integral with the displacement variable xj associated with the time τj. The sequence x0, x1, ..., xN defines a possible path, and the integral in (16.101) is over all paths. Because the quantity of interest is G(x,x,N∆τ) [see (16.96)], we adopt the periodic boundary condition xN = x0. The choice of x in the argument of G is arbitrary for finding the ground state energy, and the use of periodic boundary conditions implies that no point in the closed path is unique. It is thus possible (and convenient) to rewrite (16.101) by letting the sum over j go from 1 to N:
\[
G(x_0,x_0,N\Delta\tau) = A\int dx_1\cdots dx_{N-1}\;e^{-\Delta\tau\sum_{j=1}^{N}\left[\frac{1}{2}(x_j-x_{j-1})^2/(\Delta\tau)^2 + V(x_j)\right]}, \tag{16.102}
\]
where we have written x0 instead of x because the xj that is not integrated over is xN = x0.

The result of this analysis is to convert a quantum mechanical problem for a single particle into a statistical mechanics problem for N "atoms" on a ring connected by nearest neighbor "springs" with spring constant 1/(∆τ)². The label j denotes the order of the atoms in the ring. Note that the form of (16.102) is similar to the form of the Boltzmann distribution. Because the partition function for a single quantum mechanical particle contains terms of the form \(e^{-\beta E_n}\), and (16.95) contains terms proportional to \(e^{-\tau E_n}\), we make the correspondence β = τ = N∆τ. We shall see in the following how we can use this identity to simulate a quantum system at a finite temperature.

We can use the Metropolis algorithm to simulate the motion of N "atoms" on a ring. Of course, these atoms are a product of our analysis, just as were the random walkers we introduced in diffusion Monte Carlo, and should not be confused with real particles. A possible path integral algorithm can be summarized as follows:

1. Choose N and ∆τ such that N∆τ ≫ 1 (the zero temperature limit). Also choose δ, the maximum trial change in the displacement of an atom, and mcs, the total number of Monte Carlo steps per atom.

2. Choose an initial configuration for the displacements xj that is close to the approximate shape of the ground state probability amplitude.

3. Choose an atom j at random and make a trial displacement \(x_j \to x_j' = x_j + (2r-1)\delta\), where r is a uniform random number in the unit interval. Compute the change ∆E in the energy E, where
\[
\Delta E = \frac{1}{2}\Big(\frac{x_{j+1}-x_j'}{\Delta\tau}\Big)^2 + \frac{1}{2}\Big(\frac{x_j'-x_{j-1}}{\Delta\tau}\Big)^2 + V(x_j') - \frac{1}{2}\Big(\frac{x_{j+1}-x_j}{\Delta\tau}\Big)^2 - \frac{1}{2}\Big(\frac{x_j-x_{j-1}}{\Delta\tau}\Big)^2 - V(x_j). \tag{16.103}
\]
If ∆E < 0, accept the change; otherwise, compute the probability \(p = e^{-\Delta\tau\,\Delta E}\) and a random number r in the unit interval. If r ≤ p, accept the move; otherwise reject the trial move. (A sketch of this trial move follows the list.)

4. Divide the possible x values into equal size bins of width ∆x. Update P(x); that is, let P(x = xj) → P(x = xj) + 1, where xj is the displacement of the atom chosen in step 3 after step 3 is completed. Do this update even if the trial move was rejected.

5. Repeat steps 3 and 4 until a sufficient number of Monte Carlo steps per atom has been obtained. (Do not take data until the memory of the initial path is lost and the system has reached "equilibrium.") Normalize the probability density P(x) by dividing by the product of N and mcs.
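The following minimal sketch (ours, not the text's program) implements the trial move of step 3 for a ring of N atoms with periodic boundary conditions, using the harmonic oscillator potential as an example.

public class PathMove {
  static java.util.Random rng = new java.util.Random();

  static double V(double x) {
    return 0.5*x*x; // harmonic oscillator as an example
  }

  // Attempt to displace atom j of the ring path x; returns true if accepted.
  static boolean tryMove(double[] x, int j, double delta, double dtau) {
    int N = x.length;
    double xTrial = x[j]+(2*rng.nextDouble()-1)*delta;
    double xm = x[(j-1+N)%N], xp = x[(j+1)%N]; // ring neighbors
    double kTrial = 0.5*((xp-xTrial)*(xp-xTrial)+(xTrial-xm)*(xTrial-xm))/(dtau*dtau);
    double kOld = 0.5*((xp-x[j])*(xp-x[j])+(x[j]-xm)*(x[j]-xm))/(dtau*dtau);
    double dE = kTrial+V(xTrial)-kOld-V(x[j]); // energy change, eq. (16.103)
    if(dE<0 || rng.nextDouble()<=Math.exp(-dtau*dE)) {
      x[j] = xTrial; // accept the trial displacement
      return true;
    }
    return false;    // reject; the old displacement is kept
  }
}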
The ground state energy E0 is given by
\[
E_0 = \sum_x P(x)\,[T(x)+V(x)], \tag{16.104}
\]
where T(x) is the kinetic energy as determined from the virial theorem,
\[
2\,T(x) = x\,\frac{dV}{dx}, \tag{16.105}
\]
which is discussed in many texts (see Griffiths, for example). It is also possible to compute T from averages over (xj − xj−1)², but the virial theorem yields a smaller variance. The ground state wave function φ0(x) is obtained from the normalized probability P(x)∆x by dividing by ∆x and taking the square root.
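The following fragment (ours, not from the text) illustrates the estimator (16.104) for the harmonic oscillator, for which the virial theorem (16.105) gives T(x) = x²/2; the array P holds the accumulated, not yet normalized, histogram of displacements.

public class PathEnergy {
  // Estimate E0 from the histogram P of displacements using (16.104); for
  // V(x) = x^2/2, x dV/dx = x^2, so T(x) = x^2/2 and T(x)+V(x) = x^2.
  static double groundStateEnergy(double[] P, double[] binCenters) {
    double norm = 0, e0 = 0;
    for(int b = 0; b<P.length; b++) {
      double x = binCenters[b];
      e0 += P[b]*x*x; // P(x) [T(x)+V(x)] for the harmonic oscillator
      norm += P[b];
    }
    return e0/norm;   // dividing by the total count normalizes P(x)
  }
}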
We can also find the thermodynamic properties of a particle that is connected to a heat bath at temperature T = 1/β by not taking the β = N∆τ → ∞ limit. To obtain the ground state, which corresponds to the zero temperature limit (β ≫ 1), we have to make N∆τ as large as possible. However, we need ∆τ to be as small as possible to approximate the continuum time limit. Hence, to obtain the ground state we need a large number of time intervals N. For a finite temperature simulation, we can use smaller values of N for the same level of accuracy as the zero temperature simulation.

The path integral method is very flexible and can be generalized to higher dimensions and many mutually interacting particles. For three dimensions, xj is replaced by the three-dimensional displacement rj. Each real particle is represented by a ring of N "atoms" with a spring-like potential connecting the atoms within a ring. Each atom in each ring also interacts with the atoms in the other rings through an interparticle potential. If the quantum system is a fluid where indistinguishability is important, then we must consider the effect of exchange by treating the quantum system as a classical polymer system where the "atoms" represent the monomers of a polymer, and where polymers can split up and reform. Chandler and Wolynes discuss how the quantum mechanical effects due to exchanging identical particles can be associated with the chemical equilibrium of the polymers. They also discuss Bose condensation using path integral techniques.

Problem 16.31. Path integral calculation

(a) Write a program to implement the path integral algorithm for the one-dimensional harmonic oscillator potential with V(x) = x²/2. Use the structure of your Monte Carlo Lennard–Jones program from Chapter 15 as a guide.

(b) Let N∆τ = 15 and consider N = 10, 20, 40, and 80. Equilibrate for at least 2000 Monte Carlo steps per atom and average over at least 5000 mcs. Compare your results with the exact ground state energy E0 = 0.5. Estimate the equilibration time for your calculation. What is a good initial configuration? Improve your results by using larger values of N∆τ.

(c) Find the mean energy E of the harmonic oscillator at the temperature T determined by β = N∆τ. Find E for β = 1, 2, and 3 and compare it with the exact result \(E = \frac{1}{2}\coth(\beta/2)\).

(d) Repeat the above calculations for the Morse potential V(x) = 2(1 − e^(−x))².

16.11 Projects

Many of the techniques described in this chapter can be extended to two-dimensional quantum systems. The Complex2DFrame tool in the frames package is designed to show two-dimensional complex scalar fields such as quantum wave functions. Listing 16.13 in Appendix 16A shows how this class is used to display a two-dimensional Gaussian wave packet with a momentum boost.

Project 16.32. Separable systems in two dimensions

The shooting method is inappropriate for the calculation of eigenstates and eigenvalues in two or more dimensions with arbitrary potential energy functions V(r). However, the special case of separable potentials can be reduced to several one-dimensional problems that can be solved using the numerical methods described in this chapter. Many molecular modeling programs use the Hartree–Fock self-consistent field approximation to model nonseparable systems as a set of one-dimensional problems. Recently, there has been significant progress motivated by a molecular dynamics algorithm developed by Car and Parrinello.

Write a two-dimensional eigenstate class Eigenstate2d that calculates eigenstates and eigenvalues for a separable potential of the form
\[
V(x,y) = V_1(x) + V_2(y). \tag{16.106}
\]
Test this class using the known analytic solutions for the two-dimensional rectangular box and the two-dimensional harmonic oscillator. Use this class to model the evolution of superposition states. Under what conditions are there wave function revivals?

Project 16.33. Excited state wave functions using quantum Monte Carlo

Quantum Monte Carlo methods can be extended to compute excited state wave functions using a Gram–Schmidt procedure to ensure that each excited state is orthogonal to all lower lying states (see Roy et al.). A quantum Monte Carlo method is used to compute the ground state wave function. A trial wave function for the first excited state is then selected, and the ground state component is subtracted from the trial wave function. This subtraction is repeated after every iteration of the Monte Carlo algorithm. Because the excited states decay as \(e^{-(E_j-E_0)\tau}\), the lowest remaining excited state dominates the remaining wave function. After the first excited state is obtained, the second excited state is computed by subtracting both known states from the trial wave function. This process is repeated to obtain additional wave functions.

Implement this procedure to find the first few excited state wave functions for the one-dimensional harmonic oscillator. Then consider the one-dimensional double-well oscillator
\[
V(x) = -\frac{1}{2}kx^2 + a_3x^3 + a_4x^4, \tag{16.107}
\]
with k = 40, a3 = 1, and a4 = 1.

Project 16.34. Quantum Monte Carlo in two dimensions

The procedure described in Project 16.33 can be used to compute two-dimensional wave functions (see Roy et al.).

(a) Test your program using a separable two-dimensional double-well potential.

(b) Find the first few excited states for the two-dimensional double-well potential
\[
V(x,y) = -\frac{1}{2}k_xx^2 - \frac{1}{2}k_yy^2 + \frac{1}{2}\big(a_{xx}x^4 + 2a_{xy}x^2y^2 + a_{yy}y^4\big), \tag{16.108}
\]
with kx = ky = 20 and axx = ayy = axy = 5. Repeat with kx = ky = 20 and axx = ayy = axy = 1.

Project 16.35. Evolution of a wave packet in two dimensions

Both the half-step and split-operator algorithms can be extended to model the evolution of two-dimensional systems with arbitrary potentials V(x,y). (See Numerical Recipes for how the FFT algorithm is extended to more dimensions.) Implement either algorithm and model a wave packet scattering from a central barrier and a wave packet passing through a double slit. A clever way to ensure stability in the half-step algorithm is to use a boolean array to tag grid locations where the solution becomes unstable and to set the wave function to zero at these grid points:

double minV = -2/dt;
double maxVx = 2/dt-2/(dx*dx);
double maxVy = 2/dt-2/(dy*dy);
double maxV = Math.min(maxVx, maxVy);
for(int i = 0, n = potential.length; i<n; i++) {
  for(int j = 0, m = potential[0].length; j<m; j++) {
    if(potential[i][j]>=minV && potential[i][j]<=maxV) {
      stable[i][j] = true;  // stable
    } else {
      stable[i][j] = false; // unstable; wave function set to zero here
    }
  }
}
Project 16.36. Two-particle system

Rubin Landau has studied the time dependence of two particles interacting in one dimension with a potential that depends on their relative separation:
\[
V(x_1,x_2) = V_0\,e^{-(x_1-x_2)^2/2\alpha^2}. \tag{16.109}
\]
Model a scattering experiment for particles having momenta p1 and p2 by assuming the following (unnormalized) initial wave function:
\[
\Psi(x_1,x_2) = e^{ip_1x_1}e^{-(x_1-a)^2/4w^2}\,e^{ip_2x_2}e^{-(x_2+a)^2/4w^2}, \tag{16.110}
\]
where 2a is the initial separation and w is the variance of each particle's position. Do the particles bounce off each other when the interaction is repulsive? What happens when the interaction is attractive?

Figure 16.2: Two representations of complex wave functions: (a) real and imaginary parts, and (b) amplitude and phase. (The actual output is in color.)

Appendix 16A: Visualizing Complex Functions

Complex functions are essential in quantum mechanics, and the frames package contains classes for displaying and analyzing these functions. Listing 16.12 uses a ComplexPlotFrame to display a one-dimensional wave function.

Listing 16.12: The ComplexPlotFrameApp class displays a one-dimensional Gaussian wave packet with a momentum boost.

package org.opensourcephysics.sip.ch16;
import org.opensourcephysics.frames.ComplexPlotFrame;

public class ComplexPlotFrameApp {
  public static void main(String[] args) {
    ComplexPlotFrame frame = new ComplexPlotFrame("x", "Psi(x)", "Complex function");
    int n = 128;
    double xmin = -Math.PI, xmax = Math.PI;
    double x = xmin, dx = (xmax-xmin)/n;
    double[] xdata = new double[n];
    double[] zdata = new double[2*n]; // real and imaginary values alternate
    int mode = 4; // test function is e^(-x*x/4) e^(i*mode*x) for x = [-pi, pi)
    for(int i = 0; i<n; i++) {
      xdata[i] = x;
      double amplitude = Math.exp(-x*x/4);
      zdata[2*i] = amplitude*Math.cos(mode*x);   // real part
      zdata[2*i+1] = amplitude*Math.sin(mode*x); // imaginary part
      x += dx;
    }
    frame.append(xdata, zdata);
    frame.setVisible(true);
  }
}

References and Suggestions for Further Reading

The ALPS project, <http://alps.comp-phys.org>, has open source simulation programs for strongly correlated quantum mechanical systems and C++ libraries for simplifying the development of such code. Although most of the code is beyond the level of this text, this open source project is another example of software for use in both research and education.

J. B. Anderson, “A random walk simulation of the Schrödinger equation: H3+,” J. Chem. Phys. 63, 1499–1503 (1975); “Quantum chemistry by random walk. H 2P, H3+ D3h 1A1′, H2 3Σu+, H4 1Σg+, Be 1S,” J. Chem. Phys. 65, 4121–4127 (1976); “Quantum chemistry by random walk: Higher accuracy,” J. Chem. Phys. 73, 3897–3899 (1980). These papers describe the random walk method, extensions for improved accuracy, and applications to simple molecules.

G. Baym, Lectures on Quantum Mechanics, Westview Press (1990). A discussion of the Schrödinger equation in imaginary time is given in Chapter 3.

M. A. Belloni, W. Christian, and A. Cox, Physlet Quantum Physics, Prentice Hall (2006). This book contains interactive exercises using Java applets to visualize quantum phenomena.

H. A. Bethe, Intermediate Quantum Mechanics, Westview Press (1997). Applications of quantum mechanics to atomic systems are discussed.

Jay S. Bolemon, “Computer solutions to a realistic ‘one-dimensional’ Schrödinger equation,” Am. J. Phys. 40, 1511 (1972).
Siegmund Brandt and Hans Dieter Dahmen, The Picture Book of Quantum Mechanics, third edition, Springer-Verlag (2001); Siegmund Brandt, Hans Dieter Dahmen, and Tilo Stroh, Interactive Quantum Mechanics, Springer-Verlag (2003). These books show computer generated pictures of quantum wave functions in different contexts.

R. Car and M. Parrinello, “Unified approach for molecular dynamics and density-functional theory,” Phys. Rev. Lett. 55, 2471 (1985).

David M. Ceperley, “Path integrals in the theory of condensed helium,” Rev. Mod. Phys. 67, 279–355 (1995).

David M. Ceperley and Berni J. Alder, “Quantum Monte Carlo,” Science 231, 555 (1986). A survey of some of the applications of quantum Monte Carlo methods to physics and chemistry.

David Chandler and Peter G. Wolynes, “Exploiting the isomorphism between quantum theory and classical statistical mechanics of polyatomic fluids,” J. Chem. Phys. 74, 4078–4095 (1981). The authors use path integral techniques to look at multiparticle quantum systems.

D. F. Coker and R. O. Watts, “Quantum simulation of systems with nodal surfaces,” Mol. Phys. 58, 1113–1123 (1986).

Jim Doll and David L. Freeman, “Monte Carlo methods in chemistry,” Computing in Science and Engineering 1 (1), 22–32 (1994).

Robert M. Eisberg and Robert Resnick, Quantum Physics, second edition, John Wiley & Sons (1985). See Appendix G for a discussion of the numerical solution of Schrödinger’s equation.

R. P. Feynman, “Simulating physics with computers,” Int. J. Theor. Phys. 21, 467–488 (1982). A provocative discussion of the intrinsic difficulties of simulating quantum systems. See also R. P. Feynman, Feynman Lectures on Computation, Westview Press (1996).

Richard P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw–Hill (1965).

David J. Griffiths, Introduction to Quantum Mechanics, second edition, Prentice Hall (2005). An excellent undergraduate text that discusses the virial theorem in several problems.

B. L. Hammond, W. A. Lester Jr., and P. J. Reynolds, Monte Carlo Methods in Ab Initio Quantum Chemistry, World Scientific (1994). An excellent book on quantum Monte Carlo methods.

Steven E. Koonin and Dawn C. Meredith, Computational Physics, Addison–Wesley (1990). Solutions of the time-dependent Schrödinger equation are discussed in the context of parabolic partial differential equations in Chapter 7. Chapter 8 discusses Green’s function Monte Carlo methods.

Rubin Landau, “Two-particle Schrödinger equation animations of wavepacket-wavepacket scattering,” Am. J. Phys. 68 (12), 1113–1119 (2000).

Michel Le Bellac, Fabrice Mortessagne, and G. George Batrouni, Equilibrium and Non-Equilibrium Statistical Thermodynamics, Cambridge University Press (2004). Chapter 7 of this graduate level text discusses the world line algorithm for bosons and fermions on a lattice.

M. A. Lee and K. E. Schmidt, “Green’s function Monte Carlo,” Computers in Physics 6 (2), 192 (1992). A short and clear explanation of Green’s function Monte Carlo.

P. K. MacKeown, “Evaluation of Feynman path integrals by Monte Carlo methods,” Am. J. Phys. 53, 880–885 (1985). The author discusses projects suitable for an advanced undergraduate course. Also see P. K. MacKeown and D. J. Newman, Computational Techniques in Physics, Adam Hilger (1987).

Jean Potvin, “Computational quantum field theory. Part II: Lattice gauge theory,” Computers in Physics 8, 170 (1994); and “Computational quantum field theory,” Computers in Physics 7, 149 (1993).
William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery, Numerical Recipes, second edition, Cambridge University Press (1992). The numerical solution of the time-dependent Schrödinger equation is discussed in Chapter 19.

Peter J. Reynolds, David M. Ceperley, Berni J. Alder, and William A. Lester Jr., “Fixed-node quantum Monte Carlo for molecules,” J. Chem. Phys. 77, 5593–5603 (1982). This paper describes a random walk algorithm for use in molecular applications, including importance sampling and the treatment of Fermi statistics.

P. J. Reynolds, J. Tobochnik, and H. Gould, “Diffusion quantum Monte Carlo,” Computers in Physics 4 (6), 882 (1990).

U. Rothlisberger, “15 years of Car–Parrinello simulations in physics, chemistry and biology,” in Computational Chemistry: Reviews of Current Trends, edited by Jerzy Leszczynski, World Scientific (2001), Vol. 6.

Amlan K. Roy, Neetu Gupta, and B. M. Deb, “Time-dependent quantum mechanical calculation of ground and excited states of anharmonic and double-well oscillators,” Phys. Rev. A 65, 012109-1–7 (2001).

Amlan K. Roy, Ajit J. Thakkar, and B. M. Deb, “Low-lying states of two-dimensional double-well potentials,” J. Phys. A 38, 2189–2199 (2005).

K. E. Schmidt, Parhat Niyaz, A. Vaught, and Michael A. Lee, “Green’s function Monte Carlo method with exact imaginary-time propagation,” Phys. Rev. E 71, 016707-1–17 (2005).

Bernd Thaller, Visual Quantum Mechanics: Selected Topics with Computer-Generated Animations of Quantum-Mechanical Phenomena, Telos (2000); Bernd Thaller, Advanced Visual Quantum Mechanics, Springer (2005).

J. Tobochnik, H. Gould, and K. Mulder, “An introduction to quantum Monte Carlo,” Computers in Physics 4 (4), 431 (1990). An explanation of the path integral method applied to one particle.

P. B. Visscher, “A fast explicit algorithm for the time-dependent Schrödinger equation,” Computers in Physics 5 (6), 596 (1991).

Chapter 17 Visualization and Rigid Body Dynamics

We study affine transformations in order to visualize objects in three dimensions. We then solve Euler's equations of motion for rigid body dynamics using the quaternion representation of rotations.

17.1 Two-Dimensional Transformations

Physicists frequently use transformations to convert from one system of coordinates to another. A very common transformation is an affine transformation, which has the ability to rotate, scale, stretch, skew, and translate an object. Such a transformation maps straight lines to straight lines. Affine transformations are often represented using matrices and are manipulated using the tools of linear algebra, such as matrix multiplication and matrix inversion.

The Java 2D API defines a set of classes designed to create high quality graphics using image composition, image processing, antialiasing, and text layout. Because linear algebra and affine transformations are used extensively in imaging and drawing APIs, we begin our study of two- and three-dimensional visualization techniques by studying the properties of transformations.

It is straightforward to rotate a point (x,y) about the origin by an angle θ (see Figure 17.1) or to scale its distance from the origin by (sx,sy) using matrices:
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \tag{17.1}
\]
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}. \tag{17.2}
\]
Performing several transformations corresponds to multiplying matrices. However, the translation of the point (x,y) by (dx,dy) is treated as an addition, not a multiplication, and must be written differently:
\[
\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} d_x \\ d_y \end{pmatrix} + \begin{pmatrix} x \\ y \end{pmatrix}. \tag{17.3}
\]
Figure 17.1: A two-dimensional rotation of a point (x,y) produces a point with new coordinates (x′,y′) as computed according to (17.1).

This inconsistency in the type of mathematical operation is easily overcome if points are expressed in terms of homogeneous coordinates by adding a third coordinate w. Homogeneous coordinates are used extensively in computer graphics to treat all transformations consistently. Instead of representing a point in two dimensions by a pair of numbers (x,y), each point is represented by a triple (x,y,w). Because two homogeneous coordinates represent the same point if one is a multiple of the other, we usually homogenize the point by dividing the x–y coordinates by w and write the coordinates in the form (x,y,1). (The w coordinate can be used to add perspective; see Foley et al.) Using homogeneous coordinates, an arbitrary affine transformation can be written as
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}. \tag{17.4}
\]
A translation, for example, can be expressed as
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & d_x \\ 0 & 1 & d_y \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}. \tag{17.5}
\]

Exercise 17.1. Homogeneous coordinates

(a) How are the rotation and scaling transformations expressed in matrix notation using homogeneous coordinates? Sketch the transformation matrices for a 30° clockwise rotation and for a scaling along the x-axis by a factor of two, and then write the transformation matrices. Do these matrices commute?

(b) Describe the effect of the affine transformation
\[
\begin{pmatrix} 1 & 0.2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{17.6}
\]

Exercise 17.1 shows that a coordinate transformation can be broken into parts using a block matrix format:
\[
\begin{pmatrix} \mathcal{A} & \mathbf{d}^T \\ \mathbf{0} & 1 \end{pmatrix}. \tag{17.7}
\]
We will use boldface for row vectors such as 0 and d and calligraphic symbols to represent matrices. The translation vector d is transposed to convert it to a column vector. The upper left-hand submatrix \(\mathcal{A}\) produces rotation and scaling, while the vector d = [dx, dy] produces translation.

Homogeneous coordinates have another advantage because they can be used to distinguish between points and vectors. Unlike points, vectors should remain invariant under translation. To transform vectors using the same transformation matrices that we use to transform points, we set w to zero, thereby removing the effect of the last column. Note that the difference between two homogeneous points produces a w equal to zero. The elimination of the effect of translation makes sense because the difference between two points is a displacement vector, and vectors are defined in terms of their components, not their location.
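As a check (ours, not from the text) that the matrix form (17.5) agrees with the Java 2D API introduced below, the following sketch applies the homogeneous translation matrix to a point by hand and compares the result with AffineTransform's translate instance.

import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class TranslateCheck {
  public static void main(String[] args) {
    double dx = 2, dy = -1;
    double[] p = {3, 4, 1}; // homogeneous point (x, y, w = 1)
    double[][] m = {{1, 0, dx}, {0, 1, dy}, {0, 0, 1}}; // eq. (17.5)
    double[] q = new double[3];
    for(int i = 0; i<3; i++) {
      for(int j = 0; j<3; j++) {
        q[i] += m[i][j]*p[j]; // matrix-vector product
      }
    }
    Point2D pt = AffineTransform.getTranslateInstance(dx, dy)
                                .transform(new Point2D.Double(3, 4), null);
    System.out.println(q[0]+", "+q[1]+"  vs  "+pt.getX()+", "+pt.getY());
  }
}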
The AffineTransform class in the java.awt.geom package defines two-dimensional affine transformations. Instances of this class are constructed as

AffineTransform at = new AffineTransform(
    double m00, double m10, double m01,
    double m11, double m02, double m12);

The methods in this class encapsulate most of the matrix arithmetic that is required for two-dimensional visualization. For example, there are methods to calculate a transformation's inverse and to combine transformations using the rules of matrix multiplication. There are also static methods for constructing pure rotations, scalings, and translations that require only one or two parameters.

double theta = Math.PI/6;
AffineTransform at = AffineTransform.getRotateInstance(theta);

A method such as getRotateInstance is known as a convenience method because it simplifies a complicated API. The AffineTransform class can transform geometric objects, images, and even text. The following code fragment shows how this class is used to rotate a point and a rectangle.

Point2D pt = new Point2D.Double(2.0, 3.0);
pt = AffineTransform.getRotateInstance(Math.PI/3).transform(pt, null);

Shape shape = new Rectangle2D.Double(50, 50, 100, 150);
shape = AffineTransform.getRotateInstance(Math.PI/3).createTransformedShape(shape);

The Affine2DApp class in Listing 17.1 demonstrates affine transformations by applying them to a rectangle.

Listing 17.1: The Affine2DApp class.

package org.opensourcephysics.sip.ch17;
import java.awt.*;
import java.awt.geom.*;
import org.opensourcephysics.controls.*;
import org.opensourcephysics.display.*;
import org.opensourcephysics.frames.*;

public class Affine2DApp extends AbstractCalculation {
  DisplayFrame frame = new DisplayFrame("2D Affine transformation");
  RectShape rect = new RectShape();

  public void calculate() {
    double[][] matrix = new double[3][]; // allocate 3 rows but not the row elements
    matrix[0] = (double[]) control.getObject("row 0"); // set the first row
    matrix[1] = (double[]) control.getObject("row 1"); // set the second row
    matrix[2] = (double[]) control.getObject("row 2"); // set the third row
    rect.transform(matrix);
  }

  public void reset() {
    control.clearMessages();
    control.setValue("row 0", new double[] {1, 0, 0});
    control.setValue("row 1", new double[] {0, 1, 0});
    control.setValue("row 2", new double[] {0, 0, 1});
    rect = new RectShape();
    frame.clearDrawables();
    frame.addDrawable(rect);
    calculate();
  }

  public static void main(String[] args) {
    CalculationControl.createApp(new Affine2DApp());
  }

  class RectShape implements Drawable { // inner class
    Shape shape = new Rectangle2D.Double(50, 50, 100, 100);

    public void draw(DrawingPanel panel, Graphics g) {
      Graphics2D g2 = (Graphics2D) g;
      g2.setPaint(Color.BLUE);
      g2.fill(shape);
      g2.setPaint(Color.RED);
      g2.draw(shape);
    }

    public void transform(double[][] mat) {
      shape = (new AffineTransform(mat[0][0], mat[1][0], mat[0][1],
        mat[1][1], mat[0][2], mat[1][2])).createTransformedShape(shape);
    }
  }
}

Exercise 17.2. Two-dimensional affine transformations

(a) Enter an affine transformation for a 30° clockwise rotation into Affine2DApp. About what point does the rectangle rotate? Why?

(b) Add and test a convenience method named translate to the RectShape class that takes two parameters (dx,dy). Add a custom button to invoke this method.

(c) Add and test a convenience method named rotate to the RectShape class that takes a θ parameter.

(d) An object can be rotated about its center by first translating the object to the center of rotation, performing the rotation, and then translating the object back to its original position. Implement a method that performs a rotation about the center of the rectangle by invoking a sequence of translate and rotate methods. (A sketch of this composition follows the exercise.)

(e) Affine transformations have the property that transformed parallel lines remain parallel. Demonstrate that this property is plausible by transforming a rectangle using arbitrary values for the transformation matrix.
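The following sketch (ours, not the text's solution to Exercise 17.2d) composes translate, rotate, and translate transformations to rotate a rectangle about its center, and compares the result with the anchored rotation convenience method provided by the API.

import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

public class CenterRotation {
  public static void main(String[] args) {
    Shape shape = new Rectangle2D.Double(50, 50, 100, 100);
    double cx = 100, cy = 100; // center of the rectangle
    double theta = Math.PI/6;
    AffineTransform at = AffineTransform.getTranslateInstance(cx, cy);
    at.rotate(theta);          // right-multiplies, so it acts before the translation
    at.translate(-cx, -cy);    // points are first shifted to the origin
    Shape rotated = at.createTransformedShape(shape);
    // equivalent anchored rotation provided by the API:
    Shape check = AffineTransform.getRotateInstance(theta, cx, cy)
                                 .createTransformedShape(shape);
    System.out.println(rotated.getBounds2D());
    System.out.println(check.getBounds2D()); // agrees up to floating point rounding
  }
}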
To facilitate the creation of simple geometric shapes using world coordinates, the Open Source Physics library defines the DrawableShape and InteractiveShape classes in the display package. These classes define convenience methods to create common drawable shapes whose (x,y) location is their geometric center. These shapes can later be transformed without having to instantiate new objects. The following code fragment shows how these classes are used.

// circle of radius 3 in world units located at (-1, 2)
DrawableShape circle = InteractiveShape.createCircle(-1, 2, 3);
circle.transform(AffineTransform.getShearInstance(1, 2));
frame.addDrawable(circle);
// rectangle of width 2 and height 1 centered at (3, 4)
InteractiveShape rect = InteractiveShape.createRectangle(3, 4, 2, 1);
rect.transform(new AffineTransform(2, 1, 0, 1, 0, 0));
frame.addDrawable(rect);

Because the DrawableShape and InteractiveShape classes are written using the Java 2D API, the objects that they define are fundamentally different from the objects that use the awt API introduced in Section 3.3, because Java 2D shapes can be transformed. In addition, the Java 2D API is not restricted to pixel coordinates, nor is it restricted to solid single-pixel lines.¹

¹See Chapter 4 in the Open Source Physics: A User's Guide with Examples for a more complete discussion of the DrawableShape and InteractiveShape classes.

Exercise 17.3. Open Source Physics shape classes

The Open Source Physics shape classes can be manipulated using a wide variety of linear algebra based tools. Modify the Affine2DApp program so that it instantiates and transforms a rectangular DrawableShape into a trapezoid. Test your program by repeating Exercise 17.2.
The extension of homogeneous coordinates to three dimensions is straightforward. We add a w-coordinate to the spatial coordinates to create a homogeneous point (x, y, z, 1). This point is transformed as

\[
\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix}
=
\begin{pmatrix}
m_{00} & m_{01} & m_{02} & m_{03} \\
m_{10} & m_{11} & m_{12} & m_{13} \\
m_{20} & m_{21} & m_{22} & m_{23} \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{pmatrix} R & \mathbf{d} \\ \mathbf{0}^T & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix},
\tag{17.9}
\]

where the 3 × 3 submatrix R produces the rotation and the column vector d produces the translation.

Although rotations about one of the coordinate axes are easy to derive and can be combined using the rules of linear algebra to produce an arbitrary orientation, the general case of rotation about the origin by an angle θ around an arbitrary axis r̂ can be constructed directly. The strategy is to decompose the vector v into components that are parallel and perpendicular to the direction r̂, as shown in Figure 17.2. The parallel part v∥ does not change, while the perpendicular part v⊥ undergoes a two-dimensional rotation in a plane perpendicular to r̂. The parallel part is the projection of v onto r̂,

\[
\mathbf{v}_{\parallel} = (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}, \tag{17.10}
\]

and the perpendicular part is what remains of v after we subtract the parallel part:

\[
\mathbf{v}_{\perp} = \mathbf{v} - (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}. \tag{17.11}
\]

To calculate the rotation of v⊥, we need two perpendicular basis vectors in the plane of rotation. If we use v⊥ as the first basis vector, then we can take the cross product with r̂ to produce a vector w that is guaranteed to be perpendicular to both v⊥ and r̂:

\[
\mathbf{w} = \hat{\mathbf{r}} \times \mathbf{v}_{\perp} = \hat{\mathbf{r}} \times \mathbf{v}. \tag{17.12}
\]

The rotation of v⊥ is now calculated in terms of this new basis:

\[
\mathbf{v}_{\perp}' = R(\mathbf{v}_{\perp}) = \cos\theta\,\mathbf{v}_{\perp} + \sin\theta\,\mathbf{w}. \tag{17.13}
\]

The final result is the sum of this rotated vector and the parallel part that does not change:

\begin{align}
R(\mathbf{v}) &= R(\mathbf{v}_{\perp}) + \mathbf{v}_{\parallel} \tag{17.14a} \\
&= \cos\theta\,\mathbf{v}_{\perp} + \sin\theta\,\mathbf{w} + \mathbf{v}_{\parallel} \tag{17.14b} \\
&= \cos\theta\,[\mathbf{v} - (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}] + \sin\theta\,(\hat{\mathbf{r}} \times \mathbf{v}) + (\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}}, \tag{17.14c}
\end{align}

or

\[
R(\mathbf{v}) = [1 - \cos\theta]\,(\mathbf{v} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}} + \sin\theta\,(\hat{\mathbf{r}} \times \mathbf{v}) + \cos\theta\,\mathbf{v}. \tag{17.15}
\]

Equation (17.15) is known as the Rodrigues formula and provides a way of constructing rotation matrices in terms of the direction of the axis of rotation r̂ = (r_x, r_y, r_z), the cosine of the rotation angle c = cosθ, and the sine of the rotation angle s = sinθ. If we expand the vector products in (17.15), we obtain the matrix

\[
R =
\begin{pmatrix}
t r_x r_x + c & t r_x r_y - s r_z & t r_x r_z + s r_y \\
t r_x r_y + s r_z & t r_y r_y + c & t r_y r_z - s r_x \\
t r_x r_z - s r_y & t r_y r_z + s r_x & t r_z r_z + c
\end{pmatrix},
\tag{17.16}
\]

where t = 1 − cosθ. Homogeneous coordinates are transformed using

\[
\begin{pmatrix} R & \mathbf{0} \\ \mathbf{0}^T & 1 \end{pmatrix}, \tag{17.17}
\]

where the R submatrix is given in (17.16).
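The Rodrigues formula can also be applied directly to a vector without constructing the matrix (17.16). The following sketch (our own code; the class and method names are illustrative) implements (17.15) term by term:

public class RodriguesSketch {
   // rotate v by theta about the unit axis r; both are length-3 arrays
   public static double[] rotate(double[] v, double[] r, double theta) {
      double c = Math.cos(theta), s = Math.sin(theta);
      double dot = v[0]*r[0] + v[1]*r[1] + v[2]*r[2]; // v . r
      double[] w = {r[1]*v[2] - r[2]*v[1],            // w = r x v
                    r[2]*v[0] - r[0]*v[2],
                    r[0]*v[1] - r[1]*v[0]};
      double[] result = new double[3];
      for(int i = 0; i<3; i++) { // Eq. (17.15), component by component
         result[i] = (1-c)*dot*r[i] + s*w[i] + c*v[i];
      }
      return result;
   }

   public static void main(String[] args) {
      // rotate the x-axis by 90 degrees about the z-axis;
      // the result should be the y-axis
      double[] v = rotate(new double[] {1, 0, 0},
                          new double[] {0, 0, 1}, Math.PI/2);
      System.out.printf("(%.3f, %.3f, %.3f)%n", v[0], v[1], v[2]);
   }
}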
The Rotation3D class constructor (see Listing 17.2) computes the rotation matrix. The direct method uses this matrix to transform a point. Note that the point passed to this method as an argument is copied into a temporary vector and that the point's coordinates are then changed. You will define an inverse method that reverses this operation in Exercise 17.5.

Listing 17.2: The Rotation3D class implements three-dimensional rotations using a matrix representation.

package org.opensourcephysics.sip.ch17;

public class Rotation3D {
   // transformation matrix
   private double[][] mat = new double[4][4];

   public Rotation3D(double theta, double[] axis) {
      double norm = Math.sqrt(axis[0]*axis[0]+axis[1]*axis[1]
                             +axis[2]*axis[2]);
      double x = axis[0]/norm, y = axis[1]/norm, z = axis[2]/norm;
      double c = Math.cos(theta), s = Math.sin(theta);
      double t = 1-c;
      // matrix elements not listed are zero; compare Eq. (17.16)
      mat[0][0] = t*x*x+c;
      mat[0][1] = t*x*y-s*z;
      mat[0][2] = t*x*z+s*y;
      mat[1][0] = t*x*y+s*z;
      mat[1][1] = t*y*y+c;
      mat[1][2] = t*y*z-s*x;
      mat[2][0] = t*x*z-s*y;
      mat[2][1] = t*y*z+s*x;
      mat[2][2] = t*z*z+c;
      mat[3][3] = 1;
   }

   public void direct(double[] point) {
      int n = point.length;
      double[] pt = new double[n];
      System.arraycopy(point, 0, pt, 0, n);
      // multiply the copied coordinates by the rotation matrix,
      // storing the result back in the point
      for(int i = 0; i<n; i++) {
         point[i] = 0;
         for(int j = 0; j<n; j++) {
            point[i] += mat[i][j]*pt[j];
         }
      }
   }
}
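As a quick consistency check of Listing 17.2, the following minimal test harness (our own code, not part of the text, and assuming the direct method as completed above) verifies that a 90° rotation about the z-axis carries a point on the x-axis onto the y-axis:

package org.opensourcephysics.sip.ch17;

public class Rotation3DTest {
   public static void main(String[] args) {
      // 90 degree rotation about the z-axis
      Rotation3D rotation = new Rotation3D(Math.PI/2,
                                           new double[] {0, 0, 1});
      double[] point = {1, 0, 0}; // a point on the x-axis
      rotation.direct(point);     // transforms the point in place
      System.out.printf("(%.3f, %.3f, %.3f)%n",
                        point[0], point[1], point[2]);
      // expected output: (0.000, 1.000, 0.000), a point on the y-axis
   }
}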