5.5. Writing Clean Code
Alice: But now I want to understand how exactly you coded these six
equations in the file test.rb above.
Bob: The line include Math tells Ruby to make the Math module
visible to the program. This is an example of what is called a mixin.
However, this is merely a convenience. We could have left that out,
but then the square root sqrt would not have been recognized; we
would have had to write Math.sqrt instead. It is just more
convenient to mix in Math right at the beginning.
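The difference Bob describes can be tried out in a few lines. This is only a minimal sketch, not part of test.rb; the variable names are made up for illustration.

```ruby
# Without the mixin, the module name has to be spelled out.
root_qualified = Math.sqrt(2.0)

include Math   # mix the Math module into the top-level scope

# With the mixin in place, sqrt can be called directly.
root_short = sqrt(2.0)

puts root_qualified == root_short   # both calls reach the same method
```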
Alice: Then you define the size of the time step and the number of
steps you want to take. It is a good idea to introduce them as
variables at the top of your program, instead of hard coding them below
inside the loops. It will be much easier to change the values only
once at the top, rather than having to inspect the whole program to see
where the values occur.
Bob: You would be surprised how many scientists of the hard-coding
type you can still find!
Alice: That is because most scientists have never been taught how to
include levels of data abstraction in their program, right from the
beginning in the conceptual view of the architecture of a program.
One of the main principles of data abstraction is: no constants except
0 and 1 are allowed in the body of the code. Everything else should
be given a symbolic name.
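As a rough illustration of this principle, here is a toy forward-Euler loop for a particle under constant acceleration. It is not the code in test.rb: the method name integrate, the one-dimensional setup and all numerical values are assumptions chosen only to show the time step and the number of steps being named once and then used by name.

```ruby
# Toy forward-Euler loop for a particle under constant acceleration.
# All arguments are illustrative; none of the values come from test.rb.
def integrate(x, v, a, dt, ns)
  ns.times do
    x += v * dt   # dt appears here ...
    v += a * dt   # ... and here, but its value is set in only one place
  end
  [x, v]
end

dt = 0.01   # time step (assumed value)
ns = 100    # number of steps (assumed value)

x, v = integrate(1.0, 0.0, -1.0, dt, ns)
puts "x = #{x}, v = #{v}"
```

Changing dt or ns at the top changes every place where the loop depends on them, with no literal numbers left behind in the loop body.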
Bob: I'm in principle against principles, but I like the content of what
you just said. As for those many scientists, at least they all use
subroutines.
Alice: Not all! There are still legacy codes being used widely that
jump around through many pages of programs using do loops between blocks
of code that are effectively a poor man's subroutine. One problem is
that scientists view subroutines merely as a convenient way to save time:
if you use the same piece of code in three places in your program, you
save time by writing it only once as a subroutine, and calling it from
three places. But that is the least important reason to use functions
or subroutines in a program.
A much more important reason is the issue of code maintenance. If the
same piece of code occurs in more than one place, it becomes essentially
impossible to maintain the code in a consistent way. Change something
in one place in someone's legacy code, and most likely you don't even
know that it would have to be changed in the other place too. Bugs
appear, for no apparent reason, while you are sure that you did
the right thing; you just didn't know about the other place which has
now become incompatible. I bet that literally tens of person-millennia
have gone down the drain this way, during the last fifty years, chasing
those bugs.
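A small sketch may make the maintenance argument concrete. The helper name squared_length and the sample vectors are hypothetical; the point is only that two call sites share one definition.

```ruby
# Hypothetical helper: the squared length of a vector stored as an array.
def squared_length(vec)
  sum = 0
  vec.each { |c| sum += c * c }
  sum
end

r = [3.0, 4.0]   # sample position
v = [1.0, 2.0]   # sample velocity

# Both call sites use the same definition, so a correction made inside
# squared_length reaches both of them at once.
puts squared_length(r)
puts squared_length(v)
```

Had the loop been pasted in twice instead, a fix applied to one copy would silently leave the other copy unchanged.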
Bob: You mean more than 10,000 person-years?
Alice: I wouldn't be surprised. I would estimate there to be at least
a hundred thousand programmers in the world. Most of them have struggled
for a total of a month in their life, at the very least, chasing bugs that
originated in codes which they tried to update, only to find out that there
were unexpected and typically undocumented side effects. That already
makes more than eight person-millennia worth of effort. And I think that is a vast
underestimate.
Bob: Impressive. I never thought about it quite that way.
Alice: So the name of the game is modularity, and I'm glad to see you
giving the students the right example, in the third and fourth lines of
your integrator.
Bob: This type of what you call modularity at least I agree with. It
would have been easy to write 100.times in the seventh line
instead of ns.times, and change that number by hand later.
I thought about doing that. But when I saw that dt occurs
not once but twice in the body of the do loop, I realized that it
would be all too easy to change the numerical value of one of them, and
not the other.
Alice: With the result of having the position and velocity stepping
forward in time at different speeds -- quite a nightmare, when you
want to debug it. Most likely you will start your debugging on the
assumption that you made a mistake in the physics of gravitational
interaction, or in the mathematical equations, or in the way you solve
them numerically, or just a typo in the way you coded it all. The
last thing you suspect would be that you would update the position
with time step 0.01 and the velocity with time step 0.001, say.
Bob: I know it all too well, from past experience. That's why I've
wised up.
5.6. Where the Work is Done
Alice: Talking about the do loop, I presume it is being traversed
ns times, almost as if you read Ruby like English. How can that
possibly work?
Bob: The period means that times is a method associated with the
class of which the variable ns points to an instance. Since ns
has been initialized with an integer, it now has become an instance of
class Integer. And conveniently, this class has a method times
built in that takes the value of the instance of the class, here the
value of ns, and iterates the following block of code ns times.
As often is the case in Ruby, the explanation of a construction sounds
far more complicated than the way you use it. As you already remarked,
ns.times reads like English. Another example of the principle
of least surprise.
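Bob's explanation can be checked interactively. The following is a minimal sketch; the helper count_iterations and the small value of ns are assumptions made to keep the output short.

```ruby
# Counts how often the block given to ns.times is executed.
def count_iterations(ns)
  count = 0
  ns.times { count += 1 }   # the block runs once per iteration
  count
end

ns = 3                        # small assumed value
puts ns.class                 # Integer on Ruby 2.4 and later
puts ns.respond_to?(:times)   # the class provides the times method
puts count_iterations(ns)
```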
Alice: Within the do loop, you first compute $r^2$
by introducing a variable r2, initializing it to zero, and then
accumulating the result of squaring the value of each component of the
vector $\mathbf{r}$. What you need for the acceleration is
the 3/2 power, so you compute that in the next line.
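The two steps Alice describes might look roughly as follows. This is a sketch under assumptions, not a copy of test.rb: the method name radial_powers and the sample vector are invented, and the 3/2 power is formed as r2 times its square root, which is one of several ways to write it.

```ruby
include Math

# Returns [r2, r3] for a position vector given as an array of components.
def radial_powers(r)
  r2 = 0
  r.each { |c| r2 += c * c }   # accumulate the squared components
  r3 = r2 * sqrt(r2)           # (r^2)^(3/2), i.e. the cube of the distance
  [r2, r3]
end

r2, r3 = radial_powers([3.0, 4.0])
puts r2   # prints 25.0
puts r3   # prints 125.0
```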
Finally, you solve the three pairs of equations I wrote above. The
array method each_index presumably does what it says: it executes the
next block of code once for each possible value of the index of the array?
Bob: Yes. In the case of an array a with two components, those components
are a[0] and a[1]; remember that Ruby array indices start at
zero. For such an array, the block of code following
a.each_index will be executed once for k=0, and once
for k=1.
Alice: And if we had used three-component arrays for positions
and velocities and accelerations, the block would also be executed for
k=2. How neat to see an integrator without any need to remind
the computer ad nauseam to go through an inner loop for k is
1 to 3, or something like that -- just as we saw earlier when we wrote
the I/O routines.
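To see each_index adapt to the length of the array, consider this small sketch. The method name advance! and the sample values are assumptions for illustration; the update has the same form as a position step in a simple integrator.

```ruby
# Advances each component of r by v[k] * dt, whatever the number of components.
def advance!(r, v, dt)
  r.each_index { |k| r[k] += v[k] * dt }
  r
end

p advance!([1.0, 0.0], [0.0, 0.5], 0.5)             # k = 0, 1
p advance!([1.0, 0.0, 0.0], [0.0, 0.5, 2.0], 0.5)   # k = 0, 1, 2
```

The same method body serves both cases, because the range of k is taken from the array itself rather than from a separately maintained dimension count.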
Bob: But we can do even better. Granted, we have avoided the use of an
explicit variable NDIM for the number of dimensions, but there is
still the lingering smell of components hanging in the air, in terms
of the use of k in the last two blocks of code. I wanted to get the
code working quickly, so I didn't think about it too much, but I have
an idea of how to get rid of even k.
Alice: That would be even better. But shall we first check whether
everything works as advertised?