A Likely Story

Is cosmology a science? Is scientific cosmology even possible, given that it is about events so unique and fundamental that no test in any laboratory can truly repeat them? Questions like these pop up often enough, and you can find many good answers to them on sites such as Quora, which I will not repeat here.

For the layman thinker, the difference between truth and lies is simple and clear, and it would be natural to expect the difference between science and non-science to be simple and clear as well. The human brain is a categorizing machine that wants to put everything in its proper place. Unfortunately the demarcation of science versus not-science is not so clear.

Tischbein - Oldenburg

Karl Popper modeled his philosophy of science on the remarkable history of general relativity. In 1916, Albert Einstein published his long-awaited theory, and made sensational predictions, reported in newspapers around the world, that would not be possible to verify until the next total eclipse of the Sun. It was almost like a step in classical aristeia, where the hero loudly and publicly claims what preposterous thing he will do, before going on to achieve exactly that. Popper’s ideas about falsification are based on this rare and dramatic triumph of armchair theory-making, not so much on everyday practical science work.

If we want a philosophy of science that really covers most of what gets published as science these days, what we really need is a philosophy of statistics and probability. Unfortunately, statistics does not have the same appeal as a good story, and is more often blamed for being misleading than lauded as a necessary method towards more certain truths. There is a non-zero probability that some day popularizations of science could be as enthusiastic about P-values, null hypotheses and Bayesians as they are today about black holes, dark energy and exotic matter.

Under the broadest umbrella of scientific endeavors, there are roughly two kinds of approaches. One, like general relativity, looks for things that never change, universal rules that apply in all places and times. These include the ‘laws’ of physics, and the logical-mathematical framework necessary for expressing them (whether that should include the axioms of statistics and probability, if any, is the question).

The other approach is the application of such frameworks, to make observations about how some particular system evolves. For example, how mountains form and erode, how birds migrate, how plagues are transmitted, what is the future of a solar system or galaxy, how climate changes over time, what are the relationships between different phyla in the great tree of life, and so on. Many such fields study uniquely evolved things, such as a particular language or a form of life. In many cases it is not possible or practical to “repeat an experiment” starting from the initial state, which is why it is so important to record and share the raw data, so that it can be analyzed by others.

From the point of view of the theoretical physicists, it is often considered serendipitous that the fundamental laws of physics are discoverable, and even understandable by humans. But it could also be that the laws that we have discovered so far are just approximations that are “good enough” to be usable with the imperfect instruments available to us.

The “luck” of the theorist has been that so many physical systems are dominated by one kind of force, with the other forces weaker by many orders of magnitude. For example, the orbit of the Earth around the Sun is dominated by gravitational forces, while the electromagnetic interactions are insignificant. In another kind of system, for example the semiconducting circuits of a microprocessor, electromagnetism dominates and gravity is insignificant. The dominant physics model depends on the scale and granularity of the system under study (the physical world is not truly scale invariant).
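To put a rough number on “many orders of magnitude”, here is a minimal sketch in Python that compares the electrostatic and the gravitational attraction between a proton and an electron. The choice of particles is my own illustration and the constants are rounded CODATA values typed in by hand. Both forces fall off with the square of the distance, so the separation cancels out of the ratio.

```python
# Rounded CODATA values in SI units, typed in by hand for illustration.
COULOMB_CONSTANT = 8.9875517873681764e9   # N m^2 / C^2
GRAVITATIONAL_CONSTANT = 6.67430e-11      # N m^2 / kg^2
ELEMENTARY_CHARGE = 1.602176634e-19       # C
ELECTRON_MASS = 9.1093837015e-31          # kg
PROTON_MASS = 1.67262192369e-27           # kg


def force_ratio():
    """Electrostatic force divided by gravitational force, proton-electron pair.

    Both forces scale as 1/r^2, so the distance r drops out of the ratio.
    """
    electric = COULOMB_CONSTANT * ELEMENTARY_CHARGE ** 2
    gravitational = GRAVITATIONAL_CONSTANT * ELECTRON_MASS * PROTON_MASS
    return electric / gravitational
```

Calling force_ratio() gives a number on the order of 10^39. Gravity wins at the scale of planets only because large bodies are almost perfectly electrically neutral, so the much stronger force cancels itself out.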

As the experimental side of physics has developed, our measurements have become more precise. When we achieve more reliable decimals in physical measurements, we sometimes need to add new theories, to account for things like unexpected fine structure in spectral lines. The more precision we want from our theories, the more terms we need to add to our equations, making them less simple, further away from a Pythagorean ideal.

The nature of measurement makes statistical methods applicable regardless of whether measurement errors originate from a fundamental randomness, or from a determinism we don’t understand yet. The most eager theorists, keen to unify the different forces, have proposed entire new dimensions, hidden in the decimal dust. But for such theories to be practically useful, they must make predictions that differ, at least statistically, from the assumed distribution of measurement errors.
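As a hedged illustration of what “differ statistically” could mean in the simplest possible case, the sketch below compares two hypothetical predictions against the mean of a handful of invented measurements. It assumes independent Gaussian errors of known size, which is an idealization. The readings, the function name z_score and the two predicted values are all made up for the example and do not come from any experiment.

```python
import math
import statistics


def z_score(measurements, predicted_value, error_sigma):
    """How many standard errors the sample mean lies from a prediction.

    Assumes independent measurements with Gaussian errors of known
    standard deviation error_sigma (an idealization).
    """
    standard_error = error_sigma / math.sqrt(len(measurements))
    return (statistics.fmean(measurements) - predicted_value) / standard_error


# Invented readings of some quantity, in arbitrary units.
READINGS = [10.1, 9.9, 10.2, 9.8, 10.0]
```

With these numbers, z_score(READINGS, 10.0, 0.1) is practically zero, so a theory predicting 10.0 cannot be told apart from measurement noise. A theory predicting 10.5 lands around eleven standard errors away, z_score(READINGS, 10.5, 0.1), and would be hard to blame on measurement error alone.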

Many theorists and philosophers abhor the uncertainty associated with probability and statistics. (Part of this is probably due to personality of each individual, some innate unwillingness to accept uncertainty or risk.) To some extent this can be a good thing, as it drives them to search for patterns behind what first seems random.

But even for philosophers, statistics could be more than just a convenient box labeled ‘miscellaneous’. Like in the Parmenides dialogue, even dirt can have ideal qualities.

Even though statistics is the study of variables and variability, its name comes from the same root as “static”. When statistics talks about what is changeable, it always makes implicit assumptions about what does not change, some ‘other’ that we compare the changes against.

It is often said that statistical correlation does not imply causation, but does cosmic causation even make sense where cosmic time does not exist? Can we really make any statistical assumptions about the distribution of matter and energy in the ‘initial’ state of all that exists, if that includes all of space and time?

One of the things that Einstein was trying to correct, when working on general relativity, was causality, which was considered broken in the 1905 version of relativity, since causes did not always precede their effects, depending on the movement of the observer. General relativity fixed it so that physical events always obey the timeline of any physical observer, but only by introducing the possibility of macroscopic event horizons, and strange geometries of observable spacetime. But the nature of event horizons prevents us from observing any event that could be the primal cause of all existence, since it would be outside of the timeline from our point of view. We can make estimates of the ‘age’ of the Universe, but this is a statistical concept: no physical observer experiences time on the clock that measures that age.

Before Einstein, cosmology did not exist as a science. At most, it was thought that the laws of physics would be enough to account for all the motion in the world, starting from some ‘first mover’ who once pushed everything in the cosmos to action. This kind of mechanistic view of the Universe as a process, entity or event, separate from but subservient to a universal time, is no longer compatible with modern physics. In the current models, continuity of time is broken not only at event horizons, but also at the Planck scales of time and distance. (Continuing the example in Powers of Two, Planck length would be reached in the sixth chessboard down, if gold were not atomic.)

Why is causality so important to us, that we would rather turn the universe into Swiss cheese than part with it? The way we experience time, as a flow, and how we maintain identity in that flow, has a lot to do with it. Stories, using language to form sequences of words, or just as remembered sequences of images, dreams and songs, are deeply embedded into the human psyche. Our very identities as individuals are stories, stories are what make us human, and plausible causes make plausible stories.


Knowledge, Fast and Slow

Ars longa, vita brevis

Due to the shortness of human life, it is impossible for one person to know everything. In modern science, there can be no “renaissance men”, who have deep understanding of all the current fields of scientific knowledge. Where it was possible for Henri Poincaré to master all the mathematics of his time, a hundred years later no one in their right mind would attempt a similar mastery, due to the sheer amount of published research.

A large portion of the hubris of the so-called renaissance men, like Leonardo da Vinci, can be traced to a single source: the books on architecture written by Vitruvius more than a thousand years earlier, rediscovered in 1414 and later widely circulated by a new invention, the printing press. In these books, dedicated to emperor Augustus, Vitruvius describes what kind of education is needed to become an architect: nothing less than enkuklios paideia, universal knowledge of all the arts and crafts.

Of course an architect should understand how a building is going to be used, and how light and sound interact with different building materials. But some of the things that Vitruvius writes are probably meant as indirect flattery to his audience and employer, the first emperor. Augustus would likely have fancied himself “the architect” of the whole Roman Empire, in both the literal and the figurative sense.

Paideia was a core Hellenic tradition: it was how knowledge and skills were kept alive and passed on to future generations. General studies were attended until the age of about 12, after which it was normal to choose your future profession, and start an apprenticeship in it. But it was also not uncommon for some aristo to send their offspring to an enkuklios paideia, a roving apprenticeship. They would spend months, maybe a year at a time learning from the masters of one profession, then move to another place to learn something completely different for a time. A born ruler would anyway not be needing any single profession as such, but some knowledge of all professions would help him rule (or alternatively, human nature being what it is, the burden of tolerating the privileged brats of the idle class must be shared by all (“it takes a village”)).

Chiron instructs young Achilles - Ancient Roman fresco

Over the centuries, enkuklios paideia transformed into the word encyclopedia, which today means a written collection of current knowledge in all disciplines. As human knowledge is being created and corrected at accelerating rates, printed versions are becoming outdated faster than they can be printed and read. Online encyclopedias, something only envisioned by people like Douglas Engelbart half a century ago, have now become a daily feature of life, and most written human knowledge is in principle available anywhere, anytime, as near as the nearest smartphone.

Does that mean that we are all now Vitruvian architects, renaissance geniuses, with working knowledge of all professions? Well no, human life is still too short to read, let alone understand, all of Wikipedia, or keep up with its constant changes. And not everything can be learned by reading or even watching a video; some things can only be learned by doing.

For the purposes of this essay, I am stating that there are roughly two types of knowledge that a human can learn. The first one, let’s call it epistemic knowledge, consists of answers to “what” questions. This is the kind of knowledge that can be looked up or written down fast; for example, the names of people and places, numeric quantities, articles of law. Once discovered, like the end result of a sports match, they can be easily distributed all around the world. But, if they are lost or forgotten, they are lost forever, like all the writings in languages we no longer understand.

The other type of knowledge I will call technical knowledge, consisting of answers to “how” questions. In a sense technical knowledge is any acquired skill that is learned through training, that eventually becomes second nature, something we know how to do without consciously thinking about it. Examples are the skills that all children must learn through trial and error, like walking or speaking. Even something as complex as driving a car can become so automatic that we do it as naturally as walking.

[Sidenote: the naming of the two types here as “epistemic” and “technical” is not arbitrary; the names are based on two Ancient Greek words for knowledge, episteme and techne.]

The division into epistemic and technical knowledge is not any fundamental divide, and many contexts have both epistemic and technical aspects. Sometimes the two even depend on each other, as names depend on language, or writing depends on the alphabet.

Both kinds of knowledge are stored in the brain, and can be lost if the brain is damaged somehow. But whereas an amnesiac can be just told what their name and birthday are, learning to ride a bicycle again cannot be done by just reading a Wikipedia article on the subject. The hardest part of recovering from a brain injury can be having to relearn skills that an adult takes for granted, like walking, eating or speaking.

In contrast to epistemic knowledge, technical knowledge can sometimes be reconstructed after being lost. Even though no documents readable to us have survived from the stone age, we can still rediscover what it may have been like to work with stone tools, through experimental archaeology.

Technical knowledge also exists in many wild animals. Younger members of the pack follow the older ones around, observe what they do and try to imitate them, in a kind of natural apprenticeship. Much has been said about so-called mirror neurons that are thought to be behind this phenomenon, in both humans and animals.

New techniques are not just learned by repetitive training and imitation; entirely new techniques can also be discovered in practice. Usually some competitive drive is present, like in sports. For example, high jump sets its goal in the simplest of terms: jump over this bar without knocking it off. But it took years before someone tried to use something other than the “scissors” technique. Once the superiority of a new jumping technique became evident, everyone started to learn it and improve on it, thus raising the bar for everyone.

New techniques offer significant competitive advantages not only in sports, but also in the struggles between nations and corporations. Since we are so good at imitating and adapting, the strategic advantage of a new technique will eventually be lost, if the adversary is able to observe how it is performed. The high jump takes place in front of all, competitors and judges alike, and everything the athlete does is potentially analyzed by the adverse side. (This does not rule out subterfuge, and the preparatory training can also be kept secret.)

About the time of the industrial revolution, it became apparent that tools and machines can embody useful technical knowledge in a way that is intrinsically hidden from view. Secret techniques that observers cannot imitate even in their imaginations are, to them, indistinguishable from magic. To encourage inventors to disclose new techniques, but still gain temporary competitive advantage in the marketplace, the patent system was established. Since a patent would only get granted if the technique was disclosed, everyone would benefit, and no inventor need take their discoveries to the grave with them, for fear of them being “stolen”. Today international patent agreements cover many countries, and corporations sometimes decide to share patent portfolios, but nations have also been known to classify some technologies secret for strategic military purposes.

Even though technical knowledge is the slow type of knowledge, it is still much easier to learn an existing technique from someone than it was for that someone to invent, discover or develop it in the first place. This fact allows societies to progress, as the fruits of knowledge are shared, kept alive and even developed further. One area where this may not apply so well is in the arena of pure thought, since it mostly happens hidden from view, inside the skull. This could be one reason why philosophy and mathematics have always been associated with steep learning curves. Socrates never believed that philosophy could be passed on by writing books; only dialogue and discussions could be truly instructive, the progress of thought made more explicit thereby. This is also why rhetoric and debate are often considered prerequisites for studying philosophy (though Socrates had not much love for the rhetors of his time either).

Of all the tools that we have developed, digital computers seem the most promising candidates for managing knowledge outside of a living brain. Words, numbers and other data can be encoded as digital information, stored and transported reliably from one medium to another, at faster rates than with any other tool available to us. Most of it can be classified as the first type of knowledge, the kind that can be looked up in a database management system. Are there also analogues of the second type of knowledge in computers?

In traditional computer programming, a program is written, tested and debugged by human programmers, using their technical skills and knowledge and all the tools available to them. These kinds of computer programs are not written just for the compiler; the source code needs to be understood by humans as well, so that they know that it works and how, and can fix it or develop it further if needed. The “blueprint” (i.e. the software parts) of a machine can be finalized even after the hardware has been built and delivered to the customer, but it is still essentially a blueprint designed by a human.

Nowadays it is also possible for some pieces of software to be trained into performing a task, such as recognizing patterns in big data. The development of such software involves a lot of testing, of the trial and error kind, but not algorithmic programming in the traditional sense. Some kind of an adaptive system, for example an artificial neural network, is trained with a set of example input data, guided to imitate the choices that a human (or other entity with the knowledge) made on the same data. The resulting, fully trained state of the adaptive system is not understandable in the same way that a program written by a human is, but since it is all digital structures, it can be copied and distributed just as easily as human-written software.
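As a toy illustration of training by imitation, here is a minimal sketch of a single artificial neuron (a perceptron) that learns to copy the choices of a “teacher”. Everything in it is an assumption made for the example: the teacher rule, the grid of example points and the function names are hypothetical, and real systems use far larger networks and other training methods.

```python
def teacher(x, y):
    """Stand-in for the human (or other entity) whose choices are imitated."""
    return 1 if x + y > 1.0 else 0


def predict(weights, x, y):
    """Output of a single artificial neuron with a hard threshold."""
    w_x, w_y, bias = weights
    return 1 if w_x * x + w_y * y + bias > 0 else 0


def train(examples, max_epochs=1000):
    """Perceptron rule: nudge the weights only when the guess is wrong."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in examples:
            error = label - predict(weights, x, y)
            if error != 0:
                weights[0] += error * x
                weights[1] += error * y
                weights[2] += error
                mistakes += 1
        if mistakes == 0:  # every example is imitated correctly
            break
    return weights


# Example inputs on a coarse grid, skipping points exactly on the boundary.
POINTS = [(i / 4, j / 4) for i in range(5) for j in range(5) if i + j != 4]
EXAMPLES = [((x, y), teacher(x, y)) for x, y in POINTS]

trained = train(EXAMPLES)
frozen_copy = tuple(trained)  # learning is "off": the weights are just data
```

Nothing in the three numbers of frozen_copy reads like the rule “x + y > 1”, yet predict(frozen_copy, x, y) reproduces the teacher's choice on every training example, and the tuple can be copied around like any other data.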

This kind of machine learning has obvious similarities to the slow type of knowledge in animals. The principles are the same as teaching a dog to do a trick, except in machine learning we can just turn the learning mode off when we are done training. And of course, machines are not actively improving their skills, or making new discoveries as competing individuals. (Not yet, at least.)