more dangerous ideas

January 4th, 2006 by Lawrence David

i thought i’d post another one of those “dangerous ideas” because they’re so darn interesting.

this one is courtesy of steve strogatz: pimp of non-linear dynamical systems and author of probably the most widely-read textbook on the subject.

steve-o sez:

I worry that insight is becoming impossible, at least at the frontiers of mathematics. Even when we’re able to figure out what’s true or false, we’re less and less able to understand why.

An argument along these lines was recently given by Brian Davies in the “Notices of the American Mathematical Society”. He mentions, for example, that the four-color map theorem in topology was proven in 1976 with the help of computers, which exhaustively checked a huge but finite number of possibilities. No human mathematician could ever verify all the intermediate steps in this brutal proof, and even if someone claimed to, should we trust them? To this day, no one has come up with a more elegant, insightful proof. So we’re left in the unsettling position of knowing that the four-color theorem is true but still not knowing why.

Similarly important but unsatisfying proofs have appeared in group theory (in the classification of finite simple groups, roughly akin to the periodic table for chemical elements) and in geometry (in the problem of how to pack spheres so that they fill space most efficiently, a puzzle that goes back to Kepler in the 1500’s and that arises today in coding theory for telecommunications).

In my own field of complex systems theory, Stephen Wolfram has emphasized that there are simple computer programs, known as cellular automata, whose dynamics can be so inscrutable that there’s no way to predict how they’ll behave; the best you can do is simulate them on the computer, sit back, and watch how they unfold. Observation replaces insight. Mathematics becomes a spectator sport.
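
(to make the “simulate and watch” point concrete, here’s a minimal python sketch of one of wolfram’s elementary cellular automata, rule 30. the function names, the grid width, the wrap-around edges, and the number of steps are all my own assumptions for illustration; none of them come from strogatz’s essay.)

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton.

    `cells` is a list of 0/1 values; the edges wrap around (an assumption
    made here to keep the sketch short).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbours form a number 0..7; the matching bit of
        # `rule` gives the cell's next state.
        idx = (left << 2) | (center << 1) | right
        out.append((rule >> idx) & 1)
    return out


def run(width=31, steps=15, rule=30):
    """Start from a single live cell in the middle and record every row."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

running it prints a lopsided triangle of #’s growing from a single cell. the update rule is one line of bit-twiddling, yet as far as i know the only practical way to find out what a given row looks like is to compute all the rows before it.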

i found his worries noteworthy because i’ve noticed similar ones creeping into biologists’ heads – especially computational biologists (i’m supposed to be one of those when i grow up). some people (not me) are predicting that if you take the technology common today (dna sequencers, microarrays, etc) and combine it with the right future breakthrough (say, the ability to measure protein levels on the cellular scale and in high throughput) you’d have enough data to train a black box; this black box would take as input some cellular condition and output a prediction of cellular behavior – things like future gene expression, protein levels, cell growth … the works. obviously, this would be a tremendous breakthrough – it’d be a phenomenal tool for doing things like drug discovery.

people are of course working on stuff like this already, feeding mountains of microarray, protein-protein, mass-spec, and gene sequence data into a bevy of machine learning algorithms, hoping to end up with some kind of program that can make useful predictions of what a biological system will do. (it’s really hard though – biology is one gnarly wombat.)

but, assuming this approach does work, part of me would still be sad. what does this kind of approach teach us about biology? i’m not sure you can argue that just having the capability of predicting what a system will do implies that you’ve learned anything (see here how bill gates figured out how to win at “petals around the rose”). for many scientists (including strogatz it seems), what matters more than obtaining the result is that somehow, along the way, you’ve sated the curiosity that drove you to ask questions in the first place.

(i’d be thrilled if someone came up with a more scientifically “pleasing” method of understanding complex biological systems. i’m not holding my breath though – i’m constantly stunned by how complicated (read: how much math you need to predict) even the simplest biological processes are, such as how the lambda phage decides its fate, or even stochastic treatments of simple biomolecular interactions.)


3 Responses to “more dangerous ideas”

on 05 Jan 2006 at 10:40 pm Alex

I’d say that being able to correctly predict how a system will react is a pretty powerful indicator that you actually know a lot about the system — if you can’t predict how it will react, can you really claim to understand it?

Also, I think that at a certain system size, you need black-box abstractions to be able to effectively think about the system. More specifically: even if you knew absolutely all the molecular interactions in a cell, there’s no way you’d be able to keep them all in your head and/or do anything useful with that knowledge without organizing these interactions into higher-level “modules” of some sort.

So, while I agree that the low-level details are important and interesting, computational/”systems” biologists will need to get comfortable with less detailed models and more abstraction layers.

[Shameless plug for a somewhat-related post: http://alexmallet.blogspot.com/2005/11/lessons-from-ai-maybe.html]

-Alex Mallet.

on 09 Jan 2006 at 1:20 am Lawrence David

Hey Alex – sorry for not replying earlier. It’s amazing how much less time I spend on the internet when I don’t have homework to do. In any case, in response to:

I’d say that being able to correctly predict how a system will react is a pretty powerful indicator that you actually know a lot about the system — if you can’t predict how it will react, can you really claim to understand it?

Agreed – understanding implies predictive capacity. But I still maintain that prediction is not indicative of understanding. For instance, take lambda phage’s lysis/lysogeny switch (I apologize for the clichéd system; I’m really a terrible biologist and my collection of biology stories is pretty limited). If you couple an exhaustive barrage of DNA-damaging experiments with precise measurements of time from lysogeny to lysis, I’m sure any number of statistical analysis techniques will hand you a kick-ass predictive model of how the bacteriophage will proliferate when UV light is shone onto infected bacteria. Nonetheless, if all you wanted was a black box, I don’t think you’d learn that lysis/lysogeny were controlled by two mutually inhibiting genes. And it’s even less probable that you’d find out that there’s some pretty neat biochemistry behind the mutual inhibitory system (both inhibitory proteins can actually bind the same section of DNA!). Perhaps it’s the scientific romantic in me speaking, but I’d be hard-pressed to admit understanding of this system without that kind of mechanistic knowledge.

Also, I think that at a certain system size, you need black-box abstractions to be able to effectively think about the system. More specifically: even if you knew absolutely all the molecular interactions in a cell, there’s no way you’d be able to keep them all in your head and/or do anything useful with that knowledge without organizing these interactions into higher-level “modules” of some sort.

I think you’re completely right – keeping track of every reaction in the cell while trying to predict how a patient responds to an increase in their insulin treatments is intractable as well as unnecessary. I guess what I’m trying to articulate is my distaste for how systems biologists often learn these black boxes; people brute-force these things by stuffing reams and reams of data down a computer’s throat and hoping their machine-learning-fu is strong. In other words, I’d see biology as an intellectually richer pursuit if there were some kind of first principles that we could discern and then use to predict how life will proceed. In the spirit of this utterly improbable expectation, I’ve got a good Dirac quote: “it’s more important to have beauty in one’s equations than it is to have them fit experiment.”

on 13 Jan 2006 at 3:13 pm Alex

I totally agree that it’s more satisfying to actually know the molecular details of what’s going on, and that it gives you a deeper level of understanding than subduing a mountain of data via ML-fu. The problem, as I see it, is one of effective scaling of experimental technique as much as it is of “systems-level” understanding: working out all the biochemical details just takes too damn long, even for unicellular critters. And then you have to do it all over again for the next critter. We don’t have the attention span [or NIH research budget] for that anymore =)


from the desk of stinkpot

a memory repository
