Listen up pop-science fans, I might be just about to pop one of your bubbles (or maybe not). This one, in fact. The original paper can be found here – it’s open access, and therefore extra awesome. I thought I’d do this before the Discovery Institute get their grubby mitts on it.
The regurgitation of the press release begins as follows:
Ultracold-Resistant Chemical on Titan Could Allow It to Harbor Life
Astrobiologists and planetary scientists have a fairly good idea of which chemicals might indicate the presence of oxygen-breathing, water-based life—that is if it is like us. When it comes to worlds such as Saturn’s moon Titan, however, where temperatures are too cold for aqueous biochemistry, it’s much harder to know which chemicals could signal the existence of hydrocarbon-based life.
Oh, I love pop-science headlines. They always go at least ten steps ahead of the research they’re actually reporting. In their defence, Scientific American do a decent job and don’t oversell it once they hit the third or fourth paragraph, but I want to go a little deeper into the theory because I’m kind of a nerd. I’ll cover some of the core strengths and weaknesses of what they’re doing in this research.
In brief – What the f**k are they doing?
Life on Earth requires some sort of membrane to contain it. We call these cells. You may have heard of them. These are made – as high school biology graduates will know – of phospholipid bi-layers that create a fully encased supramolecular structure. These layers form because phospholipid molecules have parts that are attracted to water and parts that move away from it. Obviously, we pretty much live in water, so there aren’t many options for the parts of the molecule that dislike water – long hydrocarbon chains – and so the molecules form small globules called micelles, where the long chains face inward, protected from water by a shell made of the parts of the molecule that actually like to bind to it. In higher concentrations these start forming membranes, with two layers – the hydrophobic, water-hating, parts of the molecule all turned in and the hydrophilic, water-loving, parts turned out. Eventually, with the right concentration, they form bi-layered cells.
Of course, this all means we need liquid water. The chemistry of these membranes and bi-layers doesn’t work particularly well in other solvents, and certainly not at low temperatures where water and lipid molecules freeze solid. So the question is this: can the same structures form in other solvents, from other molecules, and at temperatures outside the “habitable zone” of the solar system? More specifically, can this happen under the conditions on Titan, where liquid methane acts as the moon’s “water” and simpler organic molecules act as the phospholipids?
The response seems to be that, in principle, the answer is yes.
So it means life is possible?
Yes and no. The theory proposes a way to build the membranes and cells required to contain life – these keep the active metabolic chemicals in high concentration (the original paper mentions this as part of the introduction, it’s all part of the “RNA World” hypothesis for abiogenesis), allowing life to form and evolve. But this is far from the greatest barrier to self-organised and self-replicating life. Even if these hypothetical cells form, they would have to contain some high concentration chemistry – something that would have to be more complex and active than we currently have solid evidence for. The chemical “soup” trapped in there would also have to reach a complexity to start replicating with modification – where evolution can take over and make “life”, as we know it, Jim, almost inevitable. This is a much bigger “if” than the mere formation of membranes, and to be fair even that is still a big “if”.
Even a theoretical proposal would have to import essential chemical properties to a low temperature system with an alkane solvent. This is not impossible, but it is not staggeringly likely either.
“Computational” = “Proceed with caution”
It’s important to keep in mind this current “cells on Titan” research is theoretical – in fact, “hypothetical” might be a closer qualitative description, as it’s a big “if” rather than a solid, well-backed theory. This sort of caveat is often the first to go missing as papers get compressed into press releases, and press releases get compressed into pop-science articles, and articles get compressed to Facebook posts and tweets and meme images and Daily Mail comments. Be under no illusions: this work has been done entirely in a computer, and is just a proposition for now.
I can’t and won’t trash work for being purely computational. I’ve done plenty of my own calculations that have interfaced between real-world chemical observations and their theoretical replication, and I’ve mentioned before the successful results of using a genetic algorithm to predict the existence of unusual chemical structures. However, the work I discussed there by Oganov et al. went a step beyond their computational hypothesis – they put their experimental clout where their mouth was and actually made the substances they predicted. Score one solid goal for science, even if it didn’t “completely overturn all of chemistry” as the press release claimed.
So far, with respect to cell membranes forming on Titan, no empirical data is forthcoming. Is this because someone has tried, failed, and neglected to publish? Is it because conclusively demonstrating that cells don’t form in liquid methane would mean proving a negative? The experiment might not be so straightforward to do – if the hypothesis is solid, there should always be some set of conditions under which it happens, and failure just means you haven’t found them yet. But I expect it will come eventually, particularly if they’ve piqued the interest of parties capable of doing the experimental work. This hypothesis will either sink or swim (in liquid methane, of course) on that basis.
Molecular Dynamics simulations
Computing the properties of molecules is difficult. Computing the properties accurately is even more difficult-er.
Think of it this way – every atom (if you want to treat every atom individually) has to be described by three coordinates of position. Then three coordinates of momentum to give it a direction. Then three components of force acting on it, computed from everything else, that will change its momentum and position. It’s clear that as your system grows, you’ll need more data just to describe it. But then there are the interactions that lead up to the force that will alter its position and momentum. Two points give you one interaction – and this is the only case you can solve perfectly. Three points give you three interactions (consider a triangle). Four points give you six interactions, and five points require modelling ten interactions (draw these out if you don’t believe me) – in general, n points give you n(n − 1)/2 pairwise interactions. Some theoretical models increase their computational costs even more rapidly than that.
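The pair-counting argument above is easy to verify for yourself – a quick sketch in plain Python, just counting unique pairs:

```python
def pair_count(n):
    """Number of unique pairwise interactions between n particles:
    n * (n - 1) / 2, one per pair."""
    return n * (n - 1) // 2

# Matches the triangle/square examples above:
for n in (2, 3, 4, 5):
    print(n, "points ->", pair_count(n), "interactions")
# 2 -> 1, 3 -> 3, 4 -> 6, 5 -> 10
# ...and by 1000 particles you're already at 499,500 pairs.
```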
If you want to describe a very large system, say, a protein, or a layered membrane formed from dozens or hundreds of molecules, you will have thousands upon thousands of interactions to take into account. It stands to reason, then, that the more interactions you have the less complicated your calculations for each one must be. Otherwise you’re talking “age of the universe” time scales for making your calculation. This is where molecular mechanics and molecular dynamics come into play – you take your molecules and you simplify down the possible interactions to the most basic level, then run the simulation that way using assumptions and less intensive calculations.
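To make that simplification concrete, here’s a minimal sketch of the kind of pair term a force field substitutes for the full electronic-structure problem – a Lennard-Jones 12-6 potential summed over all unique pairs. The epsilon and sigma values here are illustrative placeholders, not parameters from OPLS or the paper:

```python
import math

def lj_energy(r, epsilon=1.0, sigma=3.4):
    """Lennard-Jones 12-6 pair energy: a cheap, pre-fitted formula
    standing in for the full quantum mechanical interaction."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def total_energy(positions, epsilon=1.0, sigma=3.4):
    """Sum the pair term over every unique pair of (x, y, z) positions
    - exactly the n(n-1)/2 bookkeeping described above."""
    e = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            e += lj_energy(r, epsilon, sigma)
    return e
```

The entire quantum mechanical mess between two particles is collapsed into one line of arithmetic, which is why you can afford to evaluate it millions of times per simulation.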
In general, this is alright. You can get the basics of what a large number of molecules will try to do just from running such simple calculations, and the OPLS model used in this work is accepted as good enough for the task at hand. So the method is what we’d call “robust” – that is, it’s one of those things where 60% of the time it works 100% of the time.
If you download a neat bit of freeware called Argus Lab (warning: it’s not under active development at the moment and tends to run into trouble on 64-bit machines) you can start playing with your own things in a matter of minutes and do things like show DNA bases binding to each other using molecular mechanics calculations. The exact values you get for the strength of that interaction are dubious-as-all-hell, but hey, from fundamentally simple equations you can predict that DNA works. That’s just cool, right?
But the simple methods are not perfect and foolproof. Often you need to fudge a few of the simulations with real-world data. These methods are known as “semi-empirical” (you can work out the etymology of that at home) and the garbage-in-garbage-out principle holds true for them. Sometimes, even if you do try to fudge it with decent empirical data, you still can’t get a good result. Even trying to work out the properties of water – something you’d think is the most well-studied molecule in existence – is insanely difficult, and actually requires modelling a lot of water molecules because the interactions are that far-reaching. You can’t calculate something like the hydrogen bond strength of H2O just by considering two H2O molecules interacting. So you need to validate the simple model to make sure you aren’t falling foul of this sort of physics trickery.
The main bit of data used to validate the OPLS model in this work is the binding energy between two of their target molecules. The authors compared the predicted energies from the model that made the self-organised layers to an energy taken from a more robust and reliable (a “higher level”) calculation. And this is the part where I need to say “proceed with caution” again, because the data they’re comparing to still isn’t empirical – it’s also derived from a calculation.
Ab initio Calculations
If you scroll down the original open access paper you’ll find the frightening combination of numbers and letters “M062X/aug-cc-pVDZ”. To explain this as quickly as possible, everything before the “/” is the “functional” – this is the theory, as laid out by clever computational people and physicists with a lot of spare time on their hands, that you will use to spit out an energy from your calculation. Everything after is the “basis set”, which are the basic building blocks of the atoms (more specifically, the electrons) that you’ll use to help derive it. There are an astounding number of each, and they are all completely interchangeable (although some combinations are more sensible than others). And each combination will spit out different energies for even the same molecule.
Calculating the binding energy between two molecules is almost comically simple. You set up your molecule and the theory and basis set you want to use to model it, and the calculation spits out an energy value. You then set up two molecules next to each other and the same calculation spits out another energy value. If the latter is less than two lots of the former, the molecules prefer to sit next to each other by that amount of energy.
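In code, that bookkeeping really is a one-liner. The energies below are made-up placeholder numbers in hartrees, not values from the paper:

```python
def binding_energy(e_dimer, e_monomer):
    """Supramolecular binding energy for two identical molecules:
    E(AB) - 2 * E(A). Negative means they prefer to sit together."""
    return e_dimer - 2.0 * e_monomer

e_monomer = -247.40210  # hypothetical single-molecule energy (hartree)
e_dimer = -494.81230    # hypothetical dimer energy (hartree)
print(binding_energy(e_dimer, e_monomer))  # negative -> binding is favourable
```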
There are a few caveats to this, such as basis-set superposition error (BSSE), which is basically the error associated with assuming the “comically simple” approach I just described, but you can correct for that easily enough. Since you didn’t ask: you do this by taking the molecules individually as described above, but giving them access to the atomic orbitals – aka the basis functions – of the other molecule, without actually putting the molecule or the electrons there. You then do some mathematical jiggery-pokery with the resulting combination of energies and arrive at your correction. This is another thing you need to do, or your TAP-IPM will chase you around with a chair.
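As a sketch of that jiggery-pokery (counterpoise correction, with illustrative numbers again rather than anything from the paper): each monomer is recomputed inside the full dimer basis – the “ghost” atoms carry basis functions but no electrons or nuclei – and the interaction energy is taken relative to those:

```python
def counterpoise_interaction(e_dimer, e_a_ghost, e_b_ghost):
    """Counterpoise-corrected interaction energy:
    E(AB in AB basis) - E(A in AB basis) - E(B in AB basis).
    The artificial stabilisation each monomer gains by 'borrowing'
    the other's basis functions cancels out."""
    return e_dimer - e_a_ghost - e_b_ghost

# Hypothetical energies in hartree:
e_dimer = -494.81230
e_a_ghost = -247.40310  # monomer A, computed with B's ghost basis present
e_b_ghost = -247.40305  # monomer B, computed with A's ghost basis present
print(counterpoise_interaction(e_dimer, e_a_ghost, e_b_ghost))
```

Note that in this toy example the corrected interaction comes out weaker than the uncorrected one – that’s the usual direction, because basis borrowing makes uncorrected binding look artificially strong.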
Now, the major trouble with ab initio (from base principles) calculations is that they need to be calibrated. You do this by picking a method that produces reliable results for the work at hand.
And that’s the trick, you have to find the right combination that works. If the theory and basis-set combination you choose replicates an energy that you’ve actually measured (a known quantity) within a few percent, it’s a good bet that it will successfully predict the energy of an unknown if you’re looking at a similar-enough system. A lot of simple organic reactions can be predicted well by the combination labelled “B3LYP/6-31G”, which is about as close as you can get to a “standard” or “default” combination. But B3LYP/6-31G fails miserably for a lot of transition metals and organometallic compounds, which is where you need to start getting creative. If the process you are studying is intra-molecular – i.e., bits are just rearranging, rather than falling off or coming on – then most combinations tend to be much of a muchness. But when you’re talking inter-molecular interactions, particularly the van der Waals or electrostatic interactions between molecules, the right combination is essential. Again, garbage-in-garbage-out.
But you must measure it against something known, otherwise you are shooting in the dark. I once read a paper that proposed a very interesting new twist to a particular catalytic mechanism, something that they claimed had a much lower – and therefore more plausible – energy profile. It looked great. But it turned out they hadn’t actually calibrated/validated it well. If you could even call what they did “validation”. Their supplementary information showed that they had just changed the core electron pseudopotential (an assumption that allows you to ignore all the core electrons around an atom and replace them with a single charge) a few times and concluded “well, we get similar enough answers each time, so it must be right”. You can’t do this, or your TAP-IPM will chase you around with a chair. Again. Your chosen method has to be calibrated against empirical data or the garbage-in-garbage-out principle applies. So, when I replicated this catalytic system with a completely different level of theory and a different basis set (one that was calibrated against empirical energy values derived from some painstaking kinetic experiments), the claimed effect in this paper effectively disappeared.
And this is where I get a little dubious about the reality of micelles forming in liquid methane. The molecular dynamics, and particularly the more detailed conclusions of the paper, rely on an accurate binding energy between two molecules. Without this, you could get any old result. You could stick in a random number for the binding energy and see molecular self-assembly from the simplified molecular dynamics calculations that is wholly unrealistic. The energies associated with that self-assembly may well be off by a huge margin, and when you start plugging them into thermodynamic equations that are raised to the power of these numbers, your errors become far more staggering. I am also dubious about taking a binding energy from just two molecules alone. If you’re talking about large structures such as micelles, I really would like to see some ab initio work done on larger clusters, including tetramers, to see how the molecules start interacting at this higher and more precise level of theory, BSSE-corrected or not. As I touched upon above, in water you need to get to several layers of interacting water molecules to approach experimental accuracy. Is this level of detail needed in this case? It might hurt the hypothesis, but it can only help its reliability.
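To put a number on how savagely those exponentials punish binding-energy errors: at Titan’s roughly 94 K surface temperature, a Boltzmann-type factor exp(−ΔE/RT) turns even a few kJ/mol of error into orders of magnitude. A quick sketch:

```python
import math

R = 8.314       # gas constant, J/(mol K)
T_TITAN = 94.0  # approximate surface temperature of Titan, K

def equilibrium_factor(delta_e_kj_per_mol, temperature=T_TITAN):
    """Boltzmann-type factor exp(-dE/RT): energies enter the
    thermodynamics exponentially, so errors are amplified hard
    at cryogenic temperatures."""
    return math.exp(-delta_e_kj_per_mol * 1000.0 / (R * temperature))

# A 4 kJ/mol overbinding error shifts an equilibrium constant at 94 K
# by a factor of roughly 170; at room temperature (298 K) the same
# error is only a factor of about 5.
print(equilibrium_factor(-4.0))
print(equilibrium_factor(-4.0, 298.0))
```

In other words, an energy error you could shrug off at room temperature becomes a two-orders-of-magnitude problem in a liquid methane ocean.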
I also have to question the use of implicit solvation in their quantum mechanical model – that is, not making the calculation in the presence of actual solvent molecules (almost essential if you’re going to imply that the solvent drives this self-assembly!) but in a polarisable continuum that, let’s be brutally honest about this method, only vaguely represents the idea that there’s a solvent if you squint a bit and squish it about. The binding energy they calculate to configure and validate the model doesn’t just compete against the molecules sitting at infinite separation; it competes against the ability of those molecules to bind to the solvent explicitly. This isn’t always trivial. Solvent molecules can make very specific interactions of their own, so solvent getting in there, breaking up the self-assembled layers and altering their stability, needs to be accounted for much more explicitly than they have done to make the results more robust.
Is all that required for acrylamide and other similar molecules working in methane? Possibly, possibly not. Hopefully the authors have done their background reading to figure that out, and I’m willing to give them the benefit of the doubt given that they’re referencing fairly robust procedures and methods – although these methods are compared against reasonable standards (M062X) rather than a “gold standard” like CCSD(T). It’s reassuring that the OPLS model’s binding energies were within 4 kJ/mol of the ab initio results, suggesting the model has merit, but as I’ve pointed out above, theoretical self-consistency should take a backseat to consistency with experiment, because the former can be fudged so very easily.
Overall, I think this is a pretty cool and promising result. The work by Oganov on sodium chloride stoichiometry that I’ve discussed previously on this blog demonstrates the predictive power of computational chemistry, and this could well do the same. The authors here have demonstrated some excellent potential chemistry that could be going on in liquid methane oceans. However, save the champagne for now. Without comparison of their results and values to experimentally derived ones, and ultimately experimental verification that self-assembly of these molecules actually happens in liquid methane, there is no hard evidence yet that this theory is realistic. Hopefully those experiments are coming soon, so we can see if this holds up. Because if we can build these structures in the lab, and then figure out a reliable way to detect them in the wild on Titan, then even if the question of whether they lower the barrier to life turns out to be irrelevant to exobiology, it will be some really interesting chemistry we’ve found.