Friday, March 26, 2010

Rabbit proofing a wild vegetable patch




Nothing to do with Science. More related to common sense. I went for a walk with a kind soul today who thought it wise to plant vegetable seeds in the random acres of land that our campus is endowed with. It never occurred to him that rabbits would be chewing away at the seedlings in a few weeks' time.




Instead of sending him a book on Gardening for Dummies, it's far easier to disseminate information on the blog.

Well known Bunny repellents:

Wednesday, June 4, 2008

Not mine... but glycosylation anyway!

Instead of writing myself, I'm forwarding a link to another nice blog entry about glycosylation.

Friday, May 9, 2008

Buckyballs and footballs

Before the English invented football, there was the buckyball.





Named after the American architect R. Buckminster Fuller, who designed the geodesic dome with the same fundamental symmetry, C60 is the third major form of pure carbon after diamond (my other favourite carbon) and graphite.

It also happens to be the roundest and most symmetrical molecule known.
In C60, hexagons and pentagons of carbon link together in a coordinated fashion (just like in a football) to form a hollow geodesic dome with bonding strains equally distributed among the 60 carbon atoms. The recognition for its discovery by Kroto, Curl and Smalley came in the form of a Nobel Prize in Chemistry back in 1996.
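The football comparison can be checked with a little counting. The sketch below assumes the standard truncated icosahedron layout of 12 pentagons and 20 hexagons (the post does not give the counts), with every atom shared by three faces and every bond by two. The function name `cage_counts` is made up for illustration.

```python
def cage_counts(pentagons, hexagons):
    """Return (vertices, edges, faces) for a closed cage of pentagons and hexagons.

    Assumes every vertex (atom) is shared by exactly 3 faces and every
    edge (bond) by exactly 2 faces, as in C60 or a classic football.
    """
    faces = pentagons + hexagons
    corners = 5 * pentagons + 6 * hexagons  # corners counted once per face
    vertices = corners // 3                 # each atom sits on 3 faces
    edges = corners // 2                    # each bond borders 2 faces
    return vertices, edges, faces


vertices, edges, faces = cage_counts(pentagons=12, hexagons=20)
euler_characteristic = vertices - edges + faces

print(vertices, edges, faces, euler_characteristic)  # 60 90 32 2
```

The 60 vertices are the 60 carbon atoms, and V - E + F = 2 is what Euler's formula requires of any closed, ball-like cage.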

As it turns out, C60 and its other fullerene cousins (C70, C84, C28 et al) are endowed with extraordinary chemical and physical properties. They react with elements from across the periodic table and with free radicals, which opens the door to polymerisation, and some fullerene compounds doped with alkali metals turn out to be superconductors. American scientists spent the next decade moulding fullerenes into pipes (nanotubes). Meanwhile, their continental counterparts in the IBM laboratory in Zurich built a micro-sized abacus by lining buckyballs up along a multigrooved copper plate, like beads on a string, and then moving the beads with a scanning tunnelling microscope to perform calculations. It is a technology that could pave the way for a better computer chip in the future. Move over Microsoft.


Wednesday, May 7, 2008

Vesicle coats

On to the next chapter in Alberts et al.

Adidas clearly stole their 'Fevernova' logo

from the clathrin triskelion


And football players (or Plato?) the truncated icosahedron ball shape


from the clathrin coat


Go Nature in beating humans in making pretty things :)

Friday, May 2, 2008

Model selection - Information criteria, part II

Now for the hardcore information criteria part :)

The goal is still the same - pick a model to maximize the log-likelihood of the data under the model. This is given by

$$\log p(D \mid M) = \log \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta$$

We can approximate the integral with a Laplace approximation, which is similar in idea to the previous post - the probability mass will be centered around the mode of the distribution. We can fit a normal distribution with the mode as mean, and variance approximated from a Taylor expansion at the mode. The next 2 paragraphs can be skipped if you believe this :)

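For example, to approximate a function $f(\theta)$ that has a mode (and thus a local maximum) at $\hat{\theta}$, we use the 2nd order Taylor expansion of $\log f$:

$$\log f(\theta) \approx \log f(\hat{\theta}) + \frac{1}{2} (\theta - \hat{\theta})^T \left[ \nabla\nabla \log f(\theta) \right]_{\theta = \hat{\theta}} (\theta - \hat{\theta})$$

(the first order term is 0 because of the local maximum)

Taking $A = -\left[ \nabla\nabla \log f(\theta) \right]_{\theta = \hat{\theta}}$ as the negative of the second derivative matrix, we get

$$f(\theta) \approx f(\hat{\theta}) \exp\left( -\frac{1}{2} (\theta - \hat{\theta})^T A\, (\theta - \hat{\theta}) \right)$$

If we are looking for a probability distribution that is proportional to $f(\theta)$, we have $\hat{\theta}$ as the mean, $A^{-1}$ as the covariance matrix, and $|A|^{1/2} / (2\pi)^{d/2}$ as the normalizing coefficient (with $d$ the number of dimensions of $\theta$) - voila!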
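For the skeptics, here is a small numerical sanity check of the Gaussian-fitting trick in one dimension. The test function f(z) = z^n e^(-z) is chosen here purely for illustration (it is not from the derivation above). Its integral over positive z is exactly n!, so the estimate can be compared with the truth. The helper name `laplace_integral` and the finite-difference step `h` are assumptions of this sketch.

```python
import math


def laplace_integral(log_f, mode, h=1e-3):
    """Laplace estimate of the integral of exp(log_f(z)) in one dimension.

    `mode` must be the location of the maximum of log_f.
    """
    # A = negative second derivative of log f at the mode,
    # estimated with a central finite difference.
    second = (log_f(mode + h) - 2.0 * log_f(mode) + log_f(mode - h)) / h**2
    A = -second
    # Area under the fitted (unnormalised) Gaussian: f(mode) * sqrt(2*pi / A).
    return math.exp(log_f(mode)) * math.sqrt(2.0 * math.pi / A)


n = 10


def log_f(z):
    # log of f(z) = z**n * exp(-z); the mode is at z = n.
    return n * math.log(z) - z


approx = laplace_integral(log_f, mode=float(n))
exact = math.factorial(n)  # integral of z**n * exp(-z) over (0, inf)
ratio = approx / exact

print(ratio)  # close to 1 (about 0.992 for n = 10)
```

With this choice of f the Laplace estimate is just Stirling's formula for n!, which is a nice reminder that the approximation gets better as the mass concentrates around the mode.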

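So we can fit a Gaussian to a function - back to information criteria. We'll fit a Gaussian to $p(D \mid \theta, M)\, p(\theta \mid M)$ at the mode $\hat{\theta}$ (the most likely parameter setting), with $A$ the negative of the second derivative matrix of its log at $\hat{\theta}$ and $d$ the number of parameters:

$$\log p(D \mid M) \approx \log p(D \mid \hat{\theta}, M) + \log p(\hat{\theta} \mid M) - \frac{1}{2} \log |A| + \frac{d}{2} \log 2\pi$$

As before, the first term is the fit of the model to the data. The rest of the terms are the complexity penalty. With a wide prior probability for the parameters, the second term is small, and the last term scales with $d$ - the main penalty comes from $-\frac{1}{2} \log |A|$.

To evaluate the determinant of $A$ (the inverse of the covariance matrix), we assume that it has full rank, and that the data points are iid. This means that $A$ is a sum of contributions from the individual data points, and since the data is iid, $A \approx N \hat{A}$, where $N$ is the number of data points and $\hat{A}$ is the contribution of a single data point. So $\log |A| \approx d \log N + \log |\hat{A}|$. Again, the last term is constant, so all in all we have

$$\log p(D \mid M) \approx \log p(D \mid \hat{\theta}, M) - \frac{d}{2} \log N$$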

To recap, we estimated the probability of the data under the model, using the Laplace approximation to fit a Gaussian for the log-likelihood, and used some simplifying assumptions to arrive at the final form.

The end result is pretty much the Bayesian Information Criterion, and it penalizes model complexity more than AIC. Note that the constants in front are not arbitrary, since we never made any simplifications for them, and there's a 2:1 ratio. That should show Matti :)
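To see the heavier penalty in action, here is a minimal sketch that scores two models both ways. Both criteria are written on the same "log-likelihood minus penalty" scale, so higher is better. The function names and the two sets of numbers are hypothetical, invented only to make the point.

```python
import math


def bic_score(log_likelihood, n_params, n_points):
    """log p(D | theta_hat) - (d / 2) * log(N); higher is better."""
    return log_likelihood - 0.5 * n_params * math.log(n_points)


def aic_score(log_likelihood, n_params):
    """AIC on the same scale: log p(D | theta_hat) - d; higher is better."""
    return log_likelihood - n_params


# Hypothetical fits to the same N = 100 data points (numbers are made up).
n_points = 100
models = {
    "simple": {"log_lik": -150.0, "d": 2},
    "complex": {"log_lik": -146.5, "d": 5},
}

bic = {name: bic_score(m["log_lik"], m["d"], n_points) for name, m in models.items()}
aic = {name: aic_score(m["log_lik"], m["d"]) for name, m in models.items()}

best_by_bic = max(bic, key=bic.get)
best_by_aic = max(aic, key=aic.get)

print(best_by_bic, best_by_aic)  # simple complex
```

The per-parameter penalty is (1/2) log N for BIC against 1 for AIC, so BIC is the stricter of the two as soon as N exceeds e^2, which is about 7.4 data points.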

Thursday, May 1, 2008

The Geometry of Nature and Chaos

Long before Benoit Mandelbrot defined fractals, Dutch artist M.C. Escher's geometrical tessellations inspired connections between mathematicians, physicists, artists and crystallographers. To put it simply, fractals are structures that appear self-similar on multiple spatial scales- that is, any piece of it looks like the whole after a change of scale.

Fractals in Nature tend to live in three-dimensional space- requiring three coordinates to specify the location of any point. In specifying an object, we often use two definitions of dimension. The first is the Euclidean dimension (De): the number of coordinates required to specify an object. The second is the topological dimension (Dt): something like a measure of the intrinsic dimension of the object. Consider a thin string: it has a topological dimension of one, but when it is wound up in space, as in a ball, it has a Euclidean dimension of three.

Topology is also referred to as ‘rubber’ geometry since it only deals with the qualitative shape of an object. Take for instance a rubber ball- stretching it can allow it to be deformed into another topologically equivalent object. Therefore, a curve of any shape is actually topologically equivalent to a straight line with a topological dimension of one.

Euclidean and topological dimensions are always integers. But very often mathematicians use the term similarity dimension, which is often fractional. Take a unit Euclidean line, square and cube, each divided into N equal self-similar parts of linear dimension s (the scale factor). For the line, $N s = 1$, so each smaller part has length $s = 1/N$.

For the square, $N s^2 = 1$. Therefore, $s = 1/N^{1/2}$.

As for the cube, $N s^3 = 1$. That means $s = 1/N^{1/3}$.

So say, if an object of unit size contains N self-similar copies of itself of size s, then its similarity dimension Ds is determined by the equation:

$$N s^{D_s} = 1$$

For the Euclidean figures above, Ds = 1 for the line, Ds = 2 for the square and Ds = 3 for the cube. Rewriting the equation to solve for Ds gives:

$$D_s = \frac{\log N}{\log (1/s)}$$

Now we can find the similarity dimension of the Koch curve (or the snowflake). At each observation scale, the curve contains 4 self-similar copies of itself of size 1/3, so

$$D_s = \frac{\log 4}{\log 3} = 1.2618\ldots$$

That means the similarity dimension of a Koch curve is larger than its topological dimension of 1, but smaller than its Euclidean dimension of 2. Since Ds for a Koch curve is larger than that for a line but smaller than that for area, we can conclude that the Koch curve is more than a line but not quite a plane. Wonderfully surreal.
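The formula Ds = log(N) / log(1/s) is easy to play with. A minimal sketch follows; the function name `similarity_dimension` is just an illustrative choice, and the line, square and cube are each cut at scale 1/2.

```python
import math


def similarity_dimension(n_copies, scale):
    """Ds = log(N) / log(1/s) for N self-similar copies of size s."""
    return math.log(n_copies) / math.log(1.0 / scale)


# The Euclidean figures, each cut into pieces of half the linear size.
line = similarity_dimension(2, 1 / 2)    # 2 half-length segments
square = similarity_dimension(4, 1 / 2)  # 4 half-size squares
cube = similarity_dimension(8, 1 / 2)    # 8 half-size cubes

# The Koch curve: 4 copies, each 1/3 of the size.
koch = similarity_dimension(4, 1 / 3)

print(round(line, 4), round(square, 4), round(cube, 4), round(koch, 4))
# 1.0 2.0 3.0 1.2619
```

The same function works for any exactly self-similar shape, as long as you can count the copies N and their scale s.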

Monday, April 28, 2008

Misfolded proteins

There are nice sentences in high school textbooks along the lines of 'misfolded proteins are recognized and degraded'. But in reality, it seems like a tough job to sort these proteins out. There are unfolded proteins, not completely folded proteins, completely folded proteins, and misfolded proteins - how does the cell distinguish which ones deserve to go on?

Anyway, turns out there is a way of dealing with this. Firstly, everything in biology is shape. Shape, shape, shape - John Archer, an old professor, used to stress this a lot. You can recognize when a protein is not done folding because it will display portions that it shouldn't - for example, hydrophobic areas that would be buried in a beta-sheet.

Now how to distinguish between not completely folded and misfolded? This is where sugar tagging comes in. Proteins in the ER carry N-linked oligosaccharides (sugar chains attached to asparagine side chains, not to the N-terminus). Glucosyl transferase proteins recognize the exposed hydrophobic portions of the protein, and add a glucose to the N-linked oligosaccharide. As long as the sugar tag carries that terminal glucose, the protein is recognized by calnexin and it cannot exit the ER. To escape calnexin binding, the bound glucose needs to be cleaved by glucosidase.

Proteins in the ER cycle between being bound by calnexin and having a sugar cleaved, and being recognized by the glucosyl transferase and having a sugar added, until they are completely folded.

This still leaves recognizing misfolded proteins - and apparently the mechanism is similar. Once the protein has spent enough time in the ER without getting completely folded, its sugar tag is trimmed (a mannose is removed by a slow-acting mannosidase), and the trimmed tag is recognized by a chaperone which directs the protein out of the ER into the cytosol for degradation by the proteasome.