Sunday 30 July 2006

proteins - DNA Replication - Biology

I think that you have a couple of points wrong. Since your question is asked using bacterial terminology, I'll stick to that.



The leading strand, the one that is initiated at the origin of replication, is synthesised by pol III which is a highly processive polymerase, i.e. it keeps on going for long periods, making very long products. In principle a single pol III molecule could produce the entire leading strand copy of the template.



The lagging strand is, as you say, initiated at multiple RNA primers. The enzyme that extends these primers is pol I. As pol I extends a primer it creates an Okazaki fragment (i.e. this is not an alternative name for the primer itself). Eventually the pol I will encounter the 5' end of another RNA primer. At this point the 5'→3' exonuclease activity of the pol I comes into play and removes the primer, replacing it with DNA. The pol I will probably also degrade some of the DNA that has been added to that primer by another pol I, resynthesising it as it goes. This is so-called "nick translation", since as the pol I moves along the template it moves a nick in the new strand along with it. Pol I is not a very processive enzyme, however, and will fall off, leaving the nick to be sealed by DNA ligase.



The leading strand does require a primer, and in most cases this is an RNA laid down by the initiation complex at the origin of replication. In some cases, a protein provides an -OH group for DNA polymerase to initiate at.

Thursday 27 July 2006

human biology - At what age do babies begin to synthesize their own antibodies?

[Figure: graph of the progression of an infant's antibody production]



In the graph above the darker blue line refers to the antibodies the baby receives from the mother in utero, as you mentioned in your question.



As you can see, the red line indicates that babies begin to produce low levels of their own antibodies between 3 and 6 months before birth. However, these are IgM antibodies, immature 'rough draft' versions. They have much lower affinity for antigens than their mature IgG counterparts, which are what people classically think of as antibodies.



Levels of an infant's own IgG start to rise after birth, but they don't reach a reasonable level until the child is roughly 1 year old. Maternal antibodies start to tail off at around 3 months, leaving a period (highlighted in blue on the graph) during which infants are particularly prone to infections.

Monday 24 July 2006

organic chemistry - Saturated vs Unsaturated Fats -- Structure in Relation to Room Temperature State?

In the solid state the individual triacylglycerol molecules interact with each other primarily through van der Waals interactions. These weak bonds between molecules are broken at the solid-liquid transition. The amount of energy needed to disrupt these interactions (which determines the melting point of the fat or oil) is the sum of the energies of all of these bonds. In a saturated fat the acyl chains are able to align closely along their whole length, maximising intermolecular interactions. This effect is reflected in the fact that the melting temperature of a pure triacylglycerol increases as the chain length increases.



You can see this effect clearly in the melting temperatures of individual fatty acids (C18:0 means an 18-carbon molecule with zero double bonds in the acyl chain):




C18:0 (stearic acid) 70°C



C16:0 (palmitic acid) 63°C



C14:0 (myristic acid) 58°C




Each step in this series adds two -CH2- groups and raises the melting temperature by 5-7°C, so the addition of a single -CH2- group in the acyl chain increases melting temperature by a few degrees.



When a cis double bond is introduced into the acyl chain it creates a kink in the structure. As a result the acyl chains cannot align completely along their length - they don't pack together as well - and the sum of the energy associated with intermolecular van der Waals interactions is reduced. Again this is seen clearly in the melting temperatures of fatty acids:




C18:0 (stearic acid) 70°C



C18:1 (oleic acid) 16°C




As you can see from these numbers, the effect of introducing a double bond is large compared to the chain length effect.
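
As a quick check on the arithmetic, the short sketch below uses only the four melting temperatures quoted above. The dictionary and function names are made up for illustration, and the per-CH2 figure simply divides each step by the two carbons it adds.

```python
# Melting temperatures (°C) of the fatty acids quoted in the text.
melting_point_c = {"C14:0": 58, "C16:0": 63, "C18:0": 70, "C18:1": 16}


def per_ch2_increment(shorter, longer, extra_carbons=2):
    """Average rise in melting temperature per added -CH2- group."""
    return (melting_point_c[longer] - melting_point_c[shorter]) / extra_carbons


# Chain-length effect: each step in the saturated series adds two carbons.
chain_effects = [
    per_ch2_increment("C14:0", "C16:0"),
    per_ch2_increment("C16:0", "C18:0"),
]

# Double-bond effect: same chain length, one cis double bond added.
double_bond_effect = melting_point_c["C18:1"] - melting_point_c["C18:0"]

print(chain_effects)       # [2.5, 3.5]
print(double_bond_effect)  # -54
```

On these numbers, one cis double bond lowers the melting temperature by more than ten times what a single -CH2- group adds.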



A typical fat or oil will of course be a mixture of different triacylglycerols, but the underlying principle is the same.

Wednesday 19 July 2006

ecology - If I built a concrete pool, would it ever contain fish?

The answer is yes, absolutely, given enough time. You can do this experiment yourself. Get a clean glass jar, hell, even sterilize it if you like (your oven overnight at 250 F will do it). Fill it about 3/4 full with clean (or even sterilized) water, and leave it outside in a mixed sunny and shady spot for a few weeks, in moderate weather. Avoid constant direct sun: the jar's small volume will get cooked, which will impede the results.
After a few weeks (it might be faster), look closely into the water (through the sides of the jar) with a magnifying glass or, better still, sample the water and look at it under a microscope. You won't see them, but multiple bacterial species will be there.
You will see single-celled algae, small pond creatures (hydra, multiple rotary-topped species), paramecia, and multi-celled algal forms in strings. These have all fallen in with airborne dust and windblown particles. You likely won't get fish, but if you are lucky you might get a few brine shrimp. And yes, particles of blown or bird-carried dirt can easily carry shrimp eggs, amphibian eggs or, plausibly, viable fish eggs.



But what do they feed on? There's no food in the water you put in! CO2, nitrogen, oxygen, etc. are readily available from the air. A few cells of bacteria or algae seed the clean water and begin fixing CO2, nitrogen, etc., and now you have a food source. Wind-carried (or bird-carried) small organisms can now survive in a web of bacteria-plant-animal interactions.
By the way, you can also run a control jar: sterilize the jar, fill it with boiling hot water, and quickly cover it tightly with foil. Place it beside your open jar, and it will grow... nothing.



As a side note, "toxic" algae are often not toxic to small organisms and other algae. Their toxins (usually neurotoxins) mostly affect only higher organisms, such as fish and humans. Algal blooms kill fish and birds, but they seldom sterilize the pond.

Monday 17 July 2006

gene expression - Do the phage repressors CI and Mnt exhibit crosstalk?

There are no reports of it yet. I don't think that overall homology alone means they would exhibit crosstalk. Look first at the alignment of the DNA-binding domains (DBDs) of these proteins, roughly the first 70 residues from the N-terminus. Moreover, their DBDs belong to different Pfam families.



The same goes for the promoter sequences, as reported on the iGEM website:



They are quite different.
Moreover, if iGEM lists a standard part that combines them, such as the bistable system component Mnt/cI promoter (BBa_K228003), then it will have been pretty much standardized. iGEM parts are usually reliable.

bioinformatics - Superposing DNA - Biology

Try this out with BioPython:



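from Bio.PDB import PDBParser, Superimposer

parser = PDBParser(QUIET=True)  # QUIET suppresses parser warnings
sup = Superimposer()
struct_1 = parser.get_structure("XXXX", "first_pdb")   # path to the first PDB file
struct_2 = parser.get_structure("XXXX", "second_pdb")  # path to the second PDB file

# set_atoms needs two lists of Atom objects of equal length, paired in order.
# If the structures differ in size, select matching atoms first
# (for DNA, e.g. only the backbone phosphorus atoms).
atoms1 = list(struct_1.get_atoms())  # fixed (reference) atoms
atoms2 = list(struct_2.get_atoms())  # moving atoms
sup.set_atoms(atoms1, atoms2)

print(sup.rotran)  # (rotation matrix, translation vector) from the SVD
print(sup.rms)     # RMSD after superposition

# then to complete the alignment, transform the moving atoms in place
sup.apply(atoms2)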


You can look here for further details. SuperPoser

Sunday 16 July 2006

entomology - Do insects respond to the detection of dead insects?

Yes, it is a common behaviour and is called necromone signaling (Yao et al 2009, see references in paper for many examples), and is probably used to avoid predators, parasites and disease. The chemicals used are often similar (unsaturated fatty acids), and seem to have an old evolutionary history (~400 million years). Many groups of species can also detect infected, not yet dead, conspecifics. Removal of dead individuals is also common in e.g. ants (necrophoric behavior).



The paper referred to earlier (besides reviewing earlier evidence) also provides experiments on detection/avoidance in isopods and tent caterpillars (Malacosoma americanum), where they show avoidance based on crushed conspecifics, unsaturated fatty acids and (for isopods) intact corpses.



Naturally, the same type of compounds are used by e.g. species of carrion beetles (Silphidae) to find their animal hosts, but then for the purpose of reproduction.

Wednesday 12 July 2006

molecular genetics - Comparative cost of RNA-seq vs sequencing full length cDNAs

I am in the process of assembling and annotating the genome of a non-model organism, using almost exclusively short read (paired-end Illumina) data. Throughput is one obvious benefit of these data (high coverage for reasonable cost, error rates notwithstanding), but another benefit of RNA-seq data in particular is that they can be used in multiple ways. For example, I am mapping the RNA-seq reads to the genome to estimate transcript abundances and do some differential expression analysis, but I have also assembled the RNA-seq reads de novo to use as evidence for genome annotation.



There are, however, limitations to using RNA-seq-derived transcripts for annotation. Full-length cDNAs would facilitate much more reliable annotation. However, I have no intuition for the relative cost, time commitment, or complexity of extracting and sequencing full-length cDNAs vs next-gen sequencing. Can anyone here comment on this?

biochemistry - How to Design an siRNA Experiment?

First you should decide whether you want to design an shRNA or use siRNAs.



If you want to use shRNA you should look at the rules that "Mad Scientist" mentioned. You can insert mismatches but make sure not to disrupt the secondary structure. You can use RNAFold to verify.



shRNA always has issues: you have to optimize your design and test it by real-time PCR, Northern blot, etc. to confirm its production and concentration.



You can design siRNAs and there are online algorithms to help you design one.
You have to choose a sequence which is specific to your gene. This step is important for both shRNA and siRNAs in order to minimize/avoid off-targeting.



A common practice is to use a pool of siRNAs for a single gene. Pooled siRNAs are commercially available for many genes.



For the experimental controls you would need to do two experiments:



  • mock siRNA treated, which won't affect any gene

  • your siRNA treated

Then test for a reference gene such as GAPDH and for your gene. A good reference gene is one whose expression is not expected to change. GAPDH may not always be a good reference (for example in cases where metabolism is affected). The choice of reference gene depends on your prior knowledge of the system. Instead of GAPDH you can also transfect GFP and quantitate that as a control (spike-in).
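
To make the comparison concrete, here is a minimal sketch of one common way to turn real-time PCR readings for the target and reference gene into a relative expression value, the 2^-ΔΔCt calculation. The answer above does not name a method, so this is an assumption. It also assumes roughly 100% amplification efficiency for both genes, and the function name and Ct values are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_mock, ct_ref_mock):
    """Fold change of the target gene (siRNA vs mock) by the 2^-ddCt method.

    Assumes ~100% amplification efficiency for both target and reference.
    """
    # Normalise the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_mock = ct_target_mock - ct_ref_mock
    # Compare the siRNA-treated sample with the mock-treated one.
    delta_delta_ct = delta_ct_treated - delta_ct_mock
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target needs 2 more cycles after knockdown,
# while the reference gene (e.g. GAPDH) is unchanged.
fold = relative_expression(ct_target_treated=26, ct_ref_treated=18,
                           ct_target_mock=24, ct_ref_mock=18)
print(fold)  # 0.25, i.e. about 75% knockdown
```

If the reference gene itself shifts between the two samples, the normalisation is no longer trustworthy, which is why the choice of reference matters.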

Tuesday 11 July 2006

Current state of direct RNA sequencing

I had a colleague ask me recently whether mRNAs could be sequenced directly. I found this Nature paper[1] published by Helicos in 2009, in which they describe their developments in the area. It's been several years since this was published, and yet RNAseq and most other transcriptomic analysis is still done by first reverse transcribing RNA into complementary DNA.



What is the current state of RNA sequencing? Has Helicos failed to deliver on their technology? Have any other platforms risen to the challenge (or attempted to do so)? Or is the technology successfully in production and still simply overshadowed by the wide adoption of traditional cDNA-based sequencing methods?




[1] Ozsolak F, Platt AR, Jones DR, Reifenberger JG, Sass LE, McInerney P, Thompson JF, Bowers J, Jarosz M, Milos PM. 2009. Direct RNA sequencing. Nature, 461, 814-818, doi:10.1038/nature08390.

human biology - How fast do different organs turn over cells?

1) cell growth



You should look into chemotherapy and cancer medicine in general. Because chemo works mostly by killing fast-dividing cells, this has been worked out reasonably well. The 7-10 year number is not really correct; some cells are replaced a lot more slowly.



This is why hair often falls out in cancer treatment, because the follicle cells are growing quickly. Neurons divide very slowly - if at all - and often are never replaced. Fat cells are in between - probably replaced in the 7-10 year range. Heart cells are replaced albeit quite slowly - less than 1% per year, which implies that many cells are with you your entire lifetime.
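
A rough back-of-the-envelope sketch of that last point: if we assume, purely for illustration, a constant replacement rate of 1% per year (the upper bound quoted above) and an 80-year lifespan (a made-up figure), the fraction of the original cells never replaced is (1 - rate)^years.

```python
def fraction_never_replaced(annual_turnover, years):
    """Fraction of the original cells still present after `years`,
    assuming the same fraction is replaced every year."""
    return (1.0 - annual_turnover) ** years


# Hypothetical inputs: 1% turnover per year over an 80-year lifetime.
remaining = fraction_never_replaced(annual_turnover=0.01, years=80)
print(round(remaining, 2))  # 0.45
```

Under those assumptions a bit under half of the original heart cells would last a lifetime, and a rate below 1% would leave more.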



2) atoms/molecules change



The cell itself is in a continuous state of flux, but different parts of the cell, like cells in the body, change at different rates. Some proteins which make up the cell matrix or the DNA in the nucleus are replaced very rarely (through repair or rearrangement of the chromosome for instance) and most of the chromosome DNA is with the cell for the entire life of the cell.



Most proteins are labelled for degradation and are recycled after a few hours of function. Metabolic compounds such as sugars or salt might drift in and out of the cell continuously, maybe turning over in an hour or so. Fats can be incorporated into the cell and last for years I think.

Monday 10 July 2006

genetics - What is the mechanism of regulation of PER /CRY genes?

Wikipedia gives a very good explanation of this, on the page for the suprachiasmatic nucleus.




For example, in the fruitfly Drosophila, the cellular circadian rhythm
in neurons is controlled by two interlocked feedback loops.



In the first loop, the bHLH transcription factors clock (CLK) and
cycle (CYC) drive the transcription of their own repressors period
(PER) and timeless (TIM). PER and TIM proteins then accumulate in the
cytoplasm, translocate into the nucleus at night, and turn off their
own transcription, thereby setting up a 24-hour oscillation of
transcription and translation. In the second loop, the transcription
factors vrille (VRI) and Pdp1 are initiated by CLK/CYC. PDP1 acts
positively on CLK transcription and negatively on VRI. These genes
encode various transcription factors that trigger expression of other
proteins. The products of clock and cycle, called CLK and CYC, belong
to the PAS-containing subfamily of the basic helix-loop-helix (bHLH)
family of transcription factors, and form a heterodimer. This
heterodimer (CLK-CYC) initiates the transcription of PER and TIM,
whose protein products dimerize and then inhibit their own expression
by disrupting CLK-CYC-mediated transcription. This negative feedback
mechanism gives a 24-hour rhythm in the expression of the clock genes.
Many genes are suspected to be linked to circadian control by "E-box
elements" in their promoters, as CLK-CYC and its homologs bind to
these elements.



The 24-hr rhythm could be reset by light via the protein cryptochrome
(CRY), which is involved in the circadian photoreception in
Drosophila. CRY associates with TIM in a light-dependent manner that
leads to the destruction of TIM. Without the presence of TIM for
stabilization, PER is eventually destroyed during the day. As a
result, the repression of CLK-CYC is reduced and the whole cycle
reinitiates again. (http://en.wikipedia.org/wiki/Suprachiasmatic_nucleus#Fruitfly)




The suprachiasmatic nucleus (SCN) is a tiny part of our brain residing in the center. It maintains a biological clock through a gene expression cycle in the individual neurons. The mechanism for humans is very similar to the mechanism for fruit flies, as explained above, but to rehash a bit...



In the fruit fly model there are five players: CLK, CYC, PER, TIM, CRY.



CLK and CYC are transcription factors for PER and TIM, and bind to their promoters in order to activate the expression of PER and TIM.



When the expression level of PER and TIM gets very high, the PER and TIM proteins return to the nucleus, and inhibit their own transcription factors (CLK and CYC) through molecular interactions.



The CRY protein is light-sensitive, and so daylight (or artificial light) will cause CRY to destroy the TIM proteins. Without TIM, we no longer get the PER-TIM dimers that inhibit the CLK-CYC transcription factors.



In humans and other mammals it's the same concept, but with slightly different players (e.g., homologous genes).



What's important here is that we have a biological clock on the cellular level. We haven't gone into how this clock is used yet... But, it's my understanding that the SCN uses the information to tell the pineal gland when to produce melatonin.



The melatonin causes us to be drowsy, lowers our body temperatures, and causes us to fall asleep.



You can see another cycle at work here - our body temperature, which rises with wakefulness, and declines with sleep.



This gives a more general answer to how circadian rhythms are used to influence our sleep cycle. Your question was how the genes involved in circadian rhythms regulate each other. The answer is the transcription-translation negative feedback loop described above.
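
To see why a repressor that shuts off its own production after a delay can oscillate, here is a toy sketch. It is a generic delayed negative-feedback equation, not a model of the real PER/TIM kinetics. The single variable P stands in for the repressor, and every parameter (k, K, n, d, the delay, the time units) is an arbitrary assumption chosen only so that the loop cycles; the period is not calibrated to 24 hours.

```python
def simulate_delayed_repression(total_time=200.0, dt=0.01, delay=3.0,
                                k=2.0, K=1.0, n=4, d=1.0, p0=0.1):
    """Euler integration of dP/dt = k / (1 + (P(t - delay) / K)**n) - d * P.

    Production of P is repressed by the level of P one `delay` earlier,
    mimicking the lag between transcription and nuclear repression.
    """
    steps = round(total_time / dt)
    lag = round(delay / dt)
    p = [p0] * (lag + 1)           # constant history before t = 0
    for _ in range(steps):
        delayed = p[-lag - 1]      # repressor level one delay ago
        production = k / (1.0 + (delayed / K) ** n)
        p.append(p[-1] + dt * (production - d * p[-1]))
    return p[lag:]                 # values from t = 0 onwards


def count_crossings(values, level):
    """Number of times the trace crosses `level` (a crude oscillation check)."""
    return sum((a - level) * (b - level) < 0
               for a, b in zip(values, values[1:]))


trace = simulate_delayed_repression()
late = trace[len(trace) // 2:]     # discard the initial transient
# P = 1.0 is the steady state for these parameters (2 / (1 + 1) = 1 * 1).
print(count_crossings(late, 1.0), min(late), max(late))
```

The trace keeps swinging above and below the steady state rather than settling on it, which is the qualitative behaviour the CLK-CYC / PER-TIM loop relies on.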



Hope this helps...

Sunday 9 July 2006

genetics - Is the poly(A) tail added while transcription is still underway?

It's slightly more complicated than the comments above indicate.




Glover-Cutter, K et al. (2007) RNA polymerase II pauses and associates with pre-mRNA processing factors at both ends of genes. Nature Structural & Molecular Biology 15: 71-78




The authors present evidence that RNA pol II pauses 0.5 - 1.5 kb downstream of the poly(A) site, where processing factors are recruited (cleavage stimulation factor, or CstF, and cleavage and polyadenylation specificity factor, CPSF). So the polymerase is still engaged with the template and the pre-mRNA when processing is initiated, although it may not be actively elongating the transcript.




"Localisation of the active polII-3' end processing complex well downstream of the poly(A) consensus sequence agrees well with the cleavage of Chironomus BR1 transcripts after 600 bases of downstream sequence has been transcribed. This model is also supported by the fact that efficient poly(A) site cleavage requires an intact RNA tether linking the poly(A) site with the downstream polymerase." (references are cited in the original).




It is a semantic question as to whether you call this cotranscriptional processing.



[Incidentally this is yet another example which contradicts the commonly-stated idea that evolution acts to optimise everything to conserve ATP - here we have the transcription process discarding the energy from approximately 1000 phosphodiester bonds per transcript.]



UPDATE
@WYSIWYG The polymerase goes past the site where the poly(A) will eventually be added, and makes another 1000 bases of RNA before pausing to recruit factors that are needed for the subsequent cleavage at the poly(A) site, which now lies "behind it". After that the poly(A) tail is added at the newly created 3' end. So the initiation of poly(A) addition (in the sense that the cleavage is an obligatory first step in the process) is accomplished while the polymerase is still engaged. I agree that the actual addition of the poly(A) cannot in any way be considered to be cotranscriptional.



I was trying to make the point that it isn't simply a case of the polymerase encountering the cleavage site and dropping off to allow a second, unlinked process to take place.

Monday 3 July 2006

speciation - Evolution in 37 years, is it possible?

Evolution can occur in just one full generation



Strong selection will rapidly reduce the frequencies of alleles which cause negatively selected phenotypes. This reduces the likelihood of unfavourable genotypes occurring in the next generation.



(I regard generation here as the complete cycle of one individual being born to the point at which they successfully give birth/sire young).



Population genetics explanation:



Imagine a population in which LOC1 is a locus affecting a phenotypic marker allowing perfect identification of genotype at that locus. There are two alleles at LOC1, A and a. This gives rise to three different genotypes, AA, Aa and aa which occur in the population in equal proportions. Those individuals with AA exhibit a certain trait, whilst Aa and aa individuals do not. If only the AA genotypes are allowed to mate then the following generation will only contain AA genotypes at LOC1 (assuming no mutation).
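
The arithmetic behind that example can be sketched in a few lines. The function name is made up for illustration; the genotype proportions are the ones given above.

```python
def allele_a_frequency(genotype_freqs):
    """Frequency of allele A: all of AA plus half of Aa."""
    return genotype_freqs["AA"] + 0.5 * genotype_freqs["Aa"]


# Parental generation: the three genotypes in equal proportions.
parents = {"AA": 1 / 3, "Aa": 1 / 3, "aa": 1 / 3}

# Only AA individuals mate, so every offspring is AA (no mutation).
offspring = {"AA": 1.0, "Aa": 0.0, "aa": 0.0}

print(allele_a_frequency(parents))    # approximately 0.5
print(allele_a_frequency(offspring))  # 1.0
```

The frequency of A goes from one half to fixation in a single generation, which is the sense in which evolution can occur in one generation.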



For more population genetics read Principles of Population Genetics and then Introduction to Quantitative Genetics.



Genetic drift caused by using 5 pairs



The new population was started with just 5 pairs of individuals. This means there is huge potential for fixation of alleles via drift within the very first generations. I ran a (basic) genetic drift simulation in R just now. Out of 5000 replicates, 4709 went to fixation (single locus, two alleles, using 30 generations of 10 individuals).
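
The R code is not shown, so the following is only a sketch of what such a simulation might look like, written in Python as a basic Wright-Fisher model. The assumptions are mine: each generation consists of 10 allele copies drawn at random from the previous generation, both alleles start at a frequency of 0.5, and there is no selection or mutation. The function name and seed are arbitrary.

```python
import random


def fraction_fixed(replicates=5000, copies=10, generations=30,
                   start_freq=0.5, seed=1):
    """Fraction of replicate populations in which one allele is fixed
    (frequency 0 or 1) within the given number of generations."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = round(start_freq * copies)   # copies of allele A
        for _ in range(generations):
            freq = count / copies
            # Each copy in the next generation is A with probability freq.
            count = sum(rng.random() < freq for _ in range(copies))
            if count in (0, copies):         # lost or fixed: drift is over
                break
        if count in (0, copies):
            fixed += 1
    return fixed / replicates


print(fraction_fixed())
```

With these assumptions the large majority of replicates should reach fixation, in line with the figure quoted above, though the exact number depends on the random seed and on details of the original R script that are not given.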



Environmental change can cause hidden genetic variation to be exhibited



Molecular regulators which affect the expression of traits can respond differently depending on environmental influences. This point is summarized nicely by the opening lines of this paper (thanks to @leonardo for this point).




Hsp90 is a molecular chaperone for many signal transducers and may
influence evolution by releasing previously silent genetic variation
in response to environmental change. In fungi separated by 800
million years of evolution, Hsp90 potentiated the evolution of drug
resistance in a different way, by enabling new mutations to have
immediate phenotypic consequences.




Evolution and speciation are different things



These are just the simple Wikipedia definitions; for better ones, consult the literature:




Evolution is the change in the inherited characteristics of biological
populations over successive generations.




For a good thorough introductory text on evolution I recommend Evolution by Mark Ridley (not to be confused with Matt Ridley - popular evolution science writer).




Speciation is the evolutionary process by which new biological species arise.




Speciation, thanks to a long-fought debate over species concepts, is also an ambiguous process. What some species concepts would define as two distinct species, another would call one species, and another may call 10 species! Seemingly everyone who specializes in the study of speciation has their own twist on a species concept, so don't expect that argument to be resolved any time soon! It has not advanced much since Dobzhansky wrote this:




The species concept is one of the oldest and most fundamental in
biology. And yet it is almost universally conceded that no
satisfactory definition of what constitutes a species has ever been
proposed.




When does evolution occur? And when is change over such a short time not evolution?



I've put this bit in because a strange, but common, question I find people asking is




"species X has changed/adapted to novel selection very rapidly, has
evolution really occurred?"




The way people ask this makes it sound like they want a certain amount of evolution to occur before we can say something has evolved. To me, saying evolution has occurred means that some change has occurred, via any of the mechanisms of evolution, which results in one subset of individuals being different from another to which it is compared (spatially or temporally separated). You could ask whether such strong evolution can really occur over a short time period, but the answer is obvious - yes it can. If it has occurred, then why do you think it is exceptional? What is your alternative explanation for the differences seen?