Thursday, 20 July 2017

Bill Gates, VR and Influenza Vaccines

Bill Gates was shown around the NIH where they wanted to show him how VR helps to create better vaccines.

As I said in another blog post, I began my research career in structure-based drug design. Then I learned that your models are only as good as the data you feed in, and so I moved down the pipeline, first to crystallography and now to sequence data analysis. I know the limitations, but the NIH wants money and so they are less likely to talk about them.

The limitation in influenza research is sampling. We simply do not collect the data properly. We have lots of data from China because that is where we think future outbreaks might come from, but the last Swine Flu pandemic originated in Mexico. There is not enough systematic global collection of data. This means that unexpected changes catch us unaware. Most of the time we do pick the right vaccine candidates but sometimes we get it wrong. VR will not help with this.

What will help is the IoT. That provides an opportunity for massive data collection. The Cloud allows us to share data globally. If we can stop the national laboratories from hoarding data this would also be a big step forward. The WHO also needs to be reformed to remove some of the political players who are a barrier to sharing. Scientists are bad politicians. My dad worked on a funding committee and they had to move meetings to secret locations because the scientists were always trying to lobby them and bully them into decisions. In influenza research there is a ruling clique that wishes to restrict research participation. When you mention citizen science or data sharing they have a fit.

So what should we be looking at today?

1) Why is there a widespread bird breeding failure in the Celtic seas? Is this related to the death of marine mammals? Both of these sets of species are possible influenza sources/sinks and it might be a good idea to do some influenza viral screens to see if a new more pathogenic strain has evolved. If it is an influenza it is probably H7 or H3, and the fact that it can affect mammals would be a concern.

2) H5N8 from Viet Nam is a newly emerging, highly pathogenic H5-containing lineage. We had an outbreak in Korea that spread to North America and also Europe but it does not seem to be able to persist in either Europe or North America (although for North America that is disputed). However, the most recent European cases are not related to the Gochang and Buan Korean lineages, but to a Viet Nam sequence that again spread via Korea.



The key is vigilance, wider participation and thinking outside the box. The bird breeding failure might be nothing and unrelated, but let's do a quick check. Improved data collection and screening is where we will make the breakthroughs, not in the VR lab.

Wednesday, 19 July 2017

My F1000 paper on reassortment in H5N8 - even open peer review has flaws

This is really irritating me as this is version 3 of the same story. It is not even a particularly interesting story, except that it is if you think deeply about it.

What I want to show is that H5N8 in the US is a subtype that has been produced by multiple events where an H5 containing virus reassorts with an N8 containing virus. To do this I constructed a tree for ALL the H5 sequences in the database and ALL the N8 sequences in the database.

If I am wrong then the H5N8 sequences from the US would cluster in one group and not be spread across the tree in many distinct clades. They would at least be close neighbours. Am I wrong? Nope they are spread all over both trees. In the H5 tree there are lots of neighbouring H5 sequences that have been sampled but they are from other subtypes. In the N8 tree they are also widely spread with sequences from many other subtypes in between.
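
The clustering test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis from the paper: the tree is a hypothetical toy written as nested tuples, the leaf names and the `name|subtype` label format are invented for the example, and the tree is assumed to be rooted. The function counts the maximal clades that contain only sequences of one subtype. A single clade is what you would expect if that subtype's sequences clustered together; several clades separated by other subtypes is the scattered pattern I describe.

```python
# A tree is either a leaf (a string "name|subtype") or a tuple of subtrees.
# All names and the toy topology below are hypothetical.

def leaves(tree):
    """Return the leaf labels of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    found = []
    for child in tree:
        found.extend(leaves(child))
    return found


def subtype(leaf):
    """Extract the subtype from a label such as 'us1|H5N8'."""
    return leaf.rsplit("|", 1)[1]


def pure_clades(tree, target):
    """Return the maximal clades whose leaves are all of subtype `target`."""
    if all(subtype(leaf) == target for leaf in leaves(tree)):
        return [tree]
    if isinstance(tree, str):
        return []
    found = []
    for child in tree:
        found.extend(pure_clades(child, target))
    return found


# Toy H5 tree: the H5N8 leaves are interleaved with other subtypes.
scattered_tree = (
    (("us1|H5N8", "us2|H5N8"), "x1|H5N2"),
    (("x2|H5N1", "us3|H5N8"), ("x3|H5N2", "us4|H5N8")),
)

# Toy H5 tree: all H5N8 leaves sit in one clade.
clustered_tree = (
    (("us1|H5N8", "us2|H5N8"), ("us3|H5N8", "us4|H5N8")),
    ("x1|H5N2", "x2|H5N1"),
)

scattered_count = len(pure_clades(scattered_tree, "H5N8"))
clustered_count = len(pure_clades(clustered_tree, "H5N8"))
print(scattered_count, clustered_count)
```

The count depends on where the tree is rooted, and a high count on its own is only the pattern, not the explanation. The argument in the paper rests on seeing the same scattering in both the H5 tree and the N8 tree.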

So here is the referees' comment: https://f1000research.com/articles/5-2463/v1#referee-response-18901

I am completely incredulous about how this is a possible argument for rejection.

If the referees are right in their arguments then you CAN have clades in a tree that are polyphyletic for subtype without reassortment being the cause. I can simply get from H5N2 to H5N8 by spontaneous mutation of the N2 to the N8 form. I can make hundreds of base changes and insertions and deletions and this is more likely than a simple reassortment event.

Now let me imagine this is not their argument and that they will admit that reassortment does exist in the clades but that they are not convinced about the specific H5N8 reassortment events. They are suggesting that these occur in both the N8 and H5 tree which are two independent samples in the same pattern by chance and that there is a need for the internal gene trees to corroborate these events. I say in the paper very clearly that I cannot KNOW what the origins of the H5 and N8 are and I can just make suppositions about them, but I know that a reassortment event has definitely occurred. Otherwise what are the chances that the H5 and N8 genes both undergo substantial mutational changes between samplings of H5N8 and that the neighbours include large numbers of sequences from many other subtypes? If this is true then H5N8 sampling must have been carried out appallingly or in a very biased manner while we detect all the other subtypes with ease.

To be honest I have the internal gene trees as I just finished them for another paper about a different subject and oddly enough they show exactly what I said in the paper. I was 100% right. These two referees are 100% WRONG. In fact their arguments are so illogical and unsupported by evidence that I was surprised that they were brave enough to put their names to them.

So let me ignore the sneering way the review is written, because everyone has a time in their life when they think they know it all and recent graduates tend to fall into this trap more often than not. I know that I was the same 20 years ago when I started my career.

Let me consider how they use rhetoric in order to create a straw man to knock down, by suggesting that the paper is about discovering reassortment. It is not. The title is very clear: it says reassortment in H5N8, and only part of the H5N8 tree, the US part, is actually the main focus of the study. It is a paper about a specific example and the dangers of collecting data by subtype, as this gives an incomplete picture of sequence evolution in a segmented virus that can undergo reassortment.

This is the point. If the segments can reassort then they can pass between multiple subtypes in their evolutionary pathway. This is not rocket science. It is just suggesting that it is better to consider this possibility in trees and sampling and not just carry out phylogenetic analysis by subtype.

This leaves me with two options:
1) The referees are idiots.
2) The referees forgot to declare the conflict of interest in that they have a skewed view-point in order to protect their existing work. This paper starts to undermine the idea of monophyletic clades for subtype which underpins the WHO nomenclature system for H5 which Justin Bahl helps to manage.

I do not think the referees are idiots, but I do think the second point is true and that there are good reasons why Dr Bahl should have declared a real and prejudicial interest and NOT taken up the review. Nobody is objective about seeing their work undermined, ever. I also think this is a good reason for me to ban Dr Bahl and Joseph Hicks from EVER reviewing any of my papers and for editors to consider any review that they provide with suspicion.

Saturday, 29 April 2017

Let's go back to the beginning.

I need to tell a long story and so I need to go back to the beginning. Part of this story I have already told but not very well and so this is an attempt to put everything into context.

I did a degree in Chemistry and Law at the University of Exeter. When it came to choosing what to do next I wanted to stay in research. I had met some lawyers at the recruitment fairs and they had convinced me that I did not want to be a lawyer. My drive was wanting to change the world, their motivations were purely financial. I was not going to be Perry Mason and I was not going to write environmental protection legislation and so I turned back to science which had been my dream since my teens.

What caught my eye was protein molecular modelling. I had not done much biological chemistry or biochemistry as an undergraduate but the beauty of the computer models captivated me. It was like the best video game I had ever seen (at that time they used the best graphics computers you could buy and they cost tens of thousands). I had applied for PhDs elsewhere in physical chemistry including with P.W. Atkins (his response was that he didn't supervise students, so why was he in the graduate prospectus?). But nothing compared to those ribbon images of proteins.

I received the Norman Rydon scholarship from the Chemistry Department at Exeter. This allowed me to pick my supervisor and the money would come with me. It was an incredible stroke of luck and so I got to follow my dream and study molecular modelling of proteins. My PhD was in homology modelling of FBP aldolase and also included using molecular dynamics to study the conformations of peptide inhibitors. Unfortunately about the time I finished Swiss-Model appeared and what had taken me 3 years to do now could be done in 10 minutes on a server ... That was the end of homology modelling research for me. What I also learned was that modellers depend on the quality of the data they are given. The FBP structure that I used had some limitations and so my models shared those limitations and so I went back a step to become a protein crystallographer.

While I was doing my PhD and protein crystallography post-doc I was the general computational biologist or bioinformatician on call for the research group. This was the mid-1990s and so bioinformatics did not really exist as a subject. I did sequence alignment, BLAST searches and phylogenetic analysis for projects where we tried to understand the evolution of protein structure and function.

What I realised from building these alignments and trees was that if your data is very partial and contains mostly sequences from related species and only a few sequences from more distant species, then this will bias the tree towards your data and possibly away from a better representation of reality. The problem is how do you select sequences to include and exclude? An even bigger issue is the irregularity of the sampling across the "tree of life" (we did not call it that then, we just said across the kingdoms). We worked with the recently discovered Archaea and they are dramatically different to the bacteria and the eukaryotes and putting the Archaeal sequences into trees was difficult.

From alignments I also learned that making secondary structure predictions on all the sequences in the alignment is better than just making predictions on a single instance. They should all have the same structure and so this sequence level variation should disappear in predictions. This turned out to be a major discovery (made by someone else) and that ended investigation in secondary structure prediction (OK I have missed out neural nets etc. but I have serious doubts that they contribute anything more than the use of multiple alignment and using GOR or even Chou and Fasman).
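
The idea of predicting on every sequence in the alignment and letting the sequence level variation cancel out can be sketched as a simple majority vote per alignment column. This is a minimal illustration, not the published method: the per-sequence predictions below are made-up strings (H for helix, E for strand, C for coil, - for an alignment gap), and in practice they would come from a predictor such as GOR run on each aligned sequence.

```python
from collections import Counter


def consensus_prediction(aligned_predictions):
    """Majority vote over per-sequence predictions, column by column.

    Each prediction is a string of equal length, aligned to the same
    columns. Gap characters ('-') do not vote.
    """
    consensus = []
    for column in zip(*aligned_predictions):
        votes = Counter(state for state in column if state != "-")
        # A column made only of gaps stays a gap in the consensus.
        consensus.append(votes.most_common(1)[0][0] if votes else "-")
    return "".join(consensus)


# Hypothetical predictions for three aligned homologues.
predictions = [
    "HHHHCCEEE",
    "HHHCCCEEE",
    "HHHHCEEEE",
]
result = consensus_prediction(predictions)
print(result)
```

Each individual prediction disagrees with the others somewhere, but the disagreements fall in different columns, so the vote removes them.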

I continued as a post-doc in protein crystallography but also dabbled in bioinformatics until in 1999 Exeter set up an MSc in Bioinformatics. As one of the local experts I helped to set up the course and I taught the sequence and structure modules. I was made a lecturer in 2000 and I remained at Exeter for five years until I got caught up in the departmental politics of the closing of the Chemistry Department (I was a lecturer in Biological Sciences and Engineering and Computer Science at the time). Exeter made me redundant but the atmosphere had soured for me there anyway, because they disapproved of me trying to have a work-life balance and putting my wife and newly born children first. They also did not like my involvement in politics. I was a city councillor in Exeter for five years.

One of my students had said, "Did you see this advert for a bioinformatics lecturer at Oxford?" I hadn't, but the closing date hadn't passed and so I applied. My curriculum vitae was okay. I was much better at teaching than research. Setting myself up as an independent researcher had also been made difficult by having to separate my research from my PhD and post-doctoral supervisor. Luckily I had the support of Dr Ron Yang and we had worked together on some projects. He did the computing (most of it) and I gave the biological input (a bit at the end to make sure that it actually worked). This meant that I had my four publications and that is what matters in UK academia.

I went to the interview at Oxford. I thought it went well. The head of the course was young, bright and very unexpected. Dr (now Prof.) Charlotte Deane would go on to be head of the department of statistics. What was unexpected was how relaxed and casual she was. She was not the serious unapproachable Oxford don. They offered me the job on the same day and I started a few months later. We moved from Devon to Oxfordshire. I led the teaching in the modules Charlotte was not leading. In Oxford the Bioinformatics MSc was in the Department of Statistics and I became a Departmental Lecturer in Statistics. This also meant I taught statistics, Perl programming and the biology courses. I had gone from being a lecturer in Biological Sciences and Engineering and Computer Science to a lecturer in Statistics. I had degrees in none of these subjects but I am 100% a computational biologist. This is the curse of being inter-disciplinary. I did start to think about systems biology at Exeter and we had a meeting there where I met Kitano and his work was a major influence on my thinking.

This is when I became an accidental statistician and I am glad that it happened, because apart from the stunning molecular images I found that data is what fascinates me. I just can never get enough data. I think that if I had been introduced to statistics earlier on then I might have been a statistician from the start but at school it is never taught well and people fear statistics. At Oxford I learnt to love it. When the lecturer who taught statistical data mining left I took over his module. Now I was teaching masters level statistics to people who had degrees in maths and some of them from Oxford. It was an amazing experience, although I have to admit to spending the entire summer reading the textbooks from cover to cover (thanks to Hastie, Tibshirani and Friedman and also to Brian Ripley).

Charlotte went on to other things and I became acting head of the MSc teaching more and more. My interests now were systems biology and trying to put together data from different experiments and perspectives. What really troubled me and what still sits in the back of my mind is entropy and how it works in living systems. I was a book worm for systems biology.

Saturday, 24 September 2016

Dawkins' and Pinker's gene

There are two big problems for Dawkins' and Pinker's interpretation of the gene as a heritable "element" that affects phenotype. The first is that it is most often not an element but a system of non-local and interacting elements. The second, and most important, is that it will not follow Mendelian genetics. It will not be segregating and discrete. There will be a myriad of variations depending on how the system responds to the environment it finds itself in.

Mendel was lucky with the characteristics he chose to examine and when you do have genes that segregate then you do have a gene as described by the molecular biologist and geneticist, of the type that Pinker rails against. They are just different alleles corresponding to an expressed region of DNA or possibly their cis regulatory regions. They are not the nefarious and indeterminate objects defined by Pinker and Dawkins. If we are to take their views seriously then we have to go back to before the modern synthesis and try again.

Now that we know that most of the genome can be defined as loosely functional, even if just in terms of spacing between coding regions, then perhaps we do need to look at what the term gene means.

Thursday, 22 September 2016

I finally understand what Dawkins means by gene

I was reading a short article by Steven Pinker in the book "This Idea Must Die". There Pinker was saying that molecular biologists have a different view, a very restrictive view, of what a gene is. They only consider the protein encoding region of the DNA as the gene.

That was a Eureka moment as I finally understand what Dawkins was trying to say. He shares exactly the same view as Pinker. To him a gene is a heritable element that produces a phenotype. This is a much older view of the gene than the view I was brought up with. It predates knowing anything about DNA at all.

To the atomistic and DNA based molecular biologists and geneticists this means the sections of DNA that produce the protein that is responsible for the phenotype. That piece of DNA when expressed causes the phenotype. This is why the molecular biologists got such a shock when they found that there were only 20-30 thousand protein expressing segments, genes in their words, in the human genome. This looks the same as Dawkins' gene but it is completely different. Dawkins, because he knows very little about genetics and molecular biology, is living in the world view before the modern synthesis which linked DNA to genes. Pinker shares the same anti-reductionist perspective, even though both would consider themselves materialist and reductionist scientists.

Their view of the gene would include all of the regulatory elements, both local and non-local in the genome. It would also include all of the mechanisms for regulation and post-translational modification, for localisation and for every other modifier that affects the process of taking that section of DNA or those multiple sections of DNA to produce a phenotype. In Dawkins' view there are no multi-gene effects to produce a phenotype because the genetic atom is actually that complete system that relates DNA and phenotype.

That is what makes me so strongly critical of Dawkins' work because he has no appreciation of the system at the molecular level. I work with proteins and how they fold and I even dislike DNA. I see the disconnect between the DNA code that can be mutated and the proteins that they produce. There is a huge non-linearity in their connection. The effects of mutations are almost impossible to predict. But if you take Dawkins' and Pinker's way of specifying a gene just as a heritable element then their writing makes a lot more sense.

It makes more sense but they still ignore the fundamental problems with this view. That is that these "atomic entities", these genes, are not atomic. They are overlapping, intertwined, non-local and non-linear systems that cannot be approximated by some atomic genetic theory. In each cell-type the networks of connections between regulatory elements and expressed regions are different and that is not even considering spatial effects.

In their world each cell type would have its own set of genes, because each has its own phenotype and own system of expression. Even each of my tissues would become a collective organism and an animal would become a collective of collectives. It is this decision to ignore the relationships between the parts and to impose an artificial genetic atomism on these heritable elements that makes it unrealistic as a view. Playing with my sons' Lego makes it clear. I have all those bricks which are the genes of the model. But unless I put those bits together in the right way I never get my car or my space ship. If you don't think at the systems level you can never understand biology. Atomism and reductionism are doomed to failure.

Saturday, 3 September 2016

Big Government: should Amazon and Starbucks pay more tax?

Yesterday I posted about beggar my neighbour and why the Ireland/Apple tax case matters for democracy and stability. Today Amazon and Starbucks are the focus of attention. These are two more in a very long list that will also include Google, Vodafone, Microsoft and many others who use their global clout to minimise regulation and taxation.

What was amusing were the posts on social media by neo-cons saying that the companies are justified because governments waste money, and so they should keep avoiding the tax.

What do governments spend their money on? A lot of it is social security and a lot of that is pensions (much more than unemployment in the UK). So shall we cut pensions because Amazon and Starbucks don't pay up? Should those Daily Mail reading baby boomers who support the neo-con illusion get shafted by their own stupidity? Should we allow them to poke themselves in the eye? Sounds good to me but maybe not.

What else does the government pay for? Healthcare is a big spend as well. We could allow Amazon and Starbucks to use their tax avoided cash to invest in sponsored hospitals and to reproduce the philanthropy of Carnegie or Rockefeller. Look at Oracle and the billions of Larry Ellison as an example: he used all that cash to build the most expensive racing yacht in history. So maybe expecting billionaires to give away their money is not such a good idea (I know Bill Gates has done amazing things and George Lucas and Warren Buffett as well but they do not run countries).

The government also spends money on defence. From an evidence based view this is often a waste of money and the social media post is correct. Britain is building two stupid carriers to fight the types of war that no longer exist against enemies that are no longer there. We are about to renew nuclear weapons that nobody will ever use and that are also a waste of time. Oddly enough I suspect that the person who made the media post would say that this is NOT a waste of government money as the neo-cons are easily deceived by Eisenhower's military industrial complex that sells what nobody needs at an extortionate price.

Then there is education. We could all do with a lot less of that so that we can all be as stupid as the Daily Mail readers and the neo-con social media enthusiasts. That keeps people from questioning. Yes you need to train an elite to run your business and keep globalisation going but an ignorant population is good for business.

What about the infrastructure paid for by taxes? The roads etc. Well there is lots of mismanagement of funds there, but it is caused by the neo-con push to privatise all services and to have the market find the best price. Just ask Halliburton how this works for them in the US and ask any local government how it has worked out in the UK. Higher price, poorer service and don't mention PPI.

So yes Amazon, Starbucks etc. should be paying taxes and while sometimes government does waste money it is a lot better than the alternative.

Friday, 2 September 2016

Beggar my neighbour: Apple's Tax Problem in Ireland

There is a well-known economic rule called "beggar your neighbour". It is important in behavioural economics when you consider the model of the repeated Prisoner's dilemma. In that case beggar my neighbour represents the defection strategy. There is also a connection to companies seeking countries with the minimum regulation/taxation. This is when the companies are defecting.

Companies have a duty to shareholders which in the short term, and when you do not expect there to be a repeat of the circumstances, means that defection is the preferred strategy, and politicians often think the same way. This is sub-optimal capitalism. It is sub-optimal because in reality we have longer term interactions and repeat business which are undermined by defection. Axelrod has shown that the best strategy, as proposed by Rapoport, is Tit-for-tat. You respond to defection by defecting and then you go back to a position of trust. Trust is the essential feature that makes the system optimal. You have to maximise the trust to reduce the costs of regulation and defection.
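
The difference between repeated defection and Tit-for-tat can be seen in a small simulation of the repeated Prisoner's dilemma. This is a minimal sketch under stated assumptions: the payoffs (5 for defecting against a cooperator, 3 each for mutual cooperation, 1 each for mutual defection, 0 for being defected against) are the values used in Axelrod's tournaments, while the function names and the choice of 10 rounds are mine for illustration.

```python
# Payoffs as (player A, player B) for each pair of moves.
# "C" is cooperate, "D" is defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the other player's previous move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    """The beggar my neighbour strategy: defect every round."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the total scores (A, B)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


trusting = play(tit_for_tat, tit_for_tat)
exploited = play(tit_for_tat, always_defect)
mutual_defection = play(always_defect, always_defect)
print(trusting, exploited, mutual_defection)
```

The defector beats Tit-for-tat in their own pairing, but only by the gain from the first round. Two players who keep trusting each other both end up with far more than two players who keep defecting, which is the sense in which defection is sub-optimal once the interaction is repeated.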

At a national level a defecting country is one that offers a lower level of regulation and taxation compared to all the other countries, so that business will move to that country and not pay taxes where it is actually active. This is how the Swiss canton of Zug has become the European HQ of many multi-nationals. Zug has a tax rate of 5% which is very attractive to global business. Given the size of the canton this minimal amount of tax from a large number of corporations raises more than enough for the infrastructure and services that the canton has to pay for. In fact they should be making a considerable profit. Levels of taxation elsewhere have to be larger because nations are expensive and levels of tax are set to avoid a deficit. This is why Zug is beggaring its neighbours and why Ireland with its Apple tax deal was also beggaring its neighbours by removing tax revenues from other countries where Apple was doing business. Apple was unlucky enough to be the first company that was brought to court but it will not be the last and Google and Amazon are two more big names that stand out.

Ireland with a much smaller economy can survive without all the tax that is owed but this deficit is pushed onto all the other EU nations. That is why this sort of tax deal is illegal and why the UK deal with Vodafone also needs to be investigated. Governments do this because they want to keep the jobs in their countries, but if I did this as a small business or as an individual, even if I only failed to do my duty properly in the expectation of a future benefit, I would be in jail for up to 12 years and face an unlimited fine under UK law. In fact the recent report about the possible dropping of the investigation of BHS in return for Sir Philip Green paying a large sum to the pension fund is also bordering on illegal under the Bribery Act 2010. I begin to see why an Italian mafia judge called the UK the most corrupt country in Europe.

Ireland has been caught cheating but both Ireland and Apple are going to seek to contest the judgement. The only way that you can prevent beggar my neighbour is if we go beyond short term interests or if we promote supra-national agreements. We call these trade deals and although TTIP is a dirty word at the minute there are many others that we rely on every day. The WTO is the largest agreement to make sure that nations do not deliberately cause economic hardship for each other. But the most successful is the European Union and that is why Ireland and Apple in the end have to lose if we are to have any faith in nations and democracies and if we want to live in a world which has not been taken over by corporations.

So why does not being able to beggar your neighbour matter? History shows that wars are usually about resources and as a response to internal economic challenges. Harming another country's economy has political and social consequences and not just economic ones. If we want a more peaceful and equal world we have to get beyond Brexit and Beggar my Neighbour and start understanding the long term benefits of working together.