Past seminars

Wednesday 9th December

Greg Gamble - Department of Medicine

Title: “Workflows I have met”

Abstract: As a statistician working part time for the bone research group I am exposed to both clinical and basic research. Standard tools and approaches trialled under the fire of peer review have long been part of what I do, but when the group's molecular biologists extended their interest into microarrays I got worried. They knew what they wanted from microarray analysis and, with help from Mik Black and his PhD student Sarah Song, they got it: an analysis of data from patients with Paget’s disease (a common bone disorder). Unfortunately this collaboration could not continue, and so we had to develop some in-house expertise in analysing microarray data. This talk is my attempt to document the paths that we have taken in evaluating available software and resources to discover their limitations and strengths, and it will highlight the barriers we have encountered. There is surprising generosity in this field: there are many free resources (almost too many), and researchers who are free with their limited time provide unselfish support. We are probably typical of many research groups coming to grips with a new paradigm. Together we have clicked many a prescriptive workflow tab and worried about what the output might mean.


Tuesday 8th December

Dr Naryttza Díaz - Ludwig Institute for Cancer Research, Swiss Institute of Bioinformatics

Title: "Taxonomic classification of environmental genomic fragments: From whole genomes to genomic fragments"

Abstract: Understanding the species composition of natural microbial communities is a milestone towards gaining access to the beneficial aspects of these microbial collectives. Falling sequencing costs make it possible to sequence entire microbial communities without prior culturing. Metagenomics is the sequencing and analysis of the collective genomes (metagenomes) of microorganisms isolated from an environment. The metagenomics approach promises to be a gateway to the estimated 99% of species that still resist cultivation. However, owing to the complexity and amount of data generated in metagenomic projects, predicting the taxonomic origin of the genomic fragments that make up a metagenomic sample is still a challenging problem in computational biology. In this talk, a novel strategy for the taxonomic classification of metagenomic sequence data will be presented.


Wednesday 18th November

Steffen Klaere - Department of Mathematics

Title: “MISFITS: assigning extra mutations to a phylogenetic tree”

Abstract: Given a multiple sequence alignment for a set of species, maximum likelihood methods reconstruct, based on a Markov model of evolution, a phylogenetic tree which describes the evolutionary relationship of the species. Few statistical methods have been suggested to test the fit of the model to the data. Moreover, those tests typically simply reject the model. We present a method which quantifies the difference between model and data in terms of extra mutations. In particular, we replace site patterns which are overrepresented in the alignment with patterns which are underrepresented and count the number of mutations needed to replace them. Our approach gives a biologically insightful interpretation of what to do to fit the observed data to the model.


Wednesday 4th November, 3-4pm in Mac 1 (Old Biology Building)

Kay Nieselt - University of Tuebingen

Title: "Characterisation of non-coding RNAs and RNA-RNA interactions in Streptomyces coelicolor"

Abstract: Several studies of non-coding RNAs (ncRNAs) have shown that they are involved in a wide spectrum of different processes, and the list of processes grows almost daily. Nevertheless, the functions of most ncRNA transcripts are still unknown. There are by now a number of published tools for the genome-wide prediction of ncRNA regions. However, most such programs produce an unknown number of false positives. Furthermore, the locus prediction does not provide information about functional ncRNAs that might be contained in the corresponding region. We present an approach to annotate ncRNAs predicted by programs such as RNAz (Washietl et al., 2005). Loci containing predicted ncRNAs are compared to known ncRNA families. In addition, we compute features related to the transcription process, which allow us to distinguish putative ncRNA transcripts from ncRNA regulatory motifs. We can also predict interactions between putative ncRNA transcripts and mRNAs. These methods are applied to the antibiotic-producing soil bacterium Streptomyces coelicolor. Almost 4000 ncRNA elements were predicted, most of them overlapping with protein-coding genes. First results show that several key proteins in S. coelicolor are regulated by ncRNA transcripts. This is supported by genome-wide, high-resolution time series expression data obtained using a custom-designed microarray targeting all protein-coding genes as well as our predicted ncRNAs.


Wednesday 14th October

Lynn Ferguson - Discipline of Nutrition

Title: "Data management and analysis challenges in a New Zealand nutrigenomics consortium"

Abstract: Nutrigenomics New Zealand was established as a result of a funding call by FRST, to develop a capability that could be applied to the development of gene-specific personalised foods. This multifaceted programme is established across two CRIs (Plant and Food Research and AgResearch) plus the University of Auckland, involves 55 named scientists (with differing time commitments), and is spread across several New Zealand centres (Auckland (2 sites), Hamilton, Palmerston North, Christchurch and Dunedin). We are using Inflammatory Bowel Diseases, in particular Crohn's disease (CD), as proof of principle. This is an excellent example of a known genetic susceptibility to disease that is affected by environment, including diet. We have established case-control studies of more than 1,000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time, we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means, we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). Knowledge of key human disease SNPs is incorporated into the design of paired reporter gene constructs, whereby isogenic cell lines, with and without the variant SNP of interest, are tested for phenotypic effects of nutrients, bioactive compounds and food extracts, using a robotically controlled high-throughput screen. Lead compounds are then tested in relevant animal models, where the endpoints are defined in terms of transcriptomics, proteomics and metabolomics, in addition to changes in the pathology of the affected individual animals. A multi-centre programme of this size generates a lot of data, both in volume and in variety. Setting up a database for the whole programme has given rise to a lot of challenges.
The programme has also thrown up a variety of statistical problems across a range of disciplines. The solution of these problems has required a mixture of applications of standard techniques and adaptations to meet some novel challenges.


Wednesday 7th October

Simon Greenhill – Computational Evolution Group, University of Auckland

Title: "Using phylogenetics to understand languages and cultures"

Abstract: Languages are the archives of history. Their elements - such as lexicon and grammar - carry historical signal about the people who spoke these languages and their cultures. Biologists have developed a powerful set of statistical phylogenetic methods for answering questions about human prehistory using genetic data. Information from language, however, holds far greater potential for understanding our past. In this talk I will discuss some of the work I have been doing to apply phylogenetic methods to languages. I will use these methods first to test between hypotheses about the settlement of the Pacific, and then to reveal some striking patterns of cultural and linguistic change.


Wednesday 15th July

Kim-Anh Le Cao - ARC Centre of Excellence in Bioinformatics, University of Queensland

Title: "Multivariate analysis on metabolic data"


Monday 18th May

Brendan O’Fallon - University of Washington

Title: "A continuous-fitness coalescent and the impact of weak selection on gene genealogies"

Abstract: Coalescent theory provides an elegant and powerful method for understanding the shape of gene genealogies and resulting patterns of genetic diversity. However, the coalescent does not naturally accommodate the effects of heritable variation in fitness. While some methods are available for studying the effects of strong selection, few tools beyond forward simulation are available for quantifying the impact of weak selection at many sites. I first demonstrate that weak selection affecting multiple sites substantially distorts gene genealogies in several ways. I then describe a continuous-fitness approximation that can accurately describe some of these distortions and which can be used to calculate tree likelihoods given a model of selection at multiple sites. Additionally, I demonstrate that only two parameters, population size and the variance of the distribution describing fitness heritability, are sufficient to accurately describe the effects of significantly more complex and realistic models of mutation and selection.


Monday 27th April

A/Prof Dave Edwards - Australian Centre for Plant Functional Genomics, University of Queensland

Title: "Sequence Analysis of Complex Genomes"

Abstract: I will provide an overview of the different technologies for producing second and third generation DNA sequence data, describe some of the tools available for the analysis of this data and some of the challenges associated with the sequencing of complex genomes. I will include practical examples of second generation sequence data analysis for orphan and complex genomes.


Thursday 23rd April

Eric Libby - Massey University

Title: "The Number of Judges: Balancing Cost and Accuracy from Figure Skating to Grant Reviews"

Abstract: Designing procedures for carrying out fair evaluations is a recurring dilemma faced by modern institutions. In most procedures several independent judgments must be combined to reach a decision. The factors that need to be considered include the accuracy of the judges, the methods for combining the evaluation of each judge, the cost of the judges, and the cost of errors. We describe how the accuracy of decisions depends on the number of judges and their accuracy. Then, assuming that the number of judges is optimal, we demonstrate that there is an implied relationship between the number of judges, the cost per judge, and the cost per error, so that given any two of these values, we can determine the third. We apply the results to a number of examples from sports, academia, and judicial realms. These results clarify the considerations that designers of evaluation procedures need to examine to obtain accurate and cost-effective decisions.


Wednesday 15th April

Professor Saman Halgamuge - University of Melbourne

Title: "Discovering the almost unknown: a bio-inspired approach to Pattern Recognition and Optimization"

Abstract: Finding hidden patterns in data or grouping data is essential for making sense of present-day real-life applications involving large, multi-dimensional, multi-scope data sets. One possible avenue is to use unsupervised learning methods that do not depend on labels or predefined descriptions of the groupings. In Part 1 of this talk, I will discuss an algorithm in this category called Growing Self Organizing Maps, which was developed by my group, and its extension to semi-supervised learning strategies. The main advantage of this algorithm is its ability to adapt its structure using the features of the data set (i.e. to learn), which makes it immensely useful for applications. More broadly, as a precursor to this work and while taking you along the unsupervised learning path, I will also discuss some of my contributions in the area of structure-adapting neural network algorithms. Part 2 of the talk will highlight some of the work conducted by my group in Particle Swarm Optimization and in Bioinformatics. The work conducted in this area includes research in Environmental Genomics and microarray data analysis.


Tuesday 10th March

Peter Waddell - Purdue University

Title: "Effect of models of the rate of evolution of the rate of evolution illustrated with placental mammal data"