IWOCA 2011: Abstracts for Invited Speakers

Cenk Sahinalp. Algorithmic Methods for Structural Variation Detection among Multiple High Throughput Sequenced Genomes.
High throughput sequencing (HTS) technologies have been decreasing the cost and increasing the worldwide capacity for sequence production at an unprecedented rate, making it possible to launch large scale projects that aim to sequence thousands of genomes. The ability to detect structural alterations in HTS personal genomes will change the way diseases of genomic origin are diagnosed and treated. In this talk we will briefly go through some of the algorithm development efforts at the Lab for Computational Biology at SFU for analyzing large collections of HTS genomes and transcriptomes, with the goal of identifying common and rare structural variants with high accuracy. Our algorithms, which we collectively call CommonLAW (Common Loci structural Alteration detection Widgets), move away from the current model of
  1. detecting genomic variants in single HTS donor genomes independently, and
  2. checking whether two or more donor genomes agree or disagree on those variants,
to a new model in which structural variants are detected among multiple genomes and transcriptomes simultaneously.
One of our methods, Comrad, enables the integrated analysis of transcriptome (i.e. RNA) and genome (i.e. DNA) sequence data to discover genomic rearrangements and aberrant transcripts (non-standard mRNAs) in multiple (possibly related) individuals simultaneously.
Pavol Hell. Graph Partitions.
I will discuss a general model of graph partitions that includes many well known partition and colouring problems, many of them arising in the study of perfect graphs. I will describe classification attempts both for the complexity of these problems, and for their forbidden subgraph characterizations. This will include joint work with Tomas Feder, Shekoofeh Nekooei Rizi, Geordie Schell, Juraj Stacho, Jing Huang, Arash Rafiey, Wing Xie, and others.
Tetsuo Asano. Nearest Larger Neighbors Problem and Memory-Constrained Algorithms.
In the first part of this talk we consider the All Nearest Larger Neighbors Problem, defined as follows: "Given a set M of n objects with keys, find for each object in M a nearest object whose key is larger than that of the object, if any exists." The problem can be defined in many different ways. At one extreme we are given an array containing n real numbers, and distances between objects (array elements) are measured by the difference of their indices. At the other extreme we are given a 2-D array with possibly duplicate keys, and distances between two elements are measured by their Euclidean distance. If sufficient work space is available, both problems can be solved efficiently. For example, if the n numbers are given in an array, we can design a linear-time algorithm using a stack. Given a 2-D binary array, that is, one containing only two different values, the problem is known as distance transformation, for which a linear-time algorithm is known. What happens if the amount of work space is limited? We can solve the 1-D problem in O(n log n) time using work space of O(log n) bits even if there are duplicate elements. The 2-D problem can be solved in O(n log n) time using work space of O(log n) bits if all the elements are distinct, but no subquadratic-time algorithm using O(log n) bits is known if we allow duplicate elements. We may also discuss the space-time tradeoff for the 1-D problem.
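
As a concrete illustration of the 1-D case, the following is a minimal Python sketch of the standard two-pass stack technique alluded to above: find the nearest strictly larger element to the left and to the right of each position, and keep the closer of the two. It is an unconstrained linear-time, linear-space version and does not address the O(log n)-bit work space setting discussed in the talk.

  def nearest_larger_neighbors(a):
      """For each index i, return the index of the nearest element strictly
      larger than a[i] (ties in distance broken toward the left), or None.
      Linear time with a stack; uses O(n) extra space."""
      n = len(a)
      left = [None] * n   # nearest strictly larger element to the left
      right = [None] * n  # nearest strictly larger element to the right
      stack = []
      for i in range(n):              # previous-greater pass
          while stack and a[stack[-1]] <= a[i]:
              stack.pop()
          left[i] = stack[-1] if stack else None
          stack.append(i)
      stack = []
      for i in range(n - 1, -1, -1):  # next-greater pass
          while stack and a[stack[-1]] <= a[i]:
              stack.pop()
          right[i] = stack[-1] if stack else None
          stack.append(i)
      ans = []
      for i in range(n):
          l, r = left[i], right[i]
          if l is None:
              ans.append(r)
          elif r is None:
              ans.append(l)
          else:
              ans.append(l if i - l <= r - i else r)
      return ans

For example, nearest_larger_neighbors([4, 1, 3, 5, 2]) returns [3, 0, 3, None, 3]; the maximum element has no larger neighbor.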

The second half of the talk deals with several algorithms that use limited work space for various problems, including problems in computational geometry and image processing. It is reasonable to allow O(sqrt(n)) work space for an image of n pixels. Using this much work space, we can count the number of connected components in a binary image. We can also solve other interesting problems on a binary image.
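
For reference, the following Python sketch shows the underlying task of counting 4-connected components of 1-pixels in a binary image, using an ordinary flood fill. It is only a baseline: it uses O(n) bits of visited marks, whereas the point of the talk is to achieve the same count with roughly O(sqrt(n)) work space, which this sketch does not attempt.

  from collections import deque

  def count_components(image):
      """Count 4-connected components of 1-pixels in a binary image,
      given as a list of equal-length rows of 0/1 values.
      Plain BFS flood fill: O(n) time and O(n) extra space."""
      if not image:
          return 0
      h, w = len(image), len(image[0])
      visited = [[False] * w for _ in range(h)]
      count = 0
      for sy in range(h):
          for sx in range(w):
              if image[sy][sx] == 1 and not visited[sy][sx]:
                  count += 1                     # found a new component
                  visited[sy][sx] = True
                  queue = deque([(sy, sx)])
                  while queue:                   # flood-fill the component
                      y, x = queue.popleft()
                      for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                          ny, nx = y + dy, x + dx
                          if (0 <= ny < h and 0 <= nx < w
                                  and image[ny][nx] == 1
                                  and not visited[ny][nx]):
                              visited[ny][nx] = True
                              queue.append((ny, nx))
      return count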

J. Ian Munro. Creating a Partial Order and Finishing the Sort, with Graph Entropy.
In 1975, Mike Fredman showed that, given a set of distinct values satisfying an arbitrary partial order, one can finish sorting the set in a number of comparisons equal to the information-theoretic bound (Opt) plus at most 2n. However, it appeared that determining which comparisons to perform could take exponential time. Shortly after this, several people, including myself, wondered about the complementary problem, Fredman's "starting point": arranging the elements into a given partial order. My long term conjecture was that one can do this using a number of comparisons equal to the information-theoretic lower bound for the problem, plus a lower order term, plus O(n), through a reduction to multiple selection. Along the way some key contributions to the problems were made, including the development of graph entropy and the work of Andy Yao (1989) and of Jeff Kahn and Jeong Han Kim (1992). Both problems were eventually solved, in joint work with Jean Cardinal, Sam Fiorini, Gwen Joret, and Raph Jungers (2009, 2010), using the techniques of graph entropy, multiple selection and merging; the comparisons to perform can be determined in polynomial time, and their number matches the anticipated Opt + o(Opt) + O(n). The talk will discuss this long term development and some amusing stops along the way.
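
For concreteness, the information-theoretic bounds mentioned above can be stated as follows (a standard formulation, not taken verbatim from the talk). If e(P) denotes the number of linear extensions of a partial order P on n elements, then any comparison-based algorithm needs at least

  \log_2 e(P)

comparisons to finish sorting a set already known to satisfy P, and at least

  \log_2 \frac{n!}{e(P)}

comparisons to arrange an unsorted set into the partial order P: in the first case the e(P) linear extensions must be distinguished, and in the second case each leaf of the comparison tree is consistent with at most e(P) of the n! possible input orders.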