IWOCA 2011: Abstracts for Session A
Talk A.1: Chris Thachuk * and Anne Condon.
Efficient Codon Optimization with Motif Engineering
It is now common to add protein coding genes into cloning vectors for expression within non-native host organisms. Codon optimization supports translational efficiency of the desired protein product, by exchanging codons which are rarely found in the host organism with more frequently observed codons. Motif engineering, such as removal of restriction enzyme recognition sites or addition of immuno-stimulatory elements, is also often necessary. We present an algorithm for optimizing codon bias of a gene with respect to a well motivated measure of bias, while simultaneously performing motif engineering. The measure is the previously studied codon adaptation index, which favors the use, in the gene to be optimized, of the most abundant codons found in the host genome. We demonstrate the efficiency and effectiveness of our algorithm on the GENCODE dataset and provide a guarantee that the solution found is always optimal.
A coloring of edges of a finite directed graph turns the graph into a finite-state automaton. The synchronizing word of a deterministic automaton is a word in the alphabet of colors (considered as letters) of its edges that maps the automaton to a single state. A coloring of edges of a directed graph of uniform outdegree (constant outdegree of any vertex) is synchronizing if the coloring turns the graph into a deterministic finite automaton possessing a synchronizing word. The road coloring problem is the problem of synchronizing coloring of a directed finite strongly connected graph of uniform outdegree if the greatest common divisor of the lengths of all its cycles is one. The problem posed in 1970 has evoked noticeable interest among the specialists in the theory of graphs, automata, codes, symbolic dynamics as well as among the wide mathematical community. A polynomial time algorithm of O(n3) complexity in the worst case and quadratic in the majority of studied cases for the road coloring of the considered graph is presented below. The work is based on the recent positive solution of the road coloring problem. The algorithm was implemented in the freeware package TESTAS.
Partial words are finite sequences over a finite alphabet that may contain some holes. A variant of Fine and Wilf's theorem states that any partial word with h holes, having periods p, q and length at least the so-denoted L(h, p, q) has also the greatest common divisor of p and q, gcd(p,q), as a period. In this paper, we associate a graph with each p- and q-periodic word, and study two types of vertex connectivity on such a graph: modified degree connectivity and r-set connectivity where r = q mod p. As a result, we give an algorithm for computing L(h, p, q) in the general case.
Talk A.4:
Richard Beal * and Donald Adjeroh.
Parameterized Longest Previous Factor
The longest previous factor (LPF) problem is defined for traditional strings exclusively from the constant alphabet Σ. A parameterized string (p-string) is a sophisticated string composed of symbols from a constant
alphabet Σ and a parameter alphabet Π. We generalize the LPF problem to the parameterized longest previous factor (pLPF) problem defined for p-strings. Subsequently, we present a linear time solution to construct the pLPF array. Given our pLPF algorithm, we show how to construct the pLCP (parameterized longest common prefix) array in linear time. Our algorithm is further exploited to construct the standard LPF and LCP arrays all in linear time.
Talk A.5:
Richard Beal * and Donald Adjeroh.
p-Suffix Sorting as Arithmetic Coding
The challenge of direct parameterized suffix sorting (p-suffix sorting) for a parameterized string (p-string) is the dynamic nature of parameterized suffixes (p-suffixes). In this work, we propose transformative approaches to direct p-suffix sorting by generating and sorting lexicographically numeric fingerprints and arithmetic codes that correspond to individual p-suffixes. Our algorithm to p-suffix sort via fingerprints is the first theoretical linear time algorithm for p-suffix sorting for non-binary parameter alphabets, which assumes that each code is represented by a practical integer. We eliminate the key problems of fingerprints by introducing an algorithm that exploits the ordering of arithmetic codes to sort p-suffixes in linear time on average.
Talk A.6:
Amr Elmasry, Arash Farzan and John Iacono *.
A Unifying Property for Distribution-Sensitive Priority Queues
A priority queue is presented that supports insert in worst-case constant time, and delete, find-min, delete-min, and decrease-key on element x in worst-case
O( lg ( min{wx, qx}+2))
time, where wx (respectively qx) is the number of elements that were inserted after x (respectively before x)
and are still present at the time when operating on x. Our priority queue then has both the working-set and the queueish properties, and more strongly it satisfies these properties in the worst-case sense. We also argue that these bounds are the best possible with respect to the considered measures. Moreover, we modify our priority queue to satisfy a new unifying property ---the time-finger property--- which encapsulates both the working-set and the queueish properties. In addition, we prove that the working-set property is asymptotically equivalent to the unified bound (which is the minimum per operation among the static finger, static optimality, and the working-set bounds). This latter result is of tremendous interest by itself as it had gone unnoticed since the introduction of such bounds by Sleater and Tarjan. Together, these results indicate that our priority queue also satisfies the static-finger, the static-optimality and the unified bounds.
Talk A.7:
Stefan Edelkamp, Amr Elmasry and Jyrki Katajainen *.
Two Constant-Factor-Optimal Realizations of Adaptive Heapsort
In this paper we introduce two priority queues, which are theoretically efficient. For both priority queues, insert requires O(1) amortized time, and extract-min requires O(lg n) worst-case time and involves at most lg n + O(1) element comparisons. One priority queue is based on weak heaps (array-based) and the other on weak queues (pointer-based). The main idea is to temporarily store the inserted elements in a buffer, and once it is full move its elements to the main queue using an efficient bulk-insertion procedure. By employing the new priority queues in adaptive heapsort, we guarantee that the formula for the number of element comparisons performed by the algorithm is optimal up to the constant factor of the high-order term. We denote such performance as constant-factor optimality. In contrast to earlier constant-factor-optimal sorting algorithms, which are mainly of theoretical interest, adaptive heapsort relying on the developed priority queues is practically workable.
Talk A.8:
Cancelled.
Let dq(n,k) be the maximum possible minimum Hamming distance of a linear [n,k] code over GF(q). Tables of best known linear codes exist for all fields up to q=9. In this paper, linear codes
over GF(13) are constructed for k up to 6.
The codes constructed are from the class of quasi-cyclic codes. In addition, the minimum distance of the extended quadratic residue code of length 88 is determined.
Talk A.10:
Dmitry V. Chistikov *.
Testing Monotone Read-Once Functions
A checking test for a monotone read-once function f depending
essentially on all its n variables is a set of vectors M distinguishing f from all other monotone read-once functions of the same variables. We describe an inductive procedure for obtaining individual lower and upper bounds on the minimal number of vectors T(f) in a checking test for any function f. The task of deriving the exact value of T(f) is reduced to a combinatorial optimization problem related to graph connectivity. We show that for almost all functions f expressible by read-once conjunctive or disjunctive normal forms T(f) is
asymptotically equal to n / ln n. For several classes of functions our results give the exact value of T(f).
Talk A.11:
Frank Ruskey and Khalegh Mamakani *.
Generating All Simple Convexly-Drawable Polar Symmetric 6-Venn Diagrams
An n-Venn diagram consists of n curves drawn in the plane in such a way that each of the
2n possible intersections of the interiors and exteriors of the curves forms a connected non-empty region. A Venn diagram is convexly-drawable if it can be drawn with all curves convex and it is simple if at most two curves intersect at any point. A Venn diagram is called polar symmetric if its stereographic projection about the infinite outer face is isomorphic to the projection about the innermost face. We outline an algorithm that shows there are exactly 406 simple convexly drawable polar-symmetric 6-Venn diagrams.
Talk A.12:
Alejandro Erickson * and Mark Schurch.
Enumerating Tatami Mat Arrangements of Square Grids
We prove that the number of monomer-dimer tilings of an n x n square grid, with m < n monomers in which no four tiles
meet at any point is m 2m+(m+1) 2m+1, when m and n have the same parity. In addition, we present a new proof of the result that there are n 2n-1 such tilings with n monomers, which divides the tilings into n classes of
size 2n-1. The sum of these over all m ≤ n has the closed form 2 n-1(3n-4)+2 and, curiously,
this is equal to the sum of the squares of all parts in all compositions of n.
A binary string B of length n = k t is a k-ary Dyck word
if it contains t copies of 1, and the number of 0s in every prefix of B is at most k-1 times the number of 1s.
We provide two loopless algorithms for generating k-ary Dyck words in cool-lex order: (1) The first requires two index variables and assumes k is a constant; (2) The second requires t index variables and works for any k.
We also efficiently rank k-ary Dyck words in cool-lex order.
Our results generalize the “coolCat” algorithm by Ruskey and Williams (Generating balanced parentheses and binary trees by prefix shifts in CATS 2008) and provide the first loopless and ranking applications of the general cool-lex Gray code by Ruskey, Sawada, and Williams (Binary bubble languages and cool-lex order under review).
Talk A.14:
Frank Ruskey and Jennifer Woodcock *.
The Rand and Block Index of Pairs of Set Partitions
The Rand index of two set partitions is the number of pairs {x, y} such that there is a block in one partition
containing both x and y, but x and y are in different blocks in the other partition. Let R(n,k) denote the number of distinct (unordered) pairs of partitions of n that have Rank index k. For fixed k we prove that R(n,k) can be expressed as
Σj ck,j (n choose j) Bn-j.
where ck,j is a non-negative integer and Bn is a Bell number. For fixed k we prove that there is a constant
Kn such that R(n, (n choose 2)-k) can be expressed as a polynomial of degree 2k in n for all n ≥ Kn.
This polynomial is explicitly determined for 0 ≤ k ≤ 11. The block index of two set partitions is the number of elements that are not in common blocks. We give formulae and asymptotics based on N(n), the number of pairs of partitions with no blocks in common. We develop an O(n) algorithm for computing the block index.
Talk A.15:
Ching-Lueh Chang * and Yuh-Dauh Lyuu.
Stable Sets of Threshold-based Cascades on the Erdös-Rényi Random Graphs
Consider the following reversible cascade on the Erdös-Rényi
random graph G(n,p). In round zero, a set of vertices, called the seeds, are active. In round t ∈ Ζ+, a non-isolated vertex is activated or deactivated, respectively, if at least or
less than a ρ ∈ (0,1] fraction of its neighboring vertices are active in round t-1.
An irreversible cascade is defined similarly except that active vertices cannot be deactivated. A set of vertices, S, is said to be stable if no vertex will
ever change its state, from active to inactive or vice versa, once the set of active vertices equals S. For both the reversible and the irreversible cascades, we show that for any constant ε > 0,
all p ∈ [(1+ ε)(ln(e/ρ))/n,1] and with probability
1-n-Ω(1), every stable set of G(n,p) has size O(⌈ρ n⌉)
or n-O(⌈ρ n⌉).
|