2017 IJCAI IJCAI 2017

The DNA Word Design Problem: A New Constraint Model and New Results

Abstract

A fundamental problem in coding theory concerns the computation of the maximum cardinality of a set S of length n code words over an alphabet of size q, such that every pair of code words has Hamming distance at least d, and the set of additional constraints U on S is satisfied. This problem has application in several areas, one of which is the design of DNA codes where q=4 and the alphabet is {A,C,G,T}. We describe a new constraint model for this problem and demonstrate that it improves on previous solutions (computes better lower bounds) for various instances of the problem. Our approach is based on a clustering of DNA words into small sets of words. Solutions are then obtained as the union of such clusters. Our approach is SAT based: we specify constraints on clusters of DNA words and solve these using a Boolean satisfiability solver.

📈 Trend Setter — Discrete Mathematics
🧭 Keyword Pioneer — dna code design
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Data Science & Analytics, Deep Learning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics
🌉 Interdisciplinary Bridge — Computer Science and Mathematics & Optimization
🐣 Hot Topic Early Bird — constraint satisfaction