DNA String Alignment

Coded in C++

This assignment focused on writing a program that reads two DNA strings from an input file and computes their optimal alignment. It introduced key ideas from computational biology, where computers are used to analyze biological data, and from dynamic programming, a technique that solves a complex problem by breaking it into smaller overlapping subproblems and reusing their stored results. Pair programming was encouraged to support collaboration and problem-solving.

To implement the solution, I used a two-dimensional matrix to store edit distances between substrings of the two inputs. Each cell held the minimum cost of aligning one pair of substrings, and the matrix was filled using a recurrence that picked the cheapest of three operations: insertion, deletion, or substitution. After filling the matrix, I traced back through it to build the optimal alignment, checking at each step whether the cell's value came from a match, a mismatch, or a gap. I also wrote helper functions to calculate penalties, find the minimum of three values, and construct the alignment. One issue I encountered was that the output appeared reversed, because the traceback produced the alignment in the opposite order from the one in which it is printed. I fixed this by reversing the result before printing.
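The approach described above can be sketched roughly as follows. This is a minimal illustration, not the original assignment code: the penalty values (match 0, mismatch 1, gap 2), the prefix-based table layout, and the names `kGapPenalty`, `penalty`, `Alignment`, and `align` are all assumptions made for the example. It fills the matrix with the three-way recurrence, traces back from the last cell, and reverses the two rows at the end, which is one way the reversed-output issue can arise and be fixed.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Assumed penalty scheme (not stated in the write-up): match 0, mismatch 1, gap 2.
const int kGapPenalty = 2;

// Cost of placing characters a and b in the same column.
int penalty(char a, char b) { return a == b ? 0 : 1; }

// Hypothetical result type: total cost plus the two gapped rows.
struct Alignment {
    int cost;
    std::string top;
    std::string bottom;
};

Alignment align(const std::string& x, const std::string& y) {
    const std::size_t m = x.size(), n = y.size();

    // opt[i][j] = minimum cost of aligning the first i characters of x
    // with the first j characters of y.
    std::vector<std::vector<int>> opt(m + 1, std::vector<int>(n + 1, 0));
    for (std::size_t i = 1; i <= m; ++i) opt[i][0] = static_cast<int>(i) * kGapPenalty;
    for (std::size_t j = 1; j <= n; ++j) opt[0][j] = static_cast<int>(j) * kGapPenalty;

    // Fill the table: each cell takes the cheapest of the three operations.
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            int substitute = opt[i - 1][j - 1] + penalty(x[i - 1], y[j - 1]);
            int gapInY = opt[i - 1][j] + kGapPenalty;  // x[i-1] paired with '-'
            int gapInX = opt[i][j - 1] + kGapPenalty;  // y[j-1] paired with '-'
            opt[i][j] = std::min({substitute, gapInY, gapInX});
        }
    }

    // Trace back from the bottom-right cell; columns are collected last-to-first.
    std::string top, bottom;
    std::size_t i = m, j = n;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 &&
            opt[i][j] == opt[i - 1][j - 1] + penalty(x[i - 1], y[j - 1])) {
            top += x[i - 1]; bottom += y[j - 1]; --i; --j;  // match or mismatch
        } else if (i > 0 && opt[i][j] == opt[i - 1][j] + kGapPenalty) {
            top += x[i - 1]; bottom += '-'; --i;            // gap in y
        } else {
            top += '-'; bottom += y[j - 1]; --j;            // gap in x
        }
    }

    // The rows were built backwards, so reverse them before returning.
    std::reverse(top.begin(), top.end());
    std::reverse(bottom.begin(), bottom.end());
    assert(top.size() == bottom.size());
    return {opt[m][n], top, bottom};
}
```

Under these assumed penalties, `align("AC", "A")` returns a cost of 2 with the rows `AC` and `A-`. When several alignments share the same minimum cost, the order of the checks in the traceback decides which one is returned, so a different implementation may print a different but equally cheap alignment.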

Overall, this assignment helped me better understand dynamic programming and how it can simplify complex computations. It also gave me insight into how programming techniques are applied in fields like healthcare and biology, reinforcing my interest in the intersection of technology and science.