Needleman Wunsch Algorithm

200 Views Asked by At

I have written a python code for Needleman Wunsch algorithm. My code gives only one possible global alignment out of many possible global alignments between two DNA sequences. The code is:

import numpy as np
x = np.array(['T','C','G','T','C','C','T','T','C','A','T','T'])
y = np.array(['G','T','C','T','C','A','T','G'])
Match = 1
Mismatch = 0
Gap = -1
m = len(x)
n = len(y)

def S(a, b):
    if (a == b):
        return(Match)
    else:
        return(Mismatch)

F = np.zeros((m + 1, n + 1))
for i in range(0, (m + 1)):
    F[i, 0] = Gap * i
for j in range(0, (n + 1)):
    F[0, j] = Gap * j
for i in range(1, (m + 1)):
    for j in range(1, (n + 1)):
        D = F[i - 1, j - 1] + S(x[i - 1], y[j - 1])
        V = F[i - 1, j] + Gap
        H = F[i, j - 1] + Gap
        F[i, j] = max(np.array([D, V, H]))
F

S1 = []
S2 = []

i = m
j = n

while (i > 0 or j > 0):
    if (i > 0 and j > 0 and F[i, j] == F[i - 1, j - 1] + S(x[i - 1], y[j - 1])):
        S1 = S1 + [x[i - 1]]
        S2 = S2 + [y[j - 1]]
        i = i - 1
        j = j - 1
    elif (i > 0 and F[i, j] == F[i - 1, j] + Gap):
        S1 = S1 + [x[i - 1]]
        S2 = S2 + ['-']
        i = i - 1
    else:
        S1 = S1 + ['-']
        S2 = S2 + [y[j - 1]]
        j = j - 1

print(np.array([S1, S2]).reshape(2, len(S1)))

Now how to modify this code so as to get all possible alignments? Thanks in advance.

Needleman Wunsch Algorithm

1

There are 1 best solutions below

0
On

Well, the keywords should be dynamic programming, python.

In order to get all optimal solutions you have to maintain in each step $(i,j)$ a list $L(i,j)$ which shows where the maximum has been attained. There are three such possibilities: $(i-1,j), (i,j-1), (i-1,j-1)$ or simply north (n), west (w), and north-west (nw). Then you need a backtracking algorithm in order to traverse all possible paths in the list from $(m,n)$ to $(0,0)$.