DNA- and RNA-Based Computing Systems. Group of authors
Figure 2.4 Representation of surface‐bound DNA sequence.
In step (iii), the sequences corresponding to the satisfaction of each clause are marked by hybridizing these with the complementary sequences corresponding to “vvvvvv.” In step (iv), all single‐stranded sequences remaining after the hybridization are destroyed by treating with Escherichia coli Exonuclease I. In step (v), all hybridized sequences are unmarked to get the single‐stranded molecules for all the remaining surface‐bound sequences. Steps (iii)–(v) are repeated for all the clauses one after another. The unmarked sequences remaining at the end are analyzed in a readout operation using PCR in step (vi).
The procedure is explained using the same illustrative SAT problem (x1∨x2) ∧ (¬x2∨x3) of three variables (x1, x2, x3) described earlier. This problem has a solution space of size 8 (each variable can be either “1” or “0,” so the total solution space is 2³ = 8). All combinations for this problem are shown in the second row (corresponding to the test tube t0) of Table 2.1. The problem involves two clauses, C1 = (x1∨x2) and C2 = (¬x2∨x3), which are checked one by one. C1 = (x1∨x2) fails only when x1x2x3 is {000} or {001}. Therefore, the complementary sequences for all “vvvvvv” except those of the above two combinations are hybridized, and the exonuclease treatment eliminates these two unmarked combinations. The remaining hybridized combinations are then unmarked. Next, C2 = (¬x2∨x3) is checked. C2 is not satisfied by {010} and {110}. All the surviving sequences except these are allowed to hybridize, which marks {011}, {100}, {101}, and {111}. These are unmarked and identified in a readout step using PCR.
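The clause-by-clause elimination described above can be mimicked in software. The following is a minimal sketch, not the laboratory protocol: the names `CLAUSES`, `satisfies`, and `surface_sat` are illustrative assumptions, and each mark, destroy, and unmark cycle is reduced to a simple list filter.

```python
from itertools import product

# Each clause is a list of (variable_index, required_value) literals.
# Encodes (x1 ∨ x2) ∧ (¬x2 ∨ x3); indices 0, 1, 2 stand for x1, x2, x3.
CLAUSES = [
    [(0, 1), (1, 1)],  # C1 = (x1 ∨ x2)
    [(1, 0), (2, 1)],  # C2 = (¬x2 ∨ x3)
]


def satisfies(bits, clause):
    """True if at least one literal of the clause holds for this assignment."""
    return any(bits[index] == value for index, value in clause)


def surface_sat(n_vars, clauses):
    """Keep only the assignments that survive every clause, one clause at a time."""
    # Start with every combination "bound to the surface" (2**n_vars strands).
    surface = list(product((0, 1), repeat=n_vars))
    for clause in clauses:
        # Mark: strands satisfying the clause are hybridized.
        marked = [bits for bits in surface if satisfies(bits, clause)]
        # Destroy: unmarked strands are dropped. Unmark: the survivors carry on.
        surface = marked
    return ["".join(str(b) for b in bits) for bits in surface]


solutions = surface_sat(3, CLAUSES)
print(solutions)
```

For the illustrative problem, the surviving assignments are the four combinations named in the text. The filter is sequential, so the number of cycles grows with the number of clauses, just as steps (iii)–(v) are repeated once per clause.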
2.2.4 Sakamoto's Model
Sakamoto et al. [5] introduced a hairpin formation model for solving an SAT problem using molecular biology techniques. For the illustrative SAT problem (x1∨x2) ∧ (¬x2∨x3), the literal strings (x1, ¬x2), (x1, x3), (x2, ¬x2), and (x2, x3) are formed. A literal string encodes a conjunction of literals, one selected from each clause of the given formula. The literal strings are obtained by concatenating the DNA sequences corresponding to each literal in a ligation step. If a literal string contains a variable in both its original and negated form, it violates the SAT condition of the given problem. The literal strings without such a violation, in which every variable that appears does so in only one form (either original or negated), constitute satisfying solutions to the given SAT problem.
In Sakamoto's model, the possible literal strings are first obtained by ligation. The length of a literal string equals the number of clauses × the number of nucleotides used for each literal. The literal strings are then subjected to temperature variation, which leads to hairpin formation wherever a variable is represented in both its original and negated form. A restriction enzyme cleaves all such hairpins. The cleaved strings are readily eliminated in the subsequent gel electrophoresis operation, in which only the literal strings of the desired length are separated. All the separated literal strings are analyzed using a sequencer, and the solution of the given SAT problem is obtained. Note that the procedure eliminates a large number of unsatisfying literal strings, which makes it easier to deduce the correct solution from the remaining satisfying literal strings. The procedure is therefore useful for large problems. However, it carries a risk of missing some literal strings due to experimental errors, which may lead to an erroneous solution to the given SAT problem.
For the illustrative SAT problem [(x1∨x2) ∧ (¬x2∨x3)], the single‐stranded sequences for the literals are ligated to form the literal strings (x1, ¬x2), (x1, x3), (x2, ¬x2), and (x2, x3). The ligated strings are shown in Figure 2.5. These strings are subjected to restriction enzyme digestion, which eliminates the string (x2, ¬x2), so that the remaining literal strings are (x1, ¬x2), (x1, x3), and (x2, x3). From these three literal strings, a solution to the given SAT problem is deduced by mathematical analysis. Only one literal string is eliminated for this problem because it has only 2² = 4 literal strings [= (number of literals in each clause)^(number of clauses)]. Although the search space is reduced by only 25% in this example (as only one literal string is removed), the reduction is significant for bigger problems. Sakamoto et al. [5] illustrated the benefit of the method by solving a 6‐variable, 10‐clause problem [= (x1 ∨x2 ∨¬ x3) ∧(x1 ∨x3 ∨ x4) ∧(x1 ∨¬ x3 ∨¬ x4) ∧(¬ x1 ∨¬ x3 ∨x4) ∧(x1 ∨¬ x3 ∨x5) ∧(x1 ∨x4 ∨¬ x6) ∧(¬ x1 ∨x3 ∨ x4) ∧(x1 ∨x3 ∨¬ x4) ∧(¬ x1 ∨¬ x3 ∨¬ x4) ∧(¬ x1 ∨ x3 ∨¬ x4)] having three literals in each clause. This example has 3¹⁰ = 59 049 literal strings, of which only 24 were found to satisfy the condition. Thus, ∼99.96% of the search space is eliminated, and the correct solution is obtained by analyzing just 24 literal strings. One of the advantages of this methodology is that the elimination takes place in a single step, unlike the sequential elimination steps used in earlier models.
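The counting argument for the illustrative problem can be checked with a short script. This is a sketch under stated assumptions: `FORMULA`, `literal_strings`, `forms_hairpin`, and `show` are hypothetical names, and the hairpin-plus-digestion step is modeled simply as rejecting any string that contains a variable together with its negation.

```python
from itertools import product

# A literal is (variable_name, is_positive); a clause is a list of literals.
# Encodes (x1 ∨ x2) ∧ (¬x2 ∨ x3).
FORMULA = [
    [("x1", True), ("x2", True)],
    [("x2", False), ("x3", True)],
]


def literal_strings(formula):
    """All ways of picking one literal per clause (the ligation products)."""
    return list(product(*formula))


def forms_hairpin(string):
    """True if some variable occurs in both original and negated form."""
    literals = set(string)
    return any((var, not positive) in literals for var, positive in literals)


def show(string):
    """Readable form of a literal string, e.g. (x1, ¬x2)."""
    return "(" + ", ".join(v if p else "¬" + v for v, p in string) + ")"


all_strings = literal_strings(FORMULA)
survivors = [s for s in all_strings if not forms_hairpin(s)]
eliminated = 1 - len(survivors) / len(all_strings)

print(len(all_strings), [show(s) for s in survivors], eliminated)

# Arithmetic for the 10-clause example, using the reported count of 24 survivors.
total_reported = 3 ** 10
eliminated_reported = round(100 * (1 - 24 / total_reported), 2)
print(total_reported, eliminated_reported)
```

Each surviving literal string fixes only the variables it mentions; any variable left unmentioned is free, which is how satisfying assignments are deduced from the survivors.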
Figure 2.5 Illustration of four literal strings for the DNA hairpin formation‐based computation.
2.2.5 Ouyang's Model
Ouyang et al. [11] solved a maximal clique problem using a DNA computing method. A maximal clique is the largest subset of vertices in the graph in which each vertex is connected to every other vertex in the subset. In the maximal clique problem, the size of the largest clique, in terms of the number of vertices, has to be determined. For example, the graph in Figure 2.6a has five vertices and eight edges, and the vertices (5, 4, 2, 1) form the largest clique; thus the maximum size of the clique is four. Ouyang et al. [11] solved the maximal clique problem using DNA computing as follows.