# Folding Complex DNA Nanostructures From Limited Sets of Reusable Sequences

Stefan Niekamp, Katy Blumer, Parsa Nafisi, Kathy Tsui, John Garbutt, Shawn M. Douglas*

Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA.

*E-mail: shawn.douglas@ucsf.edu; Phone: +1-415-502-1947; Fax: +1-415-476-5292

## Abstract

Scalable production of DNA nanostructures remains a substantial obstacle to realizing new applications of DNA nanotechnology. Typical DNA nanostructures comprise hundreds of DNA oligonucleotide strands, where each unique strand requires a separate synthesis step. New design methods that reduce the strand count for a given shape while maintaining overall size and complexity would be highly beneficial for efficiently producing DNA nanostructures. Here we report a method for folding a custom template strand by binding individual staple sequences to multiple locations on the template. We built several nanostructures for well-controlled testing of various design rules, and demonstrated folding of a 6-kilobase template by as few as 10 unique strand sequences binding to 10 ± 2 locations on the template strand.

## Introduction

DNA nanotechnology solves an important problem that remains extremely challenging for other engineering platforms, which is the positional control of matter on nanometer length scales (1). Thus, DNA may hold great potential for creating nanoscale tools and devices that could impact many fields including materials science, electronics, and medicine. However, the path from laboratory proofs-of-concept to demand-meeting applications will require further innovation in both design and synthesis of DNA nanostructures.

When developing novel strategies for creation of DNA nanostructures, we can evaluate design choices in the context of how the structure will be used and how it will be made. We might consider the total amount of structures needed, ease of design, initial and marginal costs of synthesis and recovery, minimum yield of well-folded structures, surface addressability, and so on. Certain properties that appear meaningful in one context may be less relevant in another context.

Once functional requirements are chosen, many design parameters can be explored such as tiled (2-4) vs. templated (5, 6) assembly, crossover arrangement (7), helix-axis orientation (8-11), total size in nucleotides (12), and multimerization via sticky ends (13) or base stacking (14). For example, many studies have explored shape diversity by fixing some of these parameters and varying others. Using tiled assembly, hundreds of shapes have been created by fixing the sequences and orientation of a set of strands, and varying which subset of strands is folded. Using a templated approach, similar scaffold sequences have been folded into 2D planar (6), 3D planar (15, 16), 3D lattice (17, 18), gridiron (10), and polyhedral mesh (11) shapes by varying helix-axis orientation and crossover arrangements.

Here we aimed to take a step toward applications of DNA nanotechnology that require large-scale synthesis of complex structures comprising at least 10,000 nucleotides. The future scalability of templated structures appears promising in light of recently reported gram-scale production of single-stranded DNA (ssDNA) scaffold templates in bioreactors (19). However, because each unique template-binding "staple" strand requires a separate synthesis step, large-scale synthesis remains prohibitively expensive for methods that rely on hundreds of unique strands (cost calculations in Supplementary Data). Thus, we sought to reduce the total number of distinct strands necessary to fold a structure (Figure 1A). We report a novel approach to creating DNA nanostructures that reduces the marginal cost of large-scale strand synthesis by reusing individual staple sequences multiple times on the same template. Like Shih-style single-stranded DNA origami (5), our approach comes with some tradeoffs compared to other methods, namely the initial difficulty and costs of strand and template design are increased. Nevertheless, by employing both custom scaffolds and templated assembly, we were able to expand the available design space for parameters such as template length, sequence content, and number of strands. That flexibility allowed us to achieve an order-of-magnitude reduction in the number of unique strands required to fold DNA nanostructures (Figure S1) without significant reductions in size, complexity, or yield.

Designing DNA nanostructures using our method requires some modifications to similar template-based design approaches (Figure 1B). Using caDNAno, a computer-aided design tool for DNA origami (20), we first routed the scaffold to approximate a 3D shape and exported the design from caDNAno as a text file in JSON format. Next, we input the JSON file into a custom Python script (see Materials and Methods) and specified the desired number of unique staples (e.g. 10). The script determines each custom scaffold sequence by generating a random staple layout, repeatedly assigning a set of staple sequences to that layout, and then assigning the appropriate complementary bases to the scaffold. Because highly repetitive DNA sequences can be difficult to synthesize, we sought to minimize scaffold sequence repetitiveness. We created a library of candidate scaffold sequences and selected the top-ranked sequence according to the total fraction of nucleotides that appear in a "repetitive" motif, defined as a 12-base window that appears more than once in the scaffold. Using an approach similar to the MOSIC (21) method for enzyme-mediated production of DNA oligonucleotides, we cloned each scaffold sequence, flanked by hairpins encoding recognition sequences for the restriction enzyme BtsCI, into the helper phage M13K07 for amplification of ssDNA (Figure S2). We purified the ssDNA (22) and then digested it with BtsCI to separate the custom scaffold from the M13K07 vector. The scaffold was then added to folding reactions to create the final shapes.
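The repetitiveness score described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' script: the function name `repetitive_fraction` is hypothetical, the scaffold is treated as a linear sequence, windows may overlap, and only same-strand repeats are counted (reverse complements are ignored). Those choices are assumptions, not details given in the text.

```python
from collections import Counter


def repetitive_fraction(scaffold: str, window: int = 12) -> float:
    """Fraction of bases covered by a `window`-base motif that occurs more than once.

    Assumptions (not from the paper): linear sequence, overlapping windows,
    same-strand matches only.
    """
    seq = scaffold.upper()
    n = len(seq)
    if n < window:
        return 0.0
    starts = range(n - window + 1)
    # Count every window of the given length.
    counts = Counter(seq[i:i + window] for i in starts)
    covered = [False] * n
    for i in starts:
        if counts[seq[i:i + window]] > 1:
            # Mark every base inside a repeated window.
            for j in range(i, i + window):
                covered[j] = True
    return sum(covered) / n


# Toy example: a 12-base unit repeated twice around a 6-base spacer.
unit = "ACGTTGCAAGCT"
print(repetitive_fraction(unit + "GGATCC" + unit))  # 24 of 30 bases are covered
```

Under this definition a lower score means a less repetitive scaffold, so candidate scaffolds would be ranked in ascending order of the score.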

## Materials and Methods

### Design of custom scaffolds

The script and a detailed manual are available for download here: https://github.com/stefanniekamp/ReuseDNA. In brief, besides the number of staples, the following parameters can be changed: the input caDNAno JSON file; the number of iterations (equal to the number of different scaffold versions that will be generated); the number of best designs to display, ranked from lowest to highest degree of repetitiveness in the scaffold sequence; whether predefined or random staple sequences are used; the minimum repeat length for repetitive motifs; and the staple lengths and colors. The output is as depicted in Figure 1B, where the degree of repetitiveness for each scaffold sequence is shown. An example output is also shown in Figure S3. In addition to the plot, caDNAno JSON files and sequences for all scaffolds are generated and saved.
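How the iteration count and the "number of best designs" parameters interact can be illustrated with a toy ranking loop. The sketch below is not the published script and does not read caDNAno files: it assumes a deliberately simplified model in which each scaffold binding site is the full reverse complement of one randomly chosen staple, with no crossovers. All names (`toy_scaffold`, `rank_candidates`) and default values are hypothetical.

```python
import random
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def repetitive_fraction(seq, min_repeat=12):
    """Fraction of bases inside a min_repeat-base window that occurs more than once."""
    starts = range(len(seq) - min_repeat + 1)
    counts = Counter(seq[i:i + min_repeat] for i in starts)
    covered = set()
    for i in starts:
        if counts[seq[i:i + min_repeat]] > 1:
            covered.update(range(i, i + min_repeat))
    return len(covered) / len(seq) if seq else 0.0


def toy_scaffold(staples, n_sites, rng):
    # Simplification: every binding site is one whole staple complement.
    return "".join(reverse_complement(rng.choice(staples)) for _ in range(n_sites))


def rank_candidates(iterations, n_best, n_staples=10, staple_len=42, n_sites=30, seed=0):
    """Generate `iterations` candidate scaffolds and return the `n_best` least repetitive."""
    rng = random.Random(seed)
    scored = []
    for _ in range(iterations):
        staples = ["".join(rng.choice("ACGT") for _ in range(staple_len))
                   for _ in range(n_staples)]
        scaffold = toy_scaffold(staples, n_sites, rng)
        scored.append((repetitive_fraction(scaffold), scaffold))
    scored.sort(key=lambda item: item[0])  # lowest repetitiveness first
    return scored[:n_best]


for score, scaffold in rank_candidates(iterations=20, n_best=3):
    print(f"{score:.2f}  {scaffold[:30]}...")
```

Because whole staples bind contiguously in this toy model, its scores will be much higher than for a real layout in which each staple is split across several helices by crossovers.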

### Cloning of custom scaffolds

Screening device custom scaffold sequence inserts (1082 bp) were ordered in pBluescript from Genewiz (sequences can be found in Supplementary Data). They were then PCR amplified with ctagtaccgcggAGGAATAGGGC and ctagtagagctcGTCGACCCACTC (upper case anneals with scaffold sequence in pBluescript and lower case adds SacII and SacI restriction sites with some extra DNA to allow enzyme to bind, respectively). The amplicon was digested and cloned into M13K07 RF at SacII and SacI restriction sites. Larger custom scaffold sequences for the 24-helix bundles were ordered as DNA blocks from Genewiz and Gibson-cloned into digested (SacII and SacI) M13K07 RF. For Gibson cloning of each of the three designs, three ~2kb DNA blocks were assembled (see block sequences in Supplementary Data) and amplified (primer sequences in Table S2) before cloning into M13K07 RF.
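The primer convention described above (lower-case 5' tail carrying the restriction site, upper-case annealing region) can be checked programmatically. The following sketch splits each primer at the case boundary and locates the SacII (CCGCGG) and SacI (GAGCTC) recognition sequences in the tail. The helper names are hypothetical and the code is only a sanity check, not part of the authors' workflow.

```python
RESTRICTION_SITES = {"SacII": "CCGCGG", "SacI": "GAGCTC"}


def split_primer(primer):
    """Return (lower-case 5' tail, upper-case annealing region)."""
    n_tail = 0
    while n_tail < len(primer) and primer[n_tail].islower():
        n_tail += 1
    return primer[:n_tail], primer[n_tail:]


def sites_in_tail(primer):
    """Map each restriction site found in the tail to its 0-based offset."""
    tail, _ = split_primer(primer)
    tail = tail.upper()
    return {name: tail.find(site)
            for name, site in RESTRICTION_SITES.items() if site in tail}


for primer in ("ctagtaccgcggAGGAATAGGGC", "ctagtagagctcGTCGACCCACTC"):
    tail, anneal = split_primer(primer)
    print(tail, anneal, sites_in_tail(primer))
```

In both primers the recognition site starts at offset 6, so each carries six extra 5' bases ahead of the site.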

### Scaffold amplification and purification

First a 5 ml overnight culture of XL1-Blue cells in LB media with tetracycline was grown. The next day a 500 ml flask with 150 ml of 2xYT medium mixed with 1.5 ml of the saturated overnight culture, 1 ml of 50% glucose, 1.5 ml of 1.3 M MgCl2, tetracycline, and kanamycin was prepared. Then, the culture was incubated at 37 °C with shaking at 250 r.p.m. for about 6 h before the supernatant was harvested by spinning at 4,000 rcf for 15 min at 4 °C. Afterwards, the supernatant was filtered with four layers of Whatman No. 1, followed by a 0.6 µm glass fiber filter, and a 0.2 µm filter. Then PEG 8000 and NaCl were added to final concentrations of 40 g/l and 30 g/l, respectively. Samples were incubated in an ice bath for 30 minutes. Next, the phage was pelleted at 4,000 rcf for 15 min at 4 °C and the supernatant was decanted. Then, the pellet was resuspended in 1/100 of the original culture volume in TE buffer (5 mM Tris pH 8.5 and 1 mM EDTA). Afterwards, residual E. coli cells were pelleted at 15,000 rcf for 15 min at 4 °C and phage supernatant was transferred to a fresh container. This was followed by the addition of 2 volumes of lysis buffer (0.2 M NaOH, 1% SDS) and 1.5 volumes of neutralization buffer (3 M KOAc pH 5.5). The mixture was incubated in an ice-water bath for 15 minutes, and then spun at 16,000 rcf for 15 minutes at 4 °C. Next, the supernatant was transferred into fresh centrifuge bottles. Immediately, 2 volumes of ice cold 100% ethanol were added and mixed by swirling. The mixture was incubated in a -20 °C freezer for 2 hours and spun at 16,000 rcf for 15 minutes at 4 °C afterwards. Next, the supernatant was decanted and 10 ml of ice cold 75% ethanol was added to each centrifuge bottle and mixed by swirling. Afterwards, the mixture was spun at 16,000 rcf for 5 minutes at 4 °C and the supernatant was removed. Finally, the pellet was air dried and resuspended in TE buffer (5 mM Tris pH 8.5 and 1 mM EDTA); the volume will depend on desired final concentration of scaffold.
For the custom-scaffold 24-helix bundle, the purified scaffold was subsequently digested with BtsCI (NEB, Catalog # R0647L) as follows: 10 µl of ssDNA at 100 nM was mixed with 2 µl of CutSmart buffer, 1 µl of BtsCI, and 7 µl of ddH2O and incubated at 50 °C overnight. Note that the final scaffold sequence contains two dsDNA restriction sites for BtsCI (one hairpin at each end, Figure S2) and several ssDNA restriction sites for BtsCI. Because BtsCI has a significantly higher affinity for dsDNA, cleavage at the ssDNA sites was not a concern.
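As a quick arithmetic check of the digestion recipe, the short sketch below sums the listed volumes and derives the final ssDNA concentration and the buffer dilution. The dictionary keys are illustrative labels. Reading the 10-fold buffer dilution as a 1X working concentration assumes a 10X buffer stock, which the text does not state.

```python
# Volumes in microliters, taken from the digestion recipe above.
components_ul = {
    "ssDNA (100 nM)": 10.0,
    "CutSmart buffer": 2.0,
    "BtsCI": 1.0,
    "ddH2O": 7.0,
}

total_ul = sum(components_ul.values())
# Dilution of the 100 nM ssDNA stock into the full reaction volume.
final_ssdna_nM = 100.0 * components_ul["ssDNA (100 nM)"] / total_ul
# Fold dilution of the buffer stock (1X final if the stock is 10X; assumption).
buffer_dilution = total_ul / components_ul["CutSmart buffer"]

print(total_ul, final_ssdna_nM, buffer_dilution)
```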

### Oligonucleotides

All oligonucleotides were ordered from IDT and resuspended in 5 mM Tris pH 8.5 and 1 mM EDTA.

### Molecular self-assembly reactions and purification

Scaffold (final concentration 20 nM) and staples were mixed in 5 mM Tris pH 8.5, 1 mM EDTA and 18 mM MgCl2. Unless specified otherwise, staples for standard DNA origami were added at a final concentration of 200 nM each, which equals a 10-fold excess per corresponding scaffold binding site; staples for custom-scaffold DNA origami were likewise added at a 10-fold excess over their corresponding scaffold binding sites. The mixture was annealed with the following temperature ramp: denaturation at 65 °C for 15 min followed by cooling from 62 °C to 35 °C with a decrease of 1 °C per 2 h. Then the reaction was held at 12 °C for at least 30 min. Afterwards, products were analyzed by 2% agarose gel electrophoresis in TBE (45 mM Tris-borate and 1 mM EDTA) with 11 mM MgCl2 and purified by extraction and centrifugation in Freeze 'N Squeeze columns.
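The stoichiometry and ramp duration implied by this protocol can be worked through as follows. The sketch assumes that a staple binding several sites on the custom scaffold is supplied at the per-site excess multiplied by its number of binding sites, and that the ramp consists of 27 one-degree steps of 2 h each. Both readings are interpretations of the text, and the function names are hypothetical.

```python
def staple_concentration_nM(scaffold_nM, fold_excess, binding_sites=1):
    """Staple concentration giving `fold_excess` per scaffold binding site."""
    return scaffold_nM * fold_excess * binding_sites


def ramp_hours(start_c, end_c, hours_per_degree):
    """Duration of a linear cooling ramp, counting one step per degree."""
    return (start_c - end_c) * hours_per_degree


# Standard DNA origami: each staple binds one site on a 20 nM scaffold.
print(staple_concentration_nM(20, 10))       # 200 nM
# Custom scaffold: a staple reused at 10 sites (illustrative value).
print(staple_concentration_nM(20, 10, 10))   # 2000 nM
# Cooling from 62 to 35 degrees C at 1 degree per 2 h.
print(ramp_hours(62, 35, 2))                 # 54 h
```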

### Agarose gel-based yield estimation

Agarose gel-based yield estimation was carried out using ImageJ (http://rsb.info.nih.gov/ij/). The percentage of structure that ran as a monomeric, leading band was estimated as the background-subtracted integrated intensity of the leading band divided by the background-subtracted integrated intensity of the region extending from the well down to the bottom of the leading band.
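The ratio described above can be expressed as a small function operating on a one-dimensional lane intensity profile. This is a sketch under stated assumptions, not the ImageJ procedure itself: the profile is assumed to run from the well (index 0) down the lane, the background is a single constant value, and the band boundaries are supplied by hand. The name `monomer_fraction` and the example numbers are hypothetical.

```python
def monomer_fraction(lane_profile, band_start, band_end, background=0.0):
    """Leading-band intensity divided by intensity from the well to the band bottom.

    lane_profile: intensities along the lane, index 0 at the well.
    band_start, band_end: slice indices of the leading band (end exclusive).
    background: constant background subtracted from every value (assumption).
    """
    corrected = [value - background for value in lane_profile]
    band = sum(corrected[band_start:band_end])
    total = sum(corrected[:band_end])  # well down to the bottom of the band
    if total <= 0:
        raise ValueError("no signal above background in the selected region")
    return band / total


# Illustrative profile: signal in the well (first two points) and in the band.
profile = [12, 12, 2, 2, 2, 32, 32, 2]
print(monomer_fraction(profile, band_start=5, band_end=7, background=2))  # 0.75
```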

### Transmission electron microscopy

6 µl of the purified folding reaction product was applied to glow-discharged, carbon-coated, 400 mesh formvar grids (Electron Microscopy Sciences), incubated for 1 min, blotted off and stained with 2% (w/v) aqueous uranyl formate solution. The electron micrographs were collected with a FEI TECNAI T20 transmission electron microscope and a Tietz TVIPS 8k camera at a nominal magnification of 46,000x. Particles were picked and class averages were calculated with EMAN2. The number of particles picked for the class averages in Figures 3A, B and C was between 220 and 295.

## Results and Discussion

Adapting a template-based design strategy for reduced strand counts without reducing the nucleotide count required the introduction of repetitive elements into our custom scaffold sequences. To assess the feasibility of our approach and to examine some initial design parameters, we created a DNA origami screening device with a central core domain and two opposing "antennae" (Figure 2). Each antenna is a six-helix bundle folded from a 1082-base scaffold segment. The antennae can be distinguished by an asymmetric domain in the core structure (Figure 2A). A series of "control" and "test" antenna pairs were designed (Figure S4). Various control antennae were designed using the standard DNA origami method with a fully unique staple set. We cloned several custom scaffold segments at the location of the second antenna for testing.

We analyzed the influence of four parameters on custom sequence repetitiveness and antennae folding yield: scaffold crossover density, staple crossover density, staple length, and repetitiveness of staple crossover arrangements. We again scored repetitiveness using a 12-base window. Crossover densities were ranked by counting the number of crossovers per 1000 nucleotides. We analyzed designs with "short" staples (25-62 bases) and "long" staples (63-125 bases), listed in Table S1. We also tested one design with highly repetitive staple crossover arrangements. That is, when the same staple binds to the scaffold in different locations, the crossover positions tend to occur at identical phosphate positions within the staple (Figure S4).

We tested three parameters across six designs, and devised a 3-letter abbreviation to identify each parameter set (Short or Long staple length, Low or High staple crossover density, and Low or High scaffold crossover density). Thus our six designs can be designated as LLL, SLL, SHL, LLH, SLH, and LHH (Figures 2B and S5 and Table S1). After running the folding products on an agarose gel, we isolated the leading bands for all six designs by physical extraction (Figure S6) and determined relative yields of antenna domains by manually counting the percentage of well-folded custom scaffold antennae as visualized by negative-stain transmission electron microscopy. We normalized yields by the percentage of well-folded control antennae attached to the same DNA origami screening devices (Figures 2C and S7). Of the six versions we observed that the three least-repetitive designs folded with the highest yields ranging from 96% to 98%. The three most-repetitive designs folded to lower yields ranging from 69% to 88% (Figures 2D and E). Hence, there seems to be an inverse correlation between the yield of correctly folded antennae and the degree of repetitiveness in the scaffold sequence. Comparing the SHL and SLH designs, which have short staples and a similar combined number of staple and scaffold crossovers (Table S1), we observed that the SHL design with its repetitive staple crossover arrangement has a much higher sequence redundancy and lower yield. This may indicate that repetitive staple crossover arrangements can compromise folding yield. When we compare SLL and LLL designs, it appears that using longer staples improved the yield, perhaps due to the lower degree of repetitiveness in the scaffold sequence. In light of these data, we designed subsequent shapes using a high density of staple and scaffold crossovers, longer staples, and non-repetitive staple crossover arrangements.

We next set out to determine how few unique staple sequences could be used to fold a large (>10,000 nucleotide) DNA nanostructure without significantly compromising the folding yield (Figure 3). We designed a set of 24-helix bundles with 6-kilobase scaffolds, and tested versions designed to fold using 10, 15, or 20 unique staple sequences (Figures 3A, 3B, and 3C, respectively). For comparison, a similar shape designed using the DNA origami method requires approximately 150 unique staples. Thus, the designs with 10, 15, and 20 unique staple sequences reduce the number of distinct strands by 15-, 10-, and 7.5-fold, respectively. We generated scaffold sequences with 56% to 39% repetitiveness using a 12-base window (Figures S8 and S9). For comparison, the standard DNA origami scaffold M13mp18 contains only 2% sequence redundancy by this measure. We followed the design strategy described above, allowing the script to generate staple sequences with lengths ranging from 38 to 77 bases and a crossover density of 240 crossovers per 1000 nucleotides. We successfully folded all three versions, as can be seen in the transmission electron micrograph class averages (Figures 3A-C) and representative micrographs of individual particles (Figures 3D and S10).
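The fold reductions quoted above follow directly from dividing the approximate staple count of a conventional design by the number of unique staples in each custom design. A minimal check, with a hypothetical helper name:

```python
def fold_reduction(reference_count, reduced_count):
    """How many times fewer unique strands the reduced design needs."""
    return reference_count / reduced_count


# ~150 unique staples for a comparable conventional DNA origami design.
for unique_staples in (10, 15, 20):
    print(unique_staples, fold_reduction(150, unique_staples))
```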

The fraction of structures that migrated as a monomeric species in gel electrophoresis was estimated as the integrated intensity of the leading band divided by the total intensity of the gel lane up to and including the well (Figure S11). Here we found 48%, 59% and 61% of the material in the leading bands for the designs with 10, 15 and 20 different staple sequences, respectively. Subsequently, the yield estimate was refined by manually counting the percentage of well-folded particles from purified structures as seen in electron micrographs. For the designs with 10, 15 and 20 different staple sequences we counted 55%, 69% and 83% intact structures, and thus calculated absolute yields of 26%, 41%, and 51%, respectively (Figure 3E). We observed an inverse correlation between the yield of intact structures and the repetitiveness in the scaffold sequence.
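The absolute yields are the product of the two fractions reported above: the gel-based monomer fraction and the TEM-based fraction of intact particles among purified structures. The following sketch reproduces the calculation from the reported percentages; the variable names are illustrative.

```python
# Keys are the number of unique staple sequences in each design.
gel_monomer_fraction = {10: 0.48, 15: 0.59, 20: 0.61}
tem_intact_fraction = {10: 0.55, 15: 0.69, 20: 0.83}

# Absolute yield = fraction in leading band x fraction intact by TEM.
absolute_yield = {
    n: gel_monomer_fraction[n] * tem_intact_fraction[n]
    for n in gel_monomer_fraction
}

for n, value in absolute_yield.items():
    print(f"{n} staples: {round(100 * value)}%")
```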

Finally, we quantified the impact of staple-to-scaffold concentration on the folding yield of our designs. We folded structures with 2-, 3-, 4-, 5-, 6-, 8-, and 10-to-1 ratios of staple-to-scaffold binding sites and measured the folding yield by gel electrophoresis (Figure S12). This study was carried out with the 24-helix bundle with 10 unique staple sequences. We noted relatively similar yields (21%-22%) for the folding reactions with 2- and 3-fold staple excess, higher yields (33%-42%) for the reactions with 4-, 5-, and 6-fold excess, and the highest yields (49%-50%) for the assemblies with 8- and 10-fold excess of staples.

In conclusion, we devised a novel DNA nanostructure design approach employing custom scaffolds that allows for successful folding of large (>10,000 nucleotide) structures with an order-of-magnitude reduction in staple count compared to similar template-based shapes. We tested several combinations of design parameters, namely staple length, staple and scaffold crossover density, and total number of strands. Future exploration of design space and fine-tuning of low-level parameters may further boost yields and reduce the number of strands required for folding. We hope that our approach will provide useful inspiration in realizing applications of DNA self-assembly that require large-scale production of complex nanostructures.

Supplementary Data: DNA nanostructure design schematics, DNA sequences, TEM images, agarose gels, custom Python script, additional figures and tables.

Notes: The authors declare no competing financial interest.

Acknowledgements: This work was funded by grants from the Del E. Webb Foundation (13-2-28), the Army Research Office (W911NF-14-1-0507), and National Science Foundation (CCF-1317640). We thank H. Tran and E. Palovcak for helpful comments on the manuscript.

## References

1. Seeman,N.C. (2010) Nanomaterials based on DNA. Annu. Rev. Biochem., 79, 65-87.
2. Winfree,E., Liu,F., Wenzler,L.A. and Seeman,N.C. (1998) Design and self-assembly of two-dimensional DNA crystals. Nature, 394, 539-544.
3. Wei,B., Dai,M. and Yin,P. (2012) Complex shapes self-assembled from single-stranded DNA tiles. Nature, 485, 623-626.
4. Ke,Y., Ong,L.L., Shih,W.M. and Yin,P. (2012) Three-dimensional structures self-assembled from DNA bricks. Science, 338, 1177-1183.
5. Shih,W.M., Quispe,J.D. and Joyce,G.F. (2004) A 1.7-kilobase single-stranded DNA that folds into a nanoscale octahedron. Nature, 427, 618-621.
6. Rothemund,P.W.K. (2006) Folding DNA to create nanoscale shapes and patterns. Nature, 440, 297-302.
7. Martin,T.G. and Dietz,H. (2012) Magnesium-free self-assembly of multi-layer DNA objects. Nat. Commun., 3, 1103.
8. Dietz,H., Douglas,S.M. and Shih,W.M. (2009) Folding DNA into twisted and curved nanoscale shapes. Science, 325, 725-730.
9. Han,D., Pal,S., Nangreave,J., Deng,Z., Liu,Y. and Yan,H. (2011) DNA origami with complex curvatures in three-dimensional space. Science, 332, 342-346.
10. Han,D., Pal,S., Yang,Y., Jiang,S., Nangreave,J., Liu,Y. and Yan,H. (2013) DNA gridiron nanostructures based on four-arm junctions. Science, 339, 1412-1415.
11. Benson,E., Mohammed,A., Gardell,J., Masich,S., Czeizler,E., Orponen,P. and Hogberg,B. (2015) DNA rendering of polyhedral meshes at the nanoscale. Nature, 523, 441-444.
12. Said,H., Schuller,V.J., Eber,F.J., Wege,C., Liedl,T. and Richert,C. (2013) M1.3--a small scaffold for DNA origami. Nanoscale, 5, 284-290.
13. Aldaye,F.A., Lo,P.K., Karam,P., McLaughlin,C.K., Cosa,G. and Sleiman,H.F. (2009) Modular construction of DNA nanotubes of tunable geometry and single- or double-stranded character. Nat. Nanotechnol., 4, 349-352.
14. Gerling,T., Wagenbauer,K.F., Neuner,A.M. and Dietz,H. (2015) Dynamic DNA devices and assemblies formed by shape-complementary, non-base pairing 3D components. Science, 347, 1446-1452.
15. Ke,Y., Sharma,J., Liu,M., Jahn,K., Liu,Y. and Yan,H. (2009) Scaffolded DNA origami of a DNA tetrahedron molecular container. Nano Lett., 9, 2445-2447.
16. Andersen,E.S., Dong,M., Nielsen,M.M., Jahn,K., Subramani,R., Mamdouh,W., Golas,M.M., Sander,B., Stark,H., Oliveira,C.L.P., et al. (2009) Self-assembly of a nanoscale DNA box with a controllable lid. Nature, 459, 73-76.
17. Douglas,S.M., Dietz,H., Liedl,T., Hogberg,B., Graf,F. and Shih,W.M. (2009) Self-assembly of DNA into nanoscale three-dimensional shapes. Nature, 459, 414-418.
18. Ke,Y., Douglas,S.M., Liu,M., Sharma,J., Cheng,A., Leung,A., Liu,Y., Shih,W.M. and Yan,H. (2009) Multilayer DNA origami packed on a square lattice. J. Am. Chem. Soc., 131, 15903-15908.
19. Kick,B., Praetorius,F., Dietz,H. and Weuster-Botz,D. (2015) Efficient Production of Single-Stranded Phage DNA as Scaffolds for DNA Origami. Nano Lett., 15, 4672-4676.
20. Douglas,S.M., Marblestone,A.H., Teerapittayanon,S., Vazquez,A., Church,G.M. and Shih,W.M. (2009) Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Res., 37, 5001-5006.
21. Ducani,C., Kaul,C., Moche,M., Shih,W.M. and Hogberg,B. (2013) Enzymatic production of 'monoclonal stoichiometric' single-stranded DNA oligonucleotides. Nat. Methods, 10, 647-652.
22. Douglas,S.M., Chou,J.J. and Shih,W.M. (2007) DNA-nanotube-induced alignment of membrane proteins for NMR structure determination. Proc. Natl. Acad. Sci. U. S. A., 104, 6644-6648.