There is ongoing debate about what proportion of large eukaryotic genomes consists of non-functional ‘junk’ DNA. This debate centres on findings from the Human Genome Project. On one hand, less than 10% of the human genome shows signatures of selection and only ~2% has been shown to be unambiguously functional; on the other, almost all of the genome shows evidence of genomic activity in one form or another. So, are these pervasive genomic activities evidence that most of the genome encodes some function, or are they simply ‘background noise’ arising from junk DNA?
To resolve this debate, we are using a process we pioneered to generate synthetic random-sequence chromosomes de novo. As these random chromosomes have never been shaped by selection, they will allow us to determine whether bona fide non-functional DNA exhibits pervasive genomic activities similar to those seen in noncoding regions of genomes. The process we developed leverages the ability of the enzyme Terminal deoxynucleotidyl Transferase (TdT) to randomly add dNTPs to an oligonucleotide. With this process, we can generate >10^16 distinct kilobase-length random sequences in a single overnight reaction. These random sequences are then stitched together, alongside elements for chromosome maintenance and selection, to generate a series of random chromosomes. We will introduce these random chromosomes into yeast and human cells, and measure genomic activities including transcription, DNA methylation, and chromatin structure. Using approaches such as machine learning, we will then use these data to generate profiles for our synthetic junk DNA and for known functional genomic elements. Ultimately, these profiles will allow us to pinpoint which genomic regions are likely to be junk, which are likely to carry known but unannotated functions, and which may harbour yet-undiscovered functions.
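As a purely illustrative in silico analogue of the random-sequence step, the sketch below draws kilobase-length sequences one nucleotide at a time. It is not the wet-lab TdT protocol: the function names, the fixed seed, and the default of equal incorporation probabilities for all four nucleotides are assumptions made for the example, and real TdT incorporation may well be biased.

```python
import random

NUCLEOTIDES = "ACGT"


def random_sequence(length, weights=(0.25, 0.25, 0.25, 0.25), rng=None):
    """Draw one random DNA sequence of the given length.

    `weights` gives hypothetical incorporation probabilities for A, C, G, T;
    the uniform default is an assumption, not a measured property of TdT.
    """
    rng = rng or random.Random()
    return "".join(rng.choices(NUCLEOTIDES, weights=weights, k=length))


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# A small simulated library of kilobase-length sequences (seed fixed for repeatability).
rng = random.Random(0)
library = [random_sequence(1000, rng=rng) for _ in range(5)]

# The space of possible 1 kb sequences (4**1000) is vastly larger than 10**16,
# so a pool of 10**16 random molecules can consist almost entirely of distinct sequences.
sequence_space_exceeds_pool = 4 ** 1000 > 10 ** 16

for seq in library:
    print(len(seq), round(gc_content(seq), 3))
```

Simulated sequences of this kind could serve as a computational reference point when thinking about what a selection-free sequence looks like, but they carry no information about transcription, methylation, or chromatin structure, which is why the measurements have to be made on real random chromosomes in cells.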