Audio examples


G Sequence

An artificial test DNA sequence that consists of (GGG)n:

GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

G sequence
(default settings)

This simple mononucleotide sequence is useful for understanding the sonification output and highlights the characteristic triplet note phrasing that is the basis of the subsequent auditory displays. Since the same codon motif occurs in each reading frame, the same note is played on each of the three instruments, giving rise to a highly repetitive pattern. Additionally, no instrument is interrupted, which reflects the fact that the sequence contains no start or stop codons.
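The triplet phrasing can be illustrated with a short Python sketch. It assumes that the sequence is read one base at a time and that the codon starting at base i is assigned to reading frame (i mod 3) + 1. The function name `codon_stream` is hypothetical and is not taken from the sonification software.

```python
def codon_stream(seq):
    """Yield (frame, codon) pairs, sliding along the sequence one base at a time.

    Assumption of this sketch: the codon starting at 0-based index i belongs
    to reading frame (i % 3) + 1, so the frames cycle 1, 2, 3, 1, 2, 3, ...
    """
    seq = seq.upper()
    for i in range(len(seq) - 2):
        yield (i % 3) + 1, seq[i:i + 3]


g_sequence = "G" * 30  # a short (GGG)n test sequence
notes = list(codon_stream(g_sequence))

print(notes[:6])                        # first two triplets of notes
print({codon for _, codon in notes})    # distinct codons in the whole stream
```

Every frame sees only GGG, so if each codon maps to one note, all three instruments would play the same note throughout.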


Mutated G Sequence

Test DNA sequence (GGG)n with a G->T point mutation:

GGGGGGGGGGGGGGGTGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

Mutated G Sequence
(default settings)

The introduction of a point mutation into the 'G Sequence' causes only a transient change of one note in each reading frame/instrument (i.e. a change of up to three notes in one triplet, allowing for degeneracy in the genetic code). This is exemplified by a pronounced change in the auditory display of the 'Mutated G Sequence' at approx. 3 seconds from the start; no further change is evident. The mutation in this simple sequence is clearly heard through sonification.
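To see why a single base change touches at most one codon per reading frame, the following Python sketch compares the overlapping codons of a reference and a mutated sequence. The helper `changed_codons` and the frame numbering (codon at index i belongs to frame (i mod 3) + 1) are assumptions made for illustration only.

```python
def changed_codons(reference, mutant):
    """Return (frame, start, ref_codon, mut_codon) for every codon that differs.

    Codons are read one base at a time; the codon at 0-based index i is
    assigned to reading frame (i % 3) + 1 (an assumption of this sketch).
    """
    changes = []
    for i in range(min(len(reference), len(mutant)) - 2):
        ref_codon, mut_codon = reference[i:i + 3], mutant[i:i + 3]
        if ref_codon != mut_codon:
            changes.append(((i % 3) + 1, i, ref_codon, mut_codon))
    return changes


reference = "G" * 30
mutant = reference[:15] + "T" + reference[16:]  # G->T at 0-based index 15

for change in changed_codons(reference, mutant):
    print(change)
```

Exactly three overlapping codons contain the mutated base, one in each frame. If notes were assigned per amino acid rather than per codon, synonymous codons such as GGG and GGT (both glycine) would sound the same, which is why fewer than three notes may change.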


G Sequence with STOP in the 1st Reading Frame

The sequence is similar to the (GGG)n except for the stop codon:

GGGGGGGGGGGGGGGTAAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

STOP in the 1st reading frame
(default settings)

This sounds the same as the 'G Sequence' until the stop codon is played (at about 3 seconds from the beginning). This stop codon occurs in the first reading frame, which maps to instrument 1 (piano), and therefore the piano becomes silent for the remainder of the sonification. Following the stop codon, the characteristic triplet note phrasing is replaced by a two-note phrasing for the remainder of the auditory display (remember that it is now missing the piano), with a rest beat in place of the absent audio.
The stop codon itself (TAA) is silent in frame one (as is the remainder of that frame), whereas in frames two and three it overlaps two bases into AAG and one base into AGG, respectively (remember we are moving along the sequence one base at a time). This gives rise to two distinct notes in each of these frames before the reoccurrence of GGG.
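The overlap between the frame-one stop codon and the neighbouring codons in frames two and three can be checked with a small Python sketch. The helper `codons_by_frame` is hypothetical, and the test sequence is a shortened version of the one shown above (15 G bases, TAA, then more G bases).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_by_frame(seq):
    """Return {frame: [(start, codon), ...]} with non-overlapping codons per frame."""
    frames = {}
    for offset in range(3):
        frames[offset + 1] = [
            (i, seq[i:i + 3]) for i in range(offset, len(seq) - 2, 3)
        ]
    return frames


seq = "G" * 15 + "TAA" + "G" * 12  # shortened 'STOP in the 1st reading frame' example
frames = codons_by_frame(seq)

for frame, codons in frames.items():
    around_stop = [codon for start, codon in codons if 12 <= start <= 19]
    has_stop = any(codon in STOP_CODONS for _, codon in codons)
    print(frame, around_stop, "contains stop:", has_stop)
```

In this sketch only frame one contains a stop codon; frames two and three each pass through two non-GGG codons (GGT then AAG, and GTA then AGG) before returning to GGG.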


G Sequence with STOP in all Reading Frames

Again similar to the (GGG)n but with stop codons in all frames:

GGGGGGGGGGGGGGGTAAGTAAGTAAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

STOP in all reading frames
(default settings)

(Restart after 10 codons)

Same as above until the stop codons are parsed; beyond this point (approx. 4 seconds) the audio streams from all reading frames become silent for the remainder of the sonification. In the second sonification of this sequence, the 'Restart after 10 codons' option was selected, which forces the audio to restart in all frames after a short period of silence (10 codons) even in the absence of an ATG start codon.


STOP in All, START in 1st RF

Repetitive G sequences with stop codons in all three reading frames followed by a start codon in the 1st frame.

GGGGGGGGGGGGGGGTAAGTAAGTAAGGGGGGGGGGATGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

STOP in all reading frames START in 1st RF
(default settings)

Same as 'STOP in all reading frames' above up to the point where all reading frames become silent. However, the presence of a start codon in reading frame 1 causes instrument 1 to restart at approx. 7 seconds, whilst the other frames remain silent. Notes from the isolated instrument are staggered, with two rest beats between each note due to the two silent frames.
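The stop, start and restart behaviour described in these examples can be sketched as a small per-frame state machine in Python. This is an illustration under stated assumptions, not the actual implementation: every frame starts out playing, any stop codon silences its frame and resets a silence counter, an in-frame ATG restarts the frame, and the hypothetical `restart_after` parameter resumes play once that many silent codons have passed in the frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def sonify_frames(seq, restart_after=None):
    """Return a list of (frame, codon_or_None); None marks a rest beat."""
    playing = {1: True, 2: True, 3: True}
    silent = {1: 0, 2: 0, 3: 0}  # silent codons since the last stop, per frame
    events = []
    for i in range(len(seq) - 2):
        frame, codon = (i % 3) + 1, seq[i:i + 3]
        if codon in STOP_CODONS:
            # Assumption: the stop codon itself is a rest and resets the counter.
            playing[frame] = False
            silent[frame] = 0
            events.append((frame, None))
            continue
        if not playing[frame]:
            timed_out = restart_after is not None and silent[frame] >= restart_after
            if codon == START_CODON or timed_out:
                playing[frame] = True
        if playing[frame]:
            events.append((frame, codon))
        else:
            silent[frame] += 1
            events.append((frame, None))
    return events


# Shortened 'STOP in All, START in 1st RF' example
seq = "G" * 15 + "TAAGTAAGTAAG" + "G" * 9 + "ATG" + "G" * 12
events = sonify_frames(seq)
print(events[36:42])  # one note followed by two rests, repeated
```

Calling `sonify_frames(seq, restart_after=10)` instead models the 'Restart after 10 codons' option, again only under the assumptions listed above.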


AT only DNA

AT rich DNA (absence of GC bases)

AAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATTAAATTATT

AT rich DNA
(default settings)

AT rich DNA
(Ignore Start/Stop)

The audio plays with the characteristic triplet pattern (three instruments); however, due to the high chance of a TAA stop codon in each reading frame, the audio becomes silent after only 4 seconds and remains so for the remaining 39 seconds. The absence of G precludes the occurrence of an ATG start codon to restart the audio. Selecting the 'Restart after 10 codons' option causes no change compared to the default settings because TAA stop codons recur within the passage of 10 codons. In contrast, the 'Ignore Start/Stop' option results in sonification of the entire sequence. It may not be obvious, but the audio phrasing repeats every 8 triplets due to the repetitive nature of the artificial DNA sequence.
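A short Python sketch shows where the first stop codon falls in each reading frame of the (AAATTATT)n repeat, and confirms that no ATG can occur. The helper name `first_stop_per_frame` and the frame numbering (codon at index i belongs to frame (i mod 3) + 1) are assumptions of this illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_per_frame(seq):
    """Map each frame (1-3) to the 0-based start of its first stop codon, or None."""
    first = {1: None, 2: None, 3: None}
    for i in range(len(seq) - 2):
        frame = (i % 3) + 1
        if first[frame] is None and seq[i:i + 3] in STOP_CODONS:
            first[frame] = i
    return first


at_only = "AAATTATT" * 8  # a short stretch of the AT-only repeat
print(first_stop_per_frame(at_only))
print("ATG present:", "ATG" in at_only)
```

In this repeat, TAA starts at every eighth base, so each individual frame meets it again every 24 bases (8 codons), which is fewer than the 10-codon restart interval.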


GC only DNA

GC rich DNA (absence of AT bases)

GGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCCGGGCCGCC

GC rich DNA
(default settings)

The audio plays with the characteristic triplet pattern (three instruments) for the duration of the sequence (approx. 43 seconds) with the default settings. There are no interruptions in the auditory display because all start and stop codons require A and T bases, which are absent. This is in stark contrast to the display of the AT rich DNA using the same default settings.
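The composition argument can be verified directly. The sketch below, using a hypothetical helper `gating_codons_possible`, lists which start or stop codons can be spelled from a restricted set of bases, assuming ATG as the only start codon and TAA, TAG and TGA as the stop codons.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gating_codons_possible(alphabet):
    """Return the start/stop codons that can be built from the given bases only."""
    allowed = set(alphabet.upper())
    return sorted(c for c in STOP_CODONS | {START_CODON} if set(c) <= allowed)


print("GC only:", gating_codons_possible("GC"))
print("AT only:", gating_codons_possible("AT"))
```

With only G and C available, none of these codons can be formed; with only A and T, TAA is the single possibility, which matches the behaviour of the two artificial sequences.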


Human Telomeric DNA

Human DNA sequence that consists of tandem arrays of the hexanucleotide sequence (TTAGGG)n, for example:

TTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGAGTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTGTTAGGGTTAGGGTTAGGG

Human Telomeric DNA
(Ignore Start Stop)

The audio from this sequence is highly repetitive and repeats approximately every 6 bases. This sequence was sonified using the "reading frame algorithm", which reads groups of three (3) bases at a time (as triplets); hence the notes repeat after TWO sets of triplets. Notice the change in the repetitive sound at approx. 13 sec, which reflects a subtle change in the DNA sequence at bp 79 (insertion of AG in place of T), and a further change at 41 sec (due to the insertion of TG). These changes are clearly apparent in the sonification but not so apparent from visual inspection of the sequence.
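For comparison with listening, the interruptions in a tandem repeat can also be located programmatically. The following Python sketch is one simple approach (a greedy left-to-right scan), not the method used by the sonification tool; the function name `find_interruptions` and the shortened example sequence are assumptions.

```python
def find_interruptions(seq, unit):
    """Return (1-based position, text) for each stretch that breaks the tandem repeat.

    Greedy scan: consume whole copies of `unit`; wherever the next bases do
    not match, skip ahead one base at a time until the repeat resumes.
    """
    interruptions = []
    pos = 0
    while pos < len(seq):
        if seq.startswith(unit, pos):
            pos += len(unit)
            continue
        start = pos
        while pos < len(seq) and not seq.startswith(unit, pos):
            pos += 1
        interruptions.append((start + 1, seq[start:pos]))
    return interruptions


unit = "TTAGGG"
# Shortened stand-in for the telomeric example: 13 perfect repeats, a variant
# unit (AGTAGGG), two repeats, an inserted TG, then two more repeats.
example = unit * 13 + "AGTAGGG" + unit * 2 + "TG" + unit * 2
print(find_interruptions(example, unit))
```

The first interruption is reported at bp 79, in line with the position quoted above for the full sequence.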

Sequence data published by:
Moyzis, R. K., J. M. Buckingham, et al. (1988). "A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes." PNAS. 85, 6622-6626.


Alphoid Repetitive DNA

Human DNA sequence that consists of tandem arrays of the pentanucleotide sequence (CCATT)n, for example:

CCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTATAGTCCATTCCATTCCATTCCATTCCATTCAATTCCATTCCATTACAATTCGTTCCATTCCATTCTATTCCGTACCATTCGATTCCATTCCATACCATCCATTCCATTCCATTCCATTCATTCCATTCCGTTCCATTCCGTTCATTCATTCATTCCATTCTATTCGGATTAATTCCAATCTATTCCATTCATTGCATTCTATTCCATTCCATTGCAATCGAGTTGAATACATTGCATTCTATTCATTCATTCATTCCATTCCATTCCGGAAGATTA

Human alphoid repetitive sequence
(Ignore Start Stop)

Human alphoid repetitive sequence
(Restart on ATG)

The audio from this sequence is clearly repetitive, but notice that the sound is more complex than that of the previous telomeric DNA sequence. This is because the sequence repeats approximately every 5 bases whereas the "reading frame algorithm" reads groups of three bases at a time; hence the melody repeats every 15 bases, that is, FIVE sets of triplets occur before the notes repeat. The first 17 sec (100 bp) is a synthetic alphoid sequence that is purely repetitive with no sequence variation. This is followed by an actual alphoid repetitive sequence that contains sequence variations, which are clearly audible.
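The relationship between repeat length and melody length is a least-common-multiple calculation, which the following Python sketch works through. Both function names are hypothetical; `observed_period` checks the arithmetic empirically on a note stream in which the codon at index i is assigned to frame (i mod 3) + 1, an assumption of this illustration.

```python
from math import gcd


def melody_period(unit_len, codon_len=3):
    """Return (bases, triplets) after which the (frame, codon) pattern repeats."""
    bases = unit_len * codon_len // gcd(unit_len, codon_len)  # least common multiple
    return bases, bases // codon_len


def observed_period(unit, copies=12):
    """Find the smallest shift (in bases) that maps the note stream onto itself."""
    seq = unit * copies
    notes = [((i % 3) + 1, seq[i:i + 3]) for i in range(len(seq) - 2)]
    for shift in range(1, len(notes)):
        if notes[shift:] == notes[:-shift]:
            return shift
    return None


for unit in ("TTAGGG", "CCATT", "AAATTATT"):
    print(unit, melody_period(len(unit)), observed_period(unit))
```

A 6-base repeat therefore cycles after 2 triplets, a 5-base repeat after 5 triplets, and the 8-base AT-only repeat after 8 triplets.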


Ras coding sequence (cDNA)

This sequence represents the first exon of the human Ras DNA sequence, an important gene in cell signalling and human disease:

ATGACGGAATATAAGCTGGTGGTGGTGGGCGCCGGCGGTGTGGGCAAGAGTGCGCTGACCATCCAGCTGATCCAGAACCATTTTGTGGACGAATACGACCCCACTATAGAGGATTCCTACCGGAAGCAGGTGGTCATTGATGGGGAGACGTGCCTGTTGGACATCCTGGATACCGCCGGCCAGGAGGAGTACAGCGCCATGCGGGACCAGTACATGCGCACCGGGGAGGGCTTCCTGTGTGTGTTTGCCATCAACAACACCAAGTCTTTTGAGGACATCCACCAGTACAGGGAGCAGATCAAACGGGTGAAGGACTCGGATGACGTGCCCATGGTGCTGGTGGGGAACAAGTGTGACCTGGCTGCACGCACTGTGGAATCTCGGCAGGCTCAGGACCTCGCCCGAAGCTACGGCATCCCCTACATCGAGACCTCGGCCAAGACCCGGCAGGGAGTGGAGGATGCCTTCTACACGTTGGTGCGTGAGATCCGGCAGCACAAGCTGCGGAAGCTGAACCCTCCTGATGAGAGTGGCCCCGGCTGCATGAGCTGCAAGTGTGTGCTCTCCTGA

Human H-Ras cDNA
(Silent until ATG)

Human H-Ras cDNA
(Highlight STOP START)

This sequence was sonified using the "reading frame algorithm", in which a different instrument is used to sonify each reading frame; in this example a bright piano, electric bass (pick) and timpani were used to sound the three frames. In addition, the "use Start/Stop codons" option was selected, so that whenever a stop codon is detected in any reading frame, the corresponding instrument is silenced, as are the following 10 codons (notes). Notice how the bright piano plays throughout (i.e. the Ras open reading frame), whereas both the bass and timpani cut out repeatedly as stop codons occur in their respective reading frames. This leads to sections of audio with solo piano (e.g. at 3 sec and 1:30 min), piano and timpani (e.g. 9 to 17 sec) or piano and bass duets (predominantly from 45 sec to 1:00 min), plus the full trio ensemble (e.g. from 30 sec and 1:05 min).
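Why one instrument plays throughout while the others cut out can be seen by listing the stop codons in each frame. The Python sketch below uses a hypothetical helper `stop_positions` and a toy open reading frame: the first 27 bases of the sequence shown above with a TGA stop codon appended for illustration (it is not the full coding sequence).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(seq):
    """Return {frame: [0-based start of each stop codon]} for the three frames."""
    positions = {1: [], 2: [], 3: []}
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in STOP_CODONS:
            positions[(i % 3) + 1].append(i)
    return positions


# Toy ORF: start of the H-Ras sequence plus an appended stop codon (assumption)
toy_orf = "ATGACGGAATATAAGCTGGTGGTGGTG" + "TGA"
stops = stop_positions(toy_orf)
print(stops)

# Frame 1 is open if its only stop codon is the final, terminating one.
print("frame 1 open:", stops[1] == [len(toy_orf) - 3])
```

In the toy example frame 1 is interrupted only by its terminating codon, whereas frames 2 and 3 already contain stop codons within the first few bases, which is the pattern that silences the second and third instruments.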

Sequence data taken from:
Homo sapiens chromosome 11 genomic contig, GRCh37.p5 Primary Assembly, NCBI Reference Sequence: NT_009237.18 (beginning at position 189)


15S rRNA sequence

Yeast mitochondrial DNA sequence that codes for the 15S ribosomal RNA

GTAAAAAATTTATAAGAATATGATGTTGGTTCAGATTAAGCGCTAAATAAGGACATGACACATGCGAATCATACGTTTATTATTGATAAGATAATAAATATGTGGTGTAAACGTGAGTAATTTTATTAGGAATTAATGAACTATAGAATAAGCTAAATACTTAATATATTATTATATAAAAATAATTTATATAATAAAAAGGATATATATATAATATATATTTATCTATAGTCAAGCCAATAATGGTTTAGGTAGTAGGTTTATTAAGAGTTAAACCTAGCCAACGATCCATAATCGATAATGAAAGTTAGAACGATCACGTTGACTCTGAAATATAGTCAATATCTATAAGATACAGCAGTGAGGAATATTGGACAATGATCGAAAGATTGATCCAGTTACTTATTAGGATGATATATAAAAATATTTTATTTTATTTATAAATATTAAATATTTATAATAATAATAATAATAATATATATATATAAATTGATTAAAAATAAAATCCATAAATAATTAAAATAATGATATTAATTACCATATATATTTTTATATGGATATATATATTAATAATAATATTAATTTTATTATTATTAATAATATATTTTAATAGTCCTGACTAATATTTGTGCCAGCAGTCGCGGTAACACAAAGAGGGCGAGCGTTAATCATAATGGTTTAAAGGATCCGTAGAATGAATTATATATTATAATTTAGAGTTAATAAAATATAATTAAAGAATTATAATAGTAAAGATGAAATAATAATAATAATTATAAGACTAATATATGTGAAAATATTAATTAAATATTAACTGACATTGAGGGATTAAAACTAGAGTAGCGAAACGGATTCGATACCCGTGTAGTTCTAGTAGTAAACTATGAATACAATTATTTATAATATATATTATATATAAATAATAAATGAAAATGAAAGTATTCCACCTGAAGAGTACGTTAGCAATAATGAAACTCAAAACAATAGACGGTTACAGACTTAAGCAGTGGAGCATGTTATTTAATTCGATAATCCACGACTAACCTTACCATATTTTGAATATTATAATAATTATTATAATTATTATATTACAGGCGTTACATTGTTGTCTTTAGTTCGTGCTGCAAAGTTTTAGATTAAGTTCATAAACGAACAAAACTCCATATATATAATTTTAATTATATATAATTTTATATTATTTATTAATATAAAGAAAGGAATTAAGACAAATCATAATGATCCTTATAATATGGGTAATAGACGTGCTATAATAAAATGATAATAAAATTATATAAAATATATTTAATTATATTTAATTAATAATATAAAACATTTTAATTTTTAATATATTTTTTTATTATATATTAATATGAATTATAATCTGAAATTCGATTATATGAAAAAAGAATTGCTAGTAATACGTAAATTAGTATGTTACGGTGAATATTCTAACTGTTTCGCACTAATCACTCATCACGCGTTGAAACATATTATTATCTTATTATTTATATAATATTTTTTAATAAATATTAATAATTATTAATTTATATTTATTTATATCAGAAATAATATGAATTAATGCGAAGTTGAAATACAGTTACCGTAGGGGAACCTGCGGTGGGCTTATAAATATCTTAAATATTCTTACA

15S_rRNA non-coding RNA
(Silent until ATG)

15S_rRNA non-coding RNA
(Highlight STOP START)

This sequence was processed in the same way as the Ras sequence above and likewise has no discernible melodic patterns such as those used to detect tandem repeats in repetitive DNA; however, the audio is still highly recognisable. This audio is characterised by the complete absence of any "triplet" note passages and the presence of repeated sections of silence. All passages are either single notes or pairs of notes within each triplet (deriving from each of the three reading frames). This is because of repeated stop codons occurring in all reading frames (including TGA). The rRNA is not translated and therefore stop codons have no effect (or would act to inhibit translation if it were to occur). Clearly the "reading frame algorithm" combined with the "Start/Stop codons" option is effective in sonifying the rRNA into a distinctive audio stream.

The second example is the same sequence with the 'Highlight STOP START' option, which sonifies the occurrence of these motifs with percussion sounds even when the audio from the respective instrument/reading frame is silent.
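One way to picture the highlighting is as a separate stream of start/stop events that is generated regardless of whether a frame is currently sounding. The Python sketch below is an assumption about how such an option could work, not a description of the actual software; `highlight_events` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def highlight_events(seq):
    """Yield (position, frame, kind) for every start or stop codon in any frame.

    The events do not depend on whether the frame is playing or silent, so
    they could drive an independent percussion track.
    """
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        frame = (i % 3) + 1
        if codon == START_CODON:
            yield i, frame, "start"
        elif codon in STOP_CODONS:
            yield i, frame, "stop"


toy = "GGTAAGATGATAGG"  # made-up sequence with motifs in all three frames
for event in highlight_events(toy):
    print(event)
```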

Sequence data taken from:
15S_rRNA 15S_RRNA SGDID:S000007287, Chr Mito from 6546-8194