Tutorial

library(ace2fastq)

Introduction

The package provides a function that converts “.ace” files (ABI Sanger capillary sequence assembly files) to standard “.fastq” files. The .ace format is currently used in genomics to store contigs. To the best of our knowledge, no other R function converts this format into the more widely used fastq format. Development was motivated by the analysis of 16S metagenomic data, where .ace files containing contig data had to be converted for further analysis. Each file contains only one sequence and its corresponding quality values.

At a minimum, the function expects the full path to the .ace file. By default, it creates a corresponding file with the extension .fastq in place of .ace. Also by default, the id line in the fastq file starts, after the obligatory @, with the name of the original .ace file without its extension, followed by the original internal id from the .ace file.
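The naming rules described above can be illustrated with base R alone. This is a minimal sketch of the conventions, not the package's actual implementation: the input path is hypothetical, and the internal id is copied from the sample output shown in this tutorial.

```r
# Hypothetical input path, used only for illustration.
ace_path <- file.path("data", "1.seq.ace")

# Default output path: same location, with ".ace" swapped for ".fastq".
fastq_path <- sub("\\.ace$", ".fastq", ace_path)

# Default id line: "@", then the file name without its extension,
# then the internal id stored in the .ace file.
internal_id <- "CO Contig1 1489 2 12 U"  # copied from the sample output
file_stem <- tools::file_path_sans_ext(basename(ace_path))
id_line <- paste0("@", file_stem, " ", internal_id)

print(fastq_path)
print(id_line)
```

Only the last extension is removed, so a file named 1.seq.ace contributes the stem 1.seq to the id line.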

A default example follows:

library(ace2fastq)

filename <- system.file("sampledat/1.seq.ace", package = "ace2fastq")

out_file <- ace_to_fastq(filename)

lines <- readLines(out_file)
#> [1] "@1.seq CO Contig1 1489 2 12 U"
#> [1] "gctccctgatgttagcggcggACGGGTGAGTAACACGTGGG"
#> [1] "+"
#> [1] "!!!!!!!!!!!!!!!!!!!!!DUNUUUUUUUNUDIIIUUUU"

An example with the alternative id pattern, which omits the file name from the id line, follows:

library(ace2fastq)

filename <- system.file("sampledat/1.seq.ace", package = "ace2fastq")

out_file <- ace_to_fastq(filename, name2id = FALSE)

lines <- readLines(out_file)
#> [1] "@CO Contig1 1489 2 12 U"

The target directory path can also be changed.
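A sketch of writing the output to a different directory might look like the following. The argument name target_dir is an assumption not confirmed by this tutorial, so check ?ace_to_fastq before relying on it. The temporary directory is likewise chosen only for illustration.

```r
# Illustrative target directory inside the session's temporary folder.
target <- file.path(tempdir(), "fastq_out")
dir.create(target, showWarnings = FALSE, recursive = TRUE)

# Run the conversion only if the package is installed.
if (requireNamespace("ace2fastq", quietly = TRUE)) {
  filename <- system.file("sampledat/1.seq.ace", package = "ace2fastq")

  # ASSUMPTION: `target_dir` is the name of the argument that sets the
  # output directory; verify it against the package documentation.
  out_file <- ace2fastq::ace_to_fastq(filename, target_dir = target)
}
```

Writing to a separate directory keeps the generated fastq files apart from the original .ace files, which helps when the source directory is read-only.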