Refer to the accompanying sample_ace_file.txt (below)
AS <number of contigs> <total number of reads in ace file>
CO <contig name> <# of bases> <# of reads in contig> <# of base segments in contig> <U or C>
The U or C indicates whether the contig has been complemented from the
way phrap originally created it. Thus this is always U for an ace
file created by phrap.
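As an illustration only, the AS and CO lines described above could be
read with something like the following Python sketch. The function and
class names are made up for this example, and the sample values in the
usage note are hypothetical, not taken from a real ace file.

```python
from typing import NamedTuple


class AceHeader(NamedTuple):
    num_contigs: int
    num_reads: int


class ContigHeader(NamedTuple):
    name: str
    num_bases: int
    num_reads: int
    num_base_segments: int
    complemented: bool  # True for "C", False for "U"


def parse_as_line(line: str) -> AceHeader:
    """Parse 'AS <number of contigs> <total number of reads>'."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "AS":
        raise ValueError(f"not an AS line: {line!r}")
    return AceHeader(int(fields[1]), int(fields[2]))


def parse_co_line(line: str) -> ContigHeader:
    """Parse 'CO <name> <# bases> <# reads> <# base segments> <U or C>'."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "CO" or fields[5] not in ("U", "C"):
        raise ValueError(f"not a CO line: {line!r}")
    return ContigHeader(
        fields[1], int(fields[2]), int(fields[3]), int(fields[4]),
        fields[5] == "C",
    )
```

For instance, parse_co_line("CO Contig1 1475 8 156 U") would give a
ContigHeader whose complemented field is False, as expected for a
contig written by phrap.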
BQ
This starts the list of base qualities for the unpadded consensus
bases. The contig is the one from the previous CO, hence no name is
needed here.
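A rough sketch of reading the quality list follows. It assumes that
the qualities are whitespace-separated integers on the lines after BQ
and that the list ends at the first blank line; the description above
does not spell out either point, so treat both as assumptions. The
function name is invented for this example.

```python
from typing import Iterable, List


def read_base_qualities(lines: Iterable[str]) -> List[int]:
    """Return the integers that follow the first BQ line.

    Assumption: the list stops at the first blank line after BQ.
    """
    it = iter(lines)
    for line in it:
        if line.strip() == "BQ":
            break
    else:
        raise ValueError("no BQ line found")

    qualities: List[int] = []
    for line in it:
        if not line.strip():
            break  # assumed end of the quality list
        qualities.extend(int(token) for token in line.split())
    return qualities
```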
AF <read name> <C or U> <padded start consensus position>
This line replaces the 'AssembledFrom*' line in the previous ace file
format. C or U means complemented or uncomplemented. The <read name>
is the true read name: unlike the previous ace file format, it has no
.comp suffix.
BS <padded start consensus position> <padded end consensus position> <read name>
This replaces the 'BaseSegment*' line from the previous ace file format.
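The AF and BS lines are simple enough to split on whitespace. The
sketch below shows one way to do that; the names AssembledFrom,
BaseSegment, parse_af_line and parse_bs_line are invented for this
example, and the positions in the usage note are hypothetical.

```python
from typing import NamedTuple


class AssembledFrom(NamedTuple):
    read_name: str
    complemented: bool  # True for "C", False for "U"
    padded_start: int   # padded start consensus position


class BaseSegment(NamedTuple):
    padded_start: int
    padded_end: int
    read_name: str


def parse_af_line(line: str) -> AssembledFrom:
    """Parse 'AF <read name> <C or U> <padded start consensus position>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "AF" or fields[2] not in ("U", "C"):
        raise ValueError(f"not an AF line: {line!r}")
    return AssembledFrom(fields[1], fields[2] == "C", int(fields[3]))


def parse_bs_line(line: str) -> BaseSegment:
    """Parse 'BS <padded start> <padded end> <read name>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "BS":
        raise ValueError(f"not a BS line: {line!r}")
    return BaseSegment(int(fields[1]), int(fields[2]), fields[3])
```

For instance, parse_af_line("AF djs14_680.s1 C 1") would report a
complemented read starting at padded consensus position 1.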
RD <read name> <# of padded bases> <# of whole read info items> <# of read tags>
QA <qual clipping start> <qual clipping end> <align clipping start> <align clipping end>
This is new information not found in the previous ace file. If the
entire read is low quality, then <qual clipping start> and <qual
clipping end> will both be -1. These positions are offsets from the
left end of the read (left, as shown in Consed). Hence for bottom
strand reads, the offsets are from the end of the read. The offsets
are 1-based. That is, if the left-most base is in the aligned,
high-quality region, <qual clipping start> = 1 and <align clipping
start> = 1 (not zero).
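Because the offsets are 1-based, code written in a 0-based language
has to shift the start position. The following Python sketch shows one
way to turn a QA line into slices over the padded read as displayed in
Consed. It assumes the end positions are inclusive, and it treats
-1 -1 as "no region" for the align clipping as well as the quality
clipping; the text above only states the latter. All names are made up
for this example.

```python
from typing import NamedTuple, Optional


class Clipping(NamedTuple):
    qual: Optional[slice]   # None if the whole read is low quality
    align: Optional[slice]  # None for -1 -1 (an assumption, see lead-in)


def _to_slice(start: int, end: int) -> Optional[slice]:
    """Convert a 1-based inclusive range to a 0-based Python slice."""
    if start == -1 and end == -1:
        return None
    return slice(start - 1, end)


def parse_qa_line(line: str) -> Clipping:
    """Parse 'QA <qual start> <qual end> <align start> <align end>'."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "QA":
        raise ValueError(f"not a QA line: {line!r}")
    qual_start, qual_end, align_start, align_end = map(int, fields[1:])
    return Clipping(
        _to_slice(qual_start, qual_end),
        _to_slice(align_start, align_end),
    )
```

With a hypothetical ten-base read "ACGTACGTAC" and the line
"QA 1 5 1 8", indexing the read with the qual slice selects its first
five bases.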
DS CHROMAT_FILE: <name of chromat file> PHD_FILE: <name of phd file> TIME: <date/time of the phd file> CHEM: <prim, term, unknown, etc> DYE: <usually ET, big, etc> TEMPLATE: <template name> DIRECTION: <fwd or rev>
There can be additional information on this line.
This replaces the DESCRIPTION line from the old ace file.
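Since values on the DS line may themselves contain spaces (a date, for
example), splitting on whitespace alone is not enough. One possible
approach, sketched below, is to treat every all-capitals word followed
by a colon as a key and everything up to the next key as its value.
That rule is an assumption made for this example, not something the
format guarantees, and the function name is invented.

```python
import re
from typing import Dict

# A key is a run of capitals/underscores, delimited by whitespace on the
# left and followed by a colon and then whitespace or end of line.
_DS_KEY = re.compile(r"(?<!\S)([A-Z_]+):(?!\S)")


def parse_ds_line(line: str) -> Dict[str, str]:
    """Split a DS line into a {KEY: value} dictionary."""
    line = line.strip()
    if not (line == "DS" or line.startswith("DS ")):
        raise ValueError(f"not a DS line: {line!r}")
    body = line[2:]
    matches = list(_DS_KEY.finditer(body))
    info: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # The value runs up to the start of the next key, or to end of line.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        info[match.group(1)] = body[match.end():end].strip()
    return info
```

Unknown keys are kept as they are, which leaves room for the
additional information that may appear on the line.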
The following is for transient read tags (those generated by
crossmatch and phrap). They are not fully implemented, and the format
may eventually change. The read is implied by the location of the
whole read info item within the ace file. They are found after the DS
line for a read.
RT{
<read name> <tag type> <what program created tag> <padded read pos start> <padded read pos end> <date when tag was created in form YYMMDD:HHMISS>
}
for example:
RT{
djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356
}
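A read tag like the one above could be parsed along these lines. This
is only a sketch: the ReadTag class and parse_rt_block function are
invented here, and the two-digit year is expanded by Python's own
strptime rules.

```python
from datetime import datetime
from typing import NamedTuple


class ReadTag(NamedTuple):
    read_name: str
    tag_type: str
    program: str
    padded_start: int
    padded_end: int
    created: datetime


def parse_rt_block(block: str) -> ReadTag:
    """Parse one 'RT{ ... }' block given as a single string."""
    lines = [ln.strip() for ln in block.strip().splitlines()]
    if len(lines) != 3 or lines[0] != "RT{" or lines[2] != "}":
        raise ValueError("not a single-line RT block")
    fields = lines[1].split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    name, tag_type, program, start, end, stamp = fields
    # YYMMDD:HHMISS, e.g. 990823:114356
    created = datetime.strptime(stamp, "%y%m%d:%H%M%S")
    return ReadTag(name, tag_type, program, int(start), int(end), created)
```

Applied to the example above, this gives padded read positions 904 to
933 and a creation time of 23 August 1999, 11:43:56.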
There are consensus tags now in the ace file. All consensus tags have
the following format:
CT{
<contig name> <tag type> <what program created tag> <padded cons pos start> <padded cons pos end> <date when tag was created in form YYMMDD:HHMISS> <NoTrans>
(possibly additional information)
}
The NoTrans is optional. It indicates that, when you reassemble, this
tag should not be transferred to the new assembly. This applies to
tags that depend on the assembly itself and so should be recreated
each time (e.g., repeat tags).
e.g.,
CT{
Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans
AluY
}
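The optional NoTrans field and the optional extra lines make the CT
block slightly more involved than the RT block. One way to handle
both is sketched below; the ConsensusTag class and parse_ct_block
function are invented for this example. The sketch accepts a date
with or without the :HHMISS part.

```python
from datetime import datetime
from typing import List, NamedTuple


class ConsensusTag(NamedTuple):
    contig: str
    tag_type: str
    program: str
    padded_start: int
    padded_end: int
    created: datetime
    no_trans: bool    # True if the NoTrans field is present
    extra: List[str]  # any additional lines inside the block


def parse_ct_block(block: str) -> ConsensusTag:
    """Parse one 'CT{ ... }' block given as a single string."""
    lines = block.strip().splitlines()
    if len(lines) < 3 or lines[0].strip() != "CT{" or lines[-1].strip() != "}":
        raise ValueError("not a CT block")
    fields = lines[1].split()
    if len(fields) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(fields)}")
    no_trans = len(fields) == 7
    if no_trans and fields[6] != "NoTrans":
        raise ValueError(f"unexpected trailing field: {fields[6]!r}")
    stamp = fields[5]
    fmt = "%y%m%d:%H%M%S" if ":" in stamp else "%y%m%d"
    return ConsensusTag(
        fields[0], fields[1], fields[2], int(fields[3]), int(fields[4]),
        datetime.strptime(stamp, fmt), no_trans, lines[2:-1],
    )
```

For the example above, no_trans is True and extra holds the single
line "AluY".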
In the case of most consensus tag types, there is only 1 line for the
consensus tag. In the case of comment tags and oligo tags, there are
additional lines of information. The comment tag includes the comment
on the additional lines. The oligo tag has the following information:
<oligo name> <oligo bases from 5' to 3'> <melting temp> <C or U
indicating whether the oligo is top strand or bottom strand relative
to the orientation of the contig as created by phrap>
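The oligo information has four whitespace-separated fields, so a
minimal parser might look like the sketch below. The names OligoInfo
and parse_oligo_info are invented, and the sketch assumes that the
oligo name contains no spaces and that the melting temperature is a
plain number.

```python
from typing import NamedTuple


class OligoInfo(NamedTuple):
    name: str
    bases: str           # oligo bases, 5' to 3'
    melting_temp: float
    complemented: bool   # True for "C" (bottom strand), False for "U"


def parse_oligo_info(line: str) -> OligoInfo:
    """Parse '<oligo name> <bases> <melting temp> <C or U>'."""
    fields = line.split()
    if len(fields) != 4 or fields[3] not in ("U", "C"):
        raise ValueError(f"not an oligo info line: {line!r}")
    name, bases, temp, strand = fields
    return OligoInfo(name, bases, float(temp), strand == "C")
```

A made-up line such as "oligo1 ACGTTGCA 55 U" would parse to a top
strand oligo with a melting temperature of 55.0.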
WA{
<tag type> <what program created tag> <date tag was created in form YYMMDD:HHMISS>
1 or more lines of data
}
This block is a 'whole assembly' tag. It is used for information
referring to the assembly as a whole. Currently, phrap puts its
version and phrap command line options in a WA tag.
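A WA tag could be read with a sketch like the following. The class
and function names are invented, and the tag type and data lines used
in the usage note are made-up sample values, not a statement of what
phrap actually writes.

```python
from datetime import datetime
from typing import List, NamedTuple


class WholeAssemblyTag(NamedTuple):
    tag_type: str
    program: str
    created: datetime
    data: List[str]  # the 1 or more lines of data, unparsed


def parse_wa_block(block: str) -> WholeAssemblyTag:
    """Parse one 'WA{ ... }' block given as a single string."""
    lines = block.strip().splitlines()
    # Opening line, header line, at least one data line, closing brace.
    if len(lines) < 4 or lines[0].strip() != "WA{" or lines[-1].strip() != "}":
        raise ValueError("not a WA block with at least one data line")
    fields = lines[1].split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 header fields, got {len(fields)}")
    tag_type, program, stamp = fields
    created = datetime.strptime(stamp, "%y%m%d:%H%M%S")
    return WholeAssemblyTag(tag_type, program, created, lines[2:-1])
```

For a hypothetical block whose header line is
"phrap_params phrap 990621:161947", the data lines come back
unchanged in the data field, so the caller decides how to interpret
them.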
You can append CT, WA, and RT tags to the end of the ace file in any
order you like.
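Since the three tag kinds may appear in any order, a reader can scan
for the opening line of each block and collect lines until the
closing brace. The sketch below does this; the function name is
invented, and it assumes that a line consisting only of "}" always
closes the current block (a comment line containing just "}" would
confuse it).

```python
from typing import Iterable, Iterator, List, Tuple

_OPENERS = {"CT{": "CT", "WA{": "WA", "RT{": "RT"}


def iter_tag_blocks(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (kind, body_lines) for each CT, WA or RT block, in file order."""
    kind = None
    body: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if kind is None:
            # Outside a block: look for an opening line, ignore the rest.
            opener = _OPENERS.get(line.strip())
            if opener is not None:
                kind = opener
                body = []
        elif line.strip() == "}":
            yield kind, body
            kind = None
        else:
            body.append(line)
```

Each yielded body is the list of lines between the braces, which can
then be handed to whatever code understands that kind of tag.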