Sujai's mind dump

    contigimage - create contig images based on .ace file

    [ Excerpt from CONSED Documentation ]
    • Intro: New .ace file format
    • ACE File Format
    • Sample Ace File

    NEW ACE FILE FORMAT

    There is a new ace file format (since early 1998).  If you still
    haven't changed to the new ace file format, you must do so now since
    it contains information that is not contained in the old ace file
    format.  This additional information (e.g., the alignment and quality
    clipping values) is essential for some of the Consed functions (e.g.,
    navigate by single stranded, navigate by single subclone, Autofinish)
    to work correctly.
    
    Another reason to switch to the new ace format is that you will get
    faster Consed startup performance.  The new ace file format is also
    much smaller (about 60% as big as the old).
    
    The new phrap (Aug 1998 and later) writes the new ace format (using
    the -new_ace switch).  Since Consed now uses the additional
    information found only in the new ace format, if you are editing an
    assembly, you should first re-phrap to take advantage of this
    additional information.
    
    Consed can read either old or new ace format.
    Consed can also write either new or old ace format.  It writes the new
    ace format by default--see 'Options'/'General Preferences'.  Also see
    the Consed parameter:
    
    consed.writeThisAceFormat: 2
    
    (where 2 means 'new' and 1 means 'old')
    
    If you have scripts that read the ace file, you will need to modify
    those scripts for the new ace format.  Here is the format:

    Ace File Format

    Refer to the accompanying sample_ace_file.txt (below)
    
    AS <number of contigs> <total number of reads in ace file>
    
    CO <contig name> <# of bases> <# of reads in contig> <# of base segments in contig> <U or C>
    
    The U or C indicates whether the contig has been complemented from the
    way phrap originally created it.  Thus this is always U for an ace
    file created by phrap.
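
    As a rough illustration of the AS and CO lines above, here is a
    minimal Python sketch that splits them into typed fields.  It
    assumes the fields are separated by whitespace; the names
    parse_as_line and parse_co_line and the sample values are made up
    for this example and are not part of Consed or phrap.

```python
from typing import NamedTuple


class AssemblyHeader(NamedTuple):
    num_contigs: int
    num_reads: int


class ContigHeader(NamedTuple):
    name: str
    num_bases: int
    num_reads: int
    num_base_segments: int
    complemented: bool  # True for "C", False for "U"


def parse_as_line(line: str) -> AssemblyHeader:
    """Parse 'AS <number of contigs> <total number of reads>'."""
    fields = line.split()
    if len(fields) != 3 or fields[0] != "AS":
        raise ValueError(f"not an AS line: {line!r}")
    return AssemblyHeader(int(fields[1]), int(fields[2]))


def parse_co_line(line: str) -> ContigHeader:
    """Parse 'CO <name> <# bases> <# reads> <# base segments> <U or C>'."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "CO" or fields[5] not in ("U", "C"):
        raise ValueError(f"not a CO line: {line!r}")
    return ContigHeader(
        name=fields[1],
        num_bases=int(fields[2]),
        num_reads=int(fields[3]),
        num_base_segments=int(fields[4]),
        complemented=(fields[5] == "C"),
    )


# Hypothetical sample lines, invented for illustration only.
header = parse_as_line("AS 2 14")
contig = parse_co_line("CO Contig1 1475 8 156 U")
```

    For an ace file written directly by phrap, contig.complemented
    would be False for every contig, since phrap always writes U.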
    
    BQ
    
    This starts the list of base qualities for the unpadded consensus
    bases.  The contig is the one from the previous CO, hence no name is
    needed here.
    
    AF <read name> <C or U> <padded start consensus position>
    
    This line replaces the 'AssembledFrom*' line in the previous ace file
    format.  C or U means complemented or uncomplemented.  The <read name>
    is the true read name (no .comp on it as with the previous ace file
    format.)
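
    A script that used to strip '.comp' from read names can instead
    take the orientation from the second field.  The sketch below
    assumes whitespace-separated fields and parses the start position
    as a signed integer, on the assumption that a read may start to
    the left of the consensus.  parse_af_line is a hypothetical helper
    name and the position is a made-up value.

```python
from typing import NamedTuple


class AssembledFrom(NamedTuple):
    read_name: str       # true read name, with no ".comp" suffix
    complemented: bool   # True for "C", False for "U"
    padded_start: int    # padded start consensus position


def parse_af_line(line: str) -> AssembledFrom:
    """Parse 'AF <read name> <C or U> <padded start consensus position>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "AF" or fields[2] not in ("U", "C"):
        raise ValueError(f"not an AF line: {line!r}")
    return AssembledFrom(fields[1], fields[2] == "C", int(fields[3]))


# Read name taken from the RT example in this document; position invented.
af = parse_af_line("AF djs14_680.s1 C -14")
```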
    
    BS <padded start consensus position> <padded end consensus position> <read name>
    
    This replaces the 'BaseSegment*' line from the previous ace file format.
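
    One possible use of the BS lines is to look up which read is named
    for a given padded consensus position.  The sketch below is only
    an illustration: it assumes that the end position is inclusive and
    that each segment names the read associated with that stretch of
    the consensus, neither of which is stated above.  The helper names
    and sample values are hypothetical.

```python
from typing import List, NamedTuple, Optional


class BaseSegment(NamedTuple):
    padded_start: int
    padded_end: int
    read_name: str


def parse_bs_line(line: str) -> BaseSegment:
    """Parse 'BS <padded start> <padded end> <read name>'."""
    fields = line.split()
    if len(fields) != 4 or fields[0] != "BS":
        raise ValueError(f"not a BS line: {line!r}")
    return BaseSegment(int(fields[1]), int(fields[2]), fields[3])


def read_at(segments: List[BaseSegment], position: int) -> Optional[str]:
    """Return the read named for `position`, or None if no segment covers it.

    Assumption: padded_end is inclusive.
    """
    for segment in segments:
        if segment.padded_start <= position <= segment.padded_end:
            return segment.read_name
    return None


# Hypothetical sample lines, invented for illustration only.
segments = [
    parse_bs_line("BS 1 137 readA"),
    parse_bs_line("BS 138 200 readB"),
]
```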
    
    RD <read name> <# of padded bases> <# of whole read info items> <# of read tags>
    
    QA <qual clipping start> <qual clipping end> <align clipping start> <align clipping end>
    
    This is new information not found in the previous ace file.  If the
    entire read is low quality, then <qual clipping start> and <qual
    clipping end> will both be -1.  These positions are offsets from the
    left end of the read (left, as shown in Consed).  Hence for bottom
    strand reads, the offsets are from the end of the read.  The offsets
    are 1-based.  That is, if the left-most base is in the aligned,
    high-quality region, <qual clipping start> = 1 and <align clipping
    start> = 1 (not zero).
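
    Because the offsets are 1-based, a script written in a 0-based
    language has to shift the start by one.  Here is a minimal Python
    sketch of that conversion.  It assumes the end offset is inclusive
    and treats a -1/-1 pair as 'no region' for the alignment clipping
    as well, although the text above only states this for quality
    clipping.  All names and sample values are hypothetical.

```python
from typing import NamedTuple, Optional


class Clipping(NamedTuple):
    qual: Optional[slice]   # None when the whole read is low quality
    align: Optional[slice]  # None for a -1/-1 pair (assumption, see lead-in)


def _to_slice(start: int, end: int) -> Optional[slice]:
    """Convert 1-based inclusive offsets to a 0-based Python slice."""
    if start == -1 and end == -1:
        return None
    if start < 1 or end < start:
        raise ValueError(f"bad clipping range: {start}..{end}")
    return slice(start - 1, end)


def parse_qa_line(line: str) -> Clipping:
    """Parse 'QA <qual start> <qual end> <align start> <align end>'."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "QA":
        raise ValueError(f"not a QA line: {line!r}")
    qual_start, qual_end, align_start, align_end = map(int, fields[1:])
    return Clipping(
        _to_slice(qual_start, qual_end),
        _to_slice(align_start, align_end),
    )


def high_quality_bases(padded_read: str, clipping: Clipping) -> str:
    """Return the quality-clipped part of a read as displayed left to right."""
    if clipping.qual is None:
        return ""
    return padded_read[clipping.qual]


# Made-up read and QA line: bases 3 through 8 are high quality.
clip = parse_qa_line("QA 3 8 1 10")
kept = high_quality_bases("ACGTACGTAC", clip)  # "GTACGT"
```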
    
    DS CHROMAT_FILE: <name of chromat file> PHD_FILE: <name of phd file> TIME: <date/time of the phd file> CHEM: <prim, term, unknown, etc> DYE: <usually ET, big, etc> TEMPLATE: <template name> DIRECTION: <fwd or rev>
    
    There can be additional information on this line.
    This replaces the DESCRIPTION line from the old ace file.
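
    Since the DS line is a series of 'KEY: value' pairs and may carry
    extra fields, a script is probably safer reading it into a
    dictionary than relying on field positions.  The sketch below
    assumes every key is an upper-case word (letters and underscores)
    followed by a colon and a space, and that values never contain
    such a word.  parse_ds_line is a hypothetical name, and the file
    names and time in the sample are invented.

```python
import re
from typing import Dict

# A key is an upper-case word at the start of the text or after whitespace,
# followed by ": ".  The lookbehind keeps "12:30" inside a TIME value from
# being mistaken for a key.
_DS_KEY = re.compile(r"(?<!\S)([A-Z][A-Z_]*): ")


def parse_ds_line(line: str) -> Dict[str, str]:
    """Parse 'DS KEY: value KEY: value ...' into a dict of strings."""
    if not line.startswith("DS "):
        raise ValueError(f"not a DS line: {line!r}")
    body = line[3:]
    matches = list(_DS_KEY.finditer(body))
    info: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its key to the start of the next key.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(body)
        info[match.group(1)] = body[match.end():end].strip()
    return info


# Invented sample line for illustration only.
ds = parse_ds_line(
    "DS CHROMAT_FILE: djs14_680.s1 PHD_FILE: djs14_680.s1.phd.1 "
    "TIME: Mon Aug 23 11:43:56 1999 CHEM: term DIRECTION: fwd"
)
```

    Unknown keys simply end up in the dictionary, so additional
    information on the line does not break the parse.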
    
    The following is for transient read tags (those generated by
    crossmatch and phrap).  They are not fully implemented, and the format
    may eventually change.  The read is implied by the location of the
    whole read info item within the ace file.  They are found after the DS
    line for a read.
    
    RT{
    <read name> <tag type> <what program created tag> <padded read pos start> <padded read pos end> <date when tag was created in form YYMMDD:HHMISS>
    }
    
    for example:
    
    RT{
    djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356
    }
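
    The RT example above can be read with a few lines of Python.  This
    is a sketch only: parse_rt_block is a hypothetical helper, it
    assumes the block has exactly one data line, and the two-digit
    year is expanded by Python's own rule for %y (69-99 become
    1969-1999, 00-68 become 2000-2068), which may or may not match
    what Consed intends.

```python
from datetime import datetime
from typing import List, NamedTuple


class ReadTag(NamedTuple):
    read_name: str
    tag_type: str
    program: str
    padded_start: int
    padded_end: int
    created: datetime


def parse_rt_block(lines: List[str]) -> ReadTag:
    """Parse the three lines of an 'RT{ ... }' block."""
    stripped = [line.strip() for line in lines]
    if len(stripped) != 3 or stripped[0] != "RT{" or stripped[2] != "}":
        raise ValueError("expected RT{, one data line, and }")
    fields = stripped[1].split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    # YYMMDD:HHMISS, e.g. 990823:114356
    created = datetime.strptime(fields[5], "%y%m%d:%H%M%S")
    return ReadTag(
        fields[0], fields[1], fields[2], int(fields[3]), int(fields[4]), created
    )


# The example block from the text above.
tag = parse_rt_block([
    "RT{",
    "djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356",
    "}",
])
```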
    
    There are consensus tags now in the ace file.  All consensus tags have
    the following format:
    
    CT{
    <contig name> <tag type> <what program created tag> <padded cons pos start> <padded cons pos end> <date when tag was created in form YYMMDD:HHMISS> <NoTrans>
    (possibly additional information)
    }
    
    The NoTrans is optional--it indicates that, when you reassemble, this
    tag should not be transferred to the new assembly.  This is true with
    tags that should be recreated each time because they have to do with
    the assembly (e.g., repeat tags).
    
    e.g.,
    
    CT{
    Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans
    AluY
    }
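
    To show how the optional NoTrans field and the additional lines
    might be handled, here is a small Python sketch that parses the CT
    example above.  parse_ct_block and the ConsensusTag type are
    hypothetical names; the timestamp is kept as a raw string rather
    than interpreted.

```python
from typing import List, NamedTuple


class ConsensusTag(NamedTuple):
    contig_name: str
    tag_type: str
    program: str
    padded_start: int
    padded_end: int
    created: str            # raw timestamp field, not interpreted here
    no_trans: bool          # True when the optional NoTrans field is present
    extra_lines: List[str]  # any additional information lines


def parse_ct_block(lines: List[str]) -> ConsensusTag:
    """Parse a 'CT{ ... }' block given as a list of lines."""
    stripped = [line.strip() for line in lines]
    if len(stripped) < 3 or stripped[0] != "CT{" or stripped[-1] != "}":
        raise ValueError("expected a block of the form CT{ ... }")
    header = stripped[1].split()
    if len(header) == 7 and header[6] == "NoTrans":
        no_trans = True
    elif len(header) == 6:
        no_trans = False
    else:
        raise ValueError(f"unexpected CT header: {stripped[1]!r}")
    return ConsensusTag(
        header[0], header[1], header[2],
        int(header[3]), int(header[4]), header[5],
        no_trans, stripped[2:-1],
    )


# The example block from the text above.
ct = parse_ct_block([
    "CT{",
    "Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans",
    "AluY",
    "}",
])
```

    A reassembly script could then carry over only the tags whose
    no_trans is False.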
     
    In the case of most consensus tag types, there is only 1 line for the
    consensus tag.  In the case of comment tags and oligo tags, there are
    additional lines of information.  The comment tag includes the comment
    on the additional lines.  The oligo tag has the following information:
    <oligo name> <oligo bases from 5' to 3'> <melting temp> <C or U
    indicating whether the oligo is top strand or bottom strand relative
    to the orientation of the contig as created by phrap>
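
    The oligo information could be split into fields as in the sketch
    below.  It assumes the four items sit on one whitespace-separated
    line and that the melting temperature is a plain number; neither
    is spelled out above.  The strand letter is stored as-is rather
    than interpreted.  parse_oligo_info and the sample oligo are
    invented for illustration.

```python
from typing import NamedTuple


class OligoInfo(NamedTuple):
    name: str
    bases: str           # oligo bases from 5' to 3'
    melting_temp: float
    strand: str          # "U" or "C", relative to the contig as phrap made it


def parse_oligo_info(line: str) -> OligoInfo:
    """Parse '<oligo name> <bases> <melting temp> <C or U>'."""
    fields = line.split()
    if len(fields) != 4 or fields[3] not in ("U", "C"):
        raise ValueError(f"not an oligo info line: {line!r}")
    return OligoInfo(fields[0], fields[1], float(fields[2]), fields[3])


# Hypothetical oligo, invented for illustration only.
oligo = parse_oligo_info("oligo1 gcatgcatgcatgcatgc 55 U")
```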
    
    WA{
    <tag type> <what program created tag> <date tag was created in form YYMMDD:HHMISS>
    1 or more lines of data
    }
    
    This is a 'whole assembly' tag.  It is used for information
    referring to the assembly as a whole.  Currently, phrap puts its
    version and phrap command line options in a WA tag.
    
    You can append CT, WA, and RT tags to the end of the ace file in any
    order you like.
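
    Because the three tag kinds may appear in any order, a reader can
    simply scan for any of the three opening markers.  The following
    Python sketch does that; iter_tag_blocks is a hypothetical helper,
    and it assumes that no line inside a block consists of a lone '}'.
    The WA data line in the sample is a placeholder, not real phrap
    output.

```python
from typing import Iterable, Iterator, List, Tuple

_OPENERS = ("CT{", "WA{", "RT{")


def iter_tag_blocks(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (kind, body_lines) for each CT, WA or RT block, in file order."""
    kind = None
    body: List[str] = []
    for raw in lines:
        stripped = raw.strip()
        if kind is None:
            # Outside a block: ignore everything except an opening marker.
            if stripped in _OPENERS:
                kind = stripped[:2]
                body = []
        elif stripped == "}":
            yield kind, body
            kind = None
        else:
            body.append(stripped)
    if kind is not None:
        raise ValueError(f"unterminated {kind} block")


sample = """\
WA{
phrap_params phrap 990823:114356
placeholder data line
}

CT{
Contig206 repeat tagRepeats.perl 118732 119060 990823:115033 NoTrans
AluY
}

RT{
djs14_680.s1 matchElsewhereLowQual phrap 904 933 990823:114356
}
"""

blocks = list(iter_tag_blocks(sample.splitlines()))
kinds = [kind for kind, _ in blocks]  # ["WA", "CT", "RT"]
```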
    via animalgenome.org

    • 13 January 2010