Changeset 63


Timestamp:
Aug 11, 2010, 4:56:33 PM
Author:
j@…
bzr:base-revision:
j@dannynavarro.net-20100811134542-zwi05fd7d7sfa5b3
bzr:committer:
Danny Navarro <j@dannynavarro.net>
bzr:file-ids:

mzcms/parsers.py parsers.py-20100806092910-g1sxvv1o5b9umkof-1
bzr:mapping-version:
v4
bzr:repository-uuid:
724254b2-fbe6-419d-9466-c04ef4c9d29d
bzr:revision-id:
j@dannynavarro.net-20100811144627-6jp78gyefi7hkaxm
bzr:revno:
63
bzr:revprop:branch-nick:
trunk
bzr:root:
trunk
bzr:timestamp:
2010-08-11 16:46:27.408999920 +0200
bzr:user-agent:
bzr2.1.2+bzr-svn1.0.3
svn:original-date:
2010-08-11T14:46:27.409000Z
Message:

Fixed StopIteration exception in the fasta parser. Some sequences are still missing, though.

File:
1 edited

  • trunk/mzcms/parsers.py

--- trunk/mzcms/parsers.py (r62)
+++ trunk/mzcms/parsers.py (r63)
@@ -265,27 +265,30 @@
 
 def parse_fasta(fasta_path, proteins):
-    # XXX: Improve parser, don't seek back
+    # XXX: Improve parser, don't seek back. Make indepdent of protein
+    # folder
     """Update protein sequence container from fasta file"""
     fasta_file = open(fasta_path, 'rb')
     while True:
-        line = fasta_file.readline().decode('utf-8').strip()
-        if line.startswith('>IPI:IPI'):
-            seq_lines = list()
-            prot_id = line.split('|')[0][5:]
-            while True:
-                try:
-                    line = fasta_file.readline().decode('utf-8').strip()
-                except StopIteration:
-                    return
-                if not line.startswith('>'):
-                    seq_lines.append(line)
-                    last_line = fasta_file.tell()
-                else:
-                    sequence = ''.join(seq_lines)
-                    if prot_id in proteins and \
-                        proteins[prot_id].sequence == 'TBI':
-                        proteins[prot_id].sequence = sequence
-                        fasta_file.seek(last_line)
-                    break
+        try:
+            line = fasta_file.next().decode('utf-8').strip()
+        except StopIteration:
+            return
+        else:
+            if line.startswith('>IPI:IPI'):
+                seq_lines = list()
+                prot_id = line.split('|')[0][5:]
+                while True:
+                    line = fasta_file.next().decode('utf-8').strip()
+                    if not line.startswith('>'):
+                        seq_lines.append(line)
+                        last_line = fasta_file.tell()
+                    else:
+                        if seq_lines:
+                            sequence = ''.join(seq_lines)
+                            if prot_id in proteins and \
+                                    proteins[prot_id].sequence == 'TBI':
+                                proteins[prot_id].sequence = sequence
+                                fasta_file.seek(last_line)
+                        break
 
 # XXX: Use better defaults for containers and factories
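
The commit message notes that some sequences are still missing, and the XXX comment asks for a parser that does not seek back. The changeset does not confirm the cause, but one plausible reading of the r63 code is this: the inner loop consumes the header line that ends a record, and `seek(last_line)` rewinds only when a protein was actually updated. That would skip the record following an unmatched protein. The last record in the file also has no closing header to trigger the store. The following is a minimal single-pass sketch in Python 3 (the changeset itself is Python 2) that avoids `tell()`/`seek()` and flushes the final record at end of input. The `Protein` class, the `iter_fasta_records` and `update_sequences` names, the sample data, and the assumption that `proteins` behaves like a dict are all hypothetical. Only the `'>IPI:IPI'` prefix, the `split('|')[0][5:]` slicing, and the `'TBI'` placeholder come from the diff.

```python
import io


class Protein:
    """Hypothetical stand-in for the project's protein container."""

    def __init__(self, sequence='TBI'):
        self.sequence = sequence


def iter_fasta_records(lines):
    """Yield (header, sequence) pairs from an iterable of text lines."""
    header, seq_lines = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, ''.join(seq_lines)
            header, seq_lines = line, []
        elif header is not None:
            seq_lines.append(line)
    # Flush the final record: no further header follows it.
    if header is not None:
        yield header, ''.join(seq_lines)


def update_sequences(lines, proteins):
    """Fill in placeholder ('TBI') sequences for known IPI proteins."""
    for header, sequence in iter_fasta_records(lines):
        if not header.startswith('>IPI:IPI'):
            continue
        prot_id = header.split('|')[0][5:]  # same slicing as in the diff
        protein = proteins.get(prot_id)     # assumes a dict-like container
        if protein is not None and protein.sequence == 'TBI' and sequence:
            protein.sequence = sequence


# Made-up sample input: three records, the middle one is not requested.
SAMPLE = (
    ">IPI:IPI00000001.1|EXAMPLE first\n"
    "MKT\n"
    "AAV\n"
    ">IPI:IPI00000002.1|EXAMPLE second, not requested\n"
    "GGG\n"
    ">IPI:IPI00000003.1|EXAMPLE third, last in file\n"
    "PLW\n"
)

proteins = {'IPI00000001.1': Protein(), 'IPI00000003.1': Protein()}
update_sequences(io.StringIO(SAMPLE), proteins)
```

With this sample, the first and the last record both receive their sequences, and the unrequested middle record does not affect the record after it. For a real file, the lines could come from `open(fasta_path, encoding='utf-8')` instead of `io.StringIO`.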