Record¶
The Record class represents a single FASTA/FASTQ sequence record with name, sequence, quality, and comment fields.
Constructor¶
Creates a new sequence record.
Parameters
name (str): Sequence identifier (the part after > or @).
sequence (str): Nucleotide sequence.
quality (Optional[str]): Quality scores (for FASTQ records). Default None.
comment (Optional[str]): Optional comment (the part after the identifier on the same line). Default None.
Examples
# FASTA record
fa = Record("seq1", "ACGTACGT", comment="example")
# FASTQ record
fq = Record("read1", "ACGT", "IIII")
Attributes¶
All attributes are gettable and settable.
name → str¶
Sequence identifier.
sequence → str¶
Nucleotide sequence.
quality → str¶
Quality scores (empty string for FASTA records).
comment → Optional[str]¶
Comment line (without the leading space). None if absent.
length → int¶
Length of the sequence (same as len(record)).
Sequence Operations¶
upper(inplace=False)¶
Returns the sequence converted to uppercase.
Parameters
inplace (bool): If True, modifies the record’s sequence in place. Default False.
Returns
str: Uppercase sequence.
Example
rec = Record("id", "acgt")
print(rec.upper()) # "ACGT"
print(rec.sequence) # "acgt"
rec.upper(inplace=True)
print(rec.sequence) # "ACGT"
lower(inplace=False)¶
Returns the sequence converted to lowercase.
Parameters
inplace (bool): If True, modifies the record’s sequence in place. Default False.
Returns
str: Lowercase sequence.
reverse(inplace=False)¶
Returns the reverse of the sequence.
Parameters
inplace (bool): If True, reverses the record’s sequence in place. Default False.
Returns
str: Reversed sequence.
Example
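The behaviour described above can be sketched in pure Python. This does not call the library: `MiniRecord` is a hypothetical stand-in that models only the `name` and `sequence` fields, and the assumption that the in-place form also returns the reversed string is not confirmed by this page. Note that this is a plain reversal, not a reverse complement.

```python
class MiniRecord:
    """Hypothetical stand-in for Record; models only name and sequence."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

    def reverse(self, inplace=False):
        # Plain reversal: bases are reordered, not complemented.
        reversed_seq = self.sequence[::-1]
        if inplace:
            # Assumption: the in-place form also returns the new sequence.
            self.sequence = reversed_seq
        return reversed_seq


rec = MiniRecord("id", "AACG")
print(rec.reverse())       # GCAA
print(rec.sequence)        # AACG (unchanged)
rec.reverse(inplace=True)
print(rec.sequence)        # GCAA
```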
hpc()¶
Homopolymer compression: collapses each run of consecutive identical bases into a single base.
Returns
str: Compressed sequence.
Example
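As an illustration of what homopolymer compression does, here is a minimal pure-Python sketch. The standalone `hpc` function below is hypothetical and is not the library's implementation. It compares characters exactly, so `a` and `A` count as different bases; whether the real method is case-sensitive is not stated here.

```python
from itertools import groupby


def hpc(sequence):
    """Collapse each run of identical characters to a single character."""
    # groupby yields one (key, run) pair per maximal run of equal items,
    # so keeping only the keys drops the repeats.
    return "".join(base for base, _run in groupby(sequence))


print(hpc("AAACCGTTTTA"))  # ACGTA
```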
subseq(start, length)¶
Extracts a subsequence.
Parameters
start (int): Start index (0‑based).
length (int): Number of bases to extract.
Returns
str: Subsequence.
Raises
AssertionError: If start or start + length is out of bounds.
Example
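The documented contract can be sketched as a standalone function. This is a pure-Python illustration, not the library's code, and the exact boundary conditions (for instance, whether `start == len(sequence)` with `length == 0` is accepted) are assumptions.

```python
def subseq(sequence, start, length):
    """Return `length` characters of `sequence` beginning at 0-based `start`."""
    # Mirror the documented AssertionError for out-of-bounds requests.
    assert 0 <= start <= len(sequence), "start out of bounds"
    assert length >= 0 and start + length <= len(sequence), "end out of bounds"
    return sequence[start:start + length]


print(subseq("ACGTACGT", 2, 4))  # GTAC
```

In this sketch the checks are plain `assert` statements, which Python skips when run with the `-O` flag.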
kmers(k)¶
Generates all k‑mers (substrings of length k) from the sequence.
Parameters
k (int): K‑mer length.
Yields
str: Next k‑mer.
Raises
ValueError: If k > len(record).
Example
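A minimal pure-Python sketch of k-mer generation follows. The standalone `kmers` function is hypothetical and does not call the library. It assumes `k >= 1` and left-to-right order. Because it is a generator, its `ValueError` surfaces on first iteration rather than at call time, which may differ from the real method.

```python
def kmers(sequence, k):
    """Yield every length-k substring of `sequence`, left to right."""
    # Assumes k >= 1; the documented error case is k > len(sequence).
    if k > len(sequence):
        raise ValueError(f"k={k} exceeds sequence length {len(sequence)}")
    # A sequence of length n has n - k + 1 overlapping k-mers.
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


print(list(kmers("ACGTA", 3)))  # ['ACG', 'CGT', 'GTA']
```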
Slicing¶
Records support slice notation to extract subsequences:
Internally uses __gititem__ (note the typo in the current implementation). The slice must be contiguous; step is not supported.
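The contiguous-slice rule can be illustrated with a small stand-in class. `MiniRecord` is hypothetical, the sketch uses Python's standard `__getitem__` hook, and it assumes that slicing returns the subsequence as a plain string; none of this is taken from the library's source.

```python
class MiniRecord:
    """Hypothetical stand-in for Record, used only to illustrate slicing."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("only slices are supported in this sketch")
        if key.step not in (None, 1):
            # Contiguous slices only: reject rec[::2] and similar.
            raise ValueError("slice step is not supported")
        # Assumption: slicing returns the subsequence as a plain string.
        return self.sequence[key.start:key.stop]


rec = MiniRecord("seq1", "ACGTACGT")
print(rec[2:6])  # GTAC
print(rec[:3])   # ACG
```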
Magic Methods¶
__len__()¶
Returns sequence length. Equivalent to record.length.
__str__()¶
Returns a short representation: "seqioRecord(name=...)".
__repr__()¶
Returns a developer‑friendly representation: "seqioRecord(name=..., len=...)".
Internal Notes¶
- The Record class wraps a low‑level _seqioRecord C++ object.
- The _fromRecord class method is used internally by seqioFile to construct Python records from C++ objects.
- Quality strings are stored as empty strings for FASTA records; the quality property returns "", not None.
Pickling Support¶
Record instances can be serialized with pickle:
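A pickle round trip might look like the sketch below. It uses a hypothetical pure-Python stand-in, `MiniRecord`, rather than the library's class. Rebuilding through `__reduce__` is one common way to make a wrapper type picklable, and is an assumption about the approach rather than a description of the actual implementation.

```python
import pickle


class MiniRecord:
    """Hypothetical stand-in for Record with the four documented fields."""

    def __init__(self, name, sequence, quality=None, comment=None):
        self.name = name
        self.sequence = sequence
        self.quality = quality or ""  # FASTA records keep an empty string
        self.comment = comment

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling the constructor again.
        return (MiniRecord, (self.name, self.sequence, self.quality, self.comment))


original = MiniRecord("read1", "ACGT", "IIII")
restored = pickle.loads(pickle.dumps(original))
print(restored.name, restored.sequence, restored.quality)  # read1 ACGT IIII
```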