Class for the enzymatic digestion of proteins.
#include <OpenMS/CHEMISTRY/EnzymaticDigestion.h>
Classes | |
| struct | BindingSite |
| struct | CleavageModel |
Public Types | |
| enum | Enzyme { ENZYME_TRYPSIN, SIZE_OF_ENZYMES } |
| Possible enzymes for the digestion (adapt NamesOfEnzymes and nextCleavageSite_() if you add more enzymes here). | |
| enum | Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY } |
| When querying for valid digestion products, this determines whether the specificity of the two peptide ends is considered important. | |
Public Member Functions | |
| EnzymaticDigestion () | |
| Default constructor. | |
| EnzymaticDigestion (const EnzymaticDigestion &rhs) | |
| Copy constructor. | |
| EnzymaticDigestion & | operator= (const EnzymaticDigestion &rhs) |
| Assignment operator. | |
| SignedSize | getMissedCleavages () const |
| Returns the number of missed cleavages for the digestion. | |
| void | setMissedCleavages (SignedSize missed_cleavages) |
| Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used. | |
| Enzyme | getEnzyme () const |
| Returns the enzyme for the digestion. | |
| void | setEnzyme (Enzyme enzyme) |
| Sets the enzyme for the digestion (default is ENZYME_TRYPSIN). | |
| Specificity | getSpecificity () const |
| Returns the specificity for the digestion. | |
| void | setSpecificity (Specificity spec) |
| Sets the specificity for the digestion (default is SPEC_FULL). | |
| void | digest (const AASequence &protein, std::vector< AASequence > &output) const |
| Performs the enzymatic digestion of a protein. | |
| Size | peptideCount (const AASequence &protein) |
| Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings. | |
| bool | isLogModelEnabled () const |
| Returns true if the trained model is used for digestion. | |
| void | setLogModelEnabled (bool enabled) |
| Enables or disables the trained model. | |
| DoubleReal | getLogThreshold () const |
| Returns the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data). | |
| void | setLogThreshold (DoubleReal threshold) |
| bool | isValidProduct (const AASequence &protein, Size pep_pos, Size pep_length) |
| Returns true if the peptide at position pep_pos with length pep_length within protein was generated by the current model. | |
Static Public Member Functions | |
| static Enzyme | getEnzymeByName (const String &name) |
| static Specificity | getSpecificityByName (const String &name) |
Static Public Attributes | |
| static const std::string | NamesOfEnzymes [SIZE_OF_ENZYMES] |
| Names of the enzymes. | |
| static const std::string | NamesOfSpecificity [SIZE_OF_SPECIFICITY] |
| Names of the specificities. | |
Protected Member Functions | |
| void | nextCleavageSite_ (const AASequence &sequence, AASequence::ConstIterator &p) const |
| Moves the iterator p behind (i.e., C-terminal of) the next cleavage site of the sequence. | |
| bool | isCleavageSite_ (const AASequence &sequence, const AASequence::ConstIterator &p) const |
| Tests if the position pointed to by p (N-term side) is a valid cleavage site. | |
Protected Attributes | |
| SignedSize | missed_cleavages_ |
| Number of missed cleavages. | |
| Enzyme | enzyme_ |
| Used enzyme. | |
| Specificity | specificity_ |
| Specificity of the enzyme. | |
| bool | use_log_model_ |
| Use the log model or naive digestion (with missed cleavages). | |
| DoubleReal | log_model_threshold_ |
| Threshold to decide if a position is cleaved or missed (only for the model). | |
| Map< BindingSite, CleavageModel > | model_data_ |
| Holds the cleavage model. | |
Class for the enzymatic digestion of proteins.
Digestion can be performed using simple regular expressions, e.g. [KR] | [^P] for trypsin (cleave after K or R, but not before P). Missed cleavages can also be modelled, i.e. adjacent peptides remain joined because the enzyme malfunctioned or could not access the site. If n missed cleavages are given, all possible resulting peptides (cleaved and uncleaved) with up to n missed cleavages are returned; no random selection of just n specific missed cleavage sites is performed.
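The rule above can be illustrated with a minimal, self-contained sketch. It works on plain std::string instead of AASequence, and the names isTrypticSite and digestSketch are hypothetical helpers that are not part of OpenMS. It only shows how "up to n missed cleavages" expands into joined neighbouring fragments and is not meant to mirror the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not OpenMS API): true if the bond after residue i is a
// tryptic cleavage site, i.e. residue i is K or R and residue i+1 is not P.
bool isTrypticSite(const std::string& protein, std::size_t i)
{
  assert(i < protein.size());
  if (i + 1 >= protein.size()) return false;  // no bond after the last residue
  const char here = protein[i];
  return (here == 'K' || here == 'R') && protein[i + 1] != 'P';
}

// Hypothetical helper (not OpenMS API): returns all peptides with up to
// max_missed missed cleavages, ordered by number of missed cleavages.
std::vector<std::string> digestSketch(const std::string& protein, std::size_t max_missed)
{
  std::vector<std::string> peptides;
  if (protein.empty()) return peptides;

  // Start offsets of the fully cleaved fragments, plus the end offset as sentinel.
  std::vector<std::size_t> bounds{0};
  for (std::size_t i = 0; i < protein.size(); ++i)
  {
    if (isTrypticSite(protein, i)) bounds.push_back(i + 1);
  }
  bounds.push_back(protein.size());

  const std::size_t fragments = bounds.size() - 1;
  // A peptide with `missed` missed cleavages spans `missed + 1` adjacent fragments.
  for (std::size_t missed = 0; missed <= max_missed && missed < fragments; ++missed)
  {
    for (std::size_t first = 0; first + missed < fragments; ++first)
    {
      const std::size_t begin = bounds[first];
      const std::size_t end = bounds[first + missed + 1];
      peptides.push_back(protein.substr(begin, end - begin));
    }
  }
  return peptides;
}
```

For the toy sequence AKRPGKC, calling digestSketch with 0 missed cleavages yields AK, RPGK and C; allowing one missed cleavage adds AKRPGK and RPGKC.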
An alternative model is also available, where the protein is cleaved only at positions where a cleavage model trained on real data exceeds a certain threshold. The model is published in Siepen et al. (2007), "Prediction of missed cleavage sites in tryptic peptides aids protein identification in proteomics.", doi: 10.1021/pr060507u. The model is only available for trypsin and ignores the missed cleavage setting. You should, however, use setLogThreshold() to adjust FP vs FN rates. A higher threshold increases the number of cleavages predicted.
| enum Enzyme |
Possible enzymes for the digestion (adapt NamesOfEnzymes & nextCleavageSite_() if you add more enzymes here)
| Enumerator | |
|---|---|
| ENZYME_TRYPSIN | |
| SIZE_OF_ENZYMES | |
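| enum Specificity |
When querying for valid digestion products, this determines whether the specificity of the two peptide ends is considered important.
| Enumerator | |
|---|---|
| SPEC_FULL | |
| SPEC_SEMI | |
| SPEC_NONE | |
| SIZE_OF_SPECIFICITY | |
| EnzymaticDigestion | ( | ) |
Default constructor.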
| EnzymaticDigestion | ( | const EnzymaticDigestion & | rhs | ) |
Copy constructor.
| void digest | ( | const AASequence & | protein, |
| std::vector< AASequence > & | output | ||
| ) | const |
Performs the enzymatic digestion of a protein.
| Enzyme getEnzyme | ( | ) | const |
Returns the enzyme for the digestion.
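| static Enzyme getEnzymeByName | ( | const String & | name | ) |
Converts an enzyme name string to the corresponding enum value. Returns SIZE_OF_ENZYMES if the name is not valid.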
| DoubleReal getLogThreshold | ( | ) | const |
Returns the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data)
| SignedSize getMissedCleavages | ( | ) | const |
Returns the number of missed cleavages for the digestion.
| Specificity getSpecificity | ( | ) | const |
Returns the specificity for the digestion.
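| static Specificity getSpecificityByName | ( | const String & | name | ) |
Converts a specificity name string to the corresponding enum value. Returns SIZE_OF_SPECIFICITY if the name is not valid.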
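The name lookups (getEnzymeByName() and getSpecificityByName()) return the SIZE_OF_... enumerator as an "invalid" sentinel. The following stand-alone sketch shows that pattern for the specificity case. The enum is copied from this page, but the name strings "full", "semi" and "none" and the functions specificityNameSketch and specificityByNameSketch are assumptions made for illustration; the actual strings are defined in NamesOfSpecificity.

```cpp
#include <cassert>
#include <string>

enum Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY };

// Assumed display names (illustrative only); the real values are stored in
// EnzymaticDigestion::NamesOfSpecificity.
const std::string kNamesOfSpecificitySketch[SIZE_OF_SPECIFICITY] = {"full", "semi", "none"};

// Hypothetical helper: enum value to name.
const std::string& specificityNameSketch(Specificity spec)
{
  assert(spec < SIZE_OF_SPECIFICITY);
  return kNamesOfSpecificitySketch[spec];
}

// Hypothetical helper: name to enum value, with SIZE_OF_SPECIFICITY as the
// sentinel for an unknown name.
Specificity specificityByNameSketch(const std::string& name)
{
  for (int i = 0; i < SIZE_OF_SPECIFICITY; ++i)
  {
    if (kNamesOfSpecificitySketch[i] == name) return static_cast<Specificity>(i);
  }
  return SIZE_OF_SPECIFICITY;  // name is not valid
}
```

Callers should compare the result against SIZE_OF_SPECIFICITY (or SIZE_OF_ENZYMES) before passing it on to a setter.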
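| bool isCleavageSite_ | ( | const AASequence & | sequence, |
| const AASequence::ConstIterator & | p | ||
| ) | const | protected |
Tests if the position pointed to by p (N-term side) is a valid cleavage site.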
| bool isLogModelEnabled | ( | ) | const |
Returns true if the trained model is used for digestion.
| bool isValidProduct | ( | const AASequence & | protein, |
| Size | pep_pos, | ||
| Size | pep_length | ||
| ) |
Returns true if the peptide at position pep_pos with length pep_length within protein was generated by the current model.
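As a rough illustration of how the Specificity setting could enter such a check, the sketch below tests both peptide ends against the tryptic rule. The interpretation used here is an assumption, not taken from this page: SPEC_FULL requires both ends to be cleavage sites or protein termini, SPEC_SEMI requires one, and SPEC_NONE requires neither. The functions isBoundarySketch and isValidProductSketch are hypothetical, operate on std::string, and ignore both the missed cleavage setting and the log model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum Specificity { SPEC_FULL, SPEC_SEMI, SPEC_NONE, SIZE_OF_SPECIFICITY };

// Hypothetical helper: true if a peptide may start or end at offset pos, i.e.
// pos is a protein terminus or the bond before pos follows the tryptic rule.
bool isBoundarySketch(const std::string& protein, std::size_t pos)
{
  assert(pos <= protein.size());
  if (pos == 0 || pos == protein.size()) return true;
  const char before = protein[pos - 1];
  return (before == 'K' || before == 'R') && protein[pos] != 'P';
}

// Hypothetical helper: assumed reading of the three specificity levels.
bool isValidProductSketch(const std::string& protein, std::size_t pep_pos,
                          std::size_t pep_length, Specificity spec)
{
  // Reject empty peptides and ranges that run past the protein end.
  if (pep_length == 0 || pep_pos >= protein.size()) return false;
  if (pep_length > protein.size() - pep_pos) return false;

  const bool start_ok = isBoundarySketch(protein, pep_pos);
  const bool end_ok = isBoundarySketch(protein, pep_pos + pep_length);
  switch (spec)
  {
    case SPEC_FULL: return start_ok && end_ok;  // both ends specific
    case SPEC_SEMI: return start_ok || end_ok;  // at least one end specific
    case SPEC_NONE: return true;                // no requirement on the ends
    default:        return false;               // SIZE_OF_SPECIFICITY is not a setting
  }
}
```

With the toy sequence AKRPGKC, the range (pep_pos 2, pep_length 4) covers RPGK and passes under SPEC_FULL, while (3, 3) covers PGK and only passes under SPEC_SEMI or SPEC_NONE.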
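| void nextCleavageSite_ | ( | const AASequence & | sequence, |
| AASequence::ConstIterator & | p | ||
| ) | const | protected |
Moves the iterator p behind (i.e., C-terminal of) the next cleavage site of the sequence.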
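The iterator contract of nextCleavageSite_() can be pictured with the following sketch on std::string. It assumes the trypsin rule (K or R, not followed by P) and assumes that the iterator ends up at the end of the sequence when no further site exists; nextCleavageSiteSketch is a hypothetical function, not the protected member itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: advances p to just behind (C-terminal of) the next
// tryptic cleavage site, or to sequence.end() if there is none.
void nextCleavageSiteSketch(const std::string& sequence, std::string::const_iterator& p)
{
  assert(p >= sequence.begin() && p <= sequence.end());
  while (p != sequence.end())
  {
    const char residue = *p;
    ++p;  // p now points behind the residue that was just read
    const bool cleaves = (residue == 'K' || residue == 'R');
    if (cleaves && p != sequence.end() && *p != 'P') return;
  }
}
```

Calling the function repeatedly from sequence.begin() walks through the fragment boundaries one by one, which is how a digestion loop can slice the sequence into peptides.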
| EnzymaticDigestion& operator= | ( | const EnzymaticDigestion & | rhs | ) |
Assignment operator.
| Size peptideCount | ( | const AASequence & | protein | ) |
Returns the number of peptides a digestion of protein would yield under the current enzyme and missed cleavage settings.
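The count follows from simple arithmetic: if full cleavage yields k fragments, a peptide with m missed cleavages spans m + 1 adjacent fragments, and there are k - m such peptides. Summing over m from 0 to min(n, k - 1) gives the total for up to n missed cleavages. The sketch below implements this formula; peptideCountSketch is a hypothetical function that takes the fragment count directly rather than a protein, and it is not claimed to be the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of peptides obtained from `fragments` fully
// cleaved pieces (cleavage sites + 1) when up to `missed_cleavages` missed
// cleavages are allowed.
std::size_t peptideCountSketch(std::size_t fragments, std::size_t missed_cleavages)
{
  assert(fragments >= 1);
  // A peptide cannot span more than all fragments, so cap m at fragments - 1.
  const std::size_t max_missed = std::min(missed_cleavages, fragments - 1);
  std::size_t total = 0;
  for (std::size_t m = 0; m <= max_missed; ++m)
  {
    total += fragments - m;  // peptides spanning m + 1 adjacent fragments
  }
  return total;
}
```

For three fragments this gives 3 peptides with no missed cleavages, 5 with up to one, and 6 with up to two or more.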
| void setEnzyme | ( | Enzyme | enzyme | ) |
Sets the enzyme for the digestion (default is ENZYME_TRYPSIN).
| void setLogModelEnabled | ( | bool | enabled | ) |
Enables or disables the trained model.
| void setLogThreshold | ( | DoubleReal | threshold | ) |
Sets the threshold which needs to be exceeded to call a cleavage (only for the trained cleavage model on real data). Default is 0.25.
| void setMissedCleavages | ( | SignedSize | missed_cleavages | ) |
Sets the number of missed cleavages for the digestion (default is 0). This setting is ignored when the log model is used.
| void setSpecificity | ( | Specificity | spec | ) |
Sets the specificity for the digestion (default is SPEC_FULL).
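| Enzyme enzyme_ | protected |
Used enzyme.
| DoubleReal log_model_threshold_ | protected |
Threshold to decide if a position is cleaved or missed (only for the model).
| SignedSize missed_cleavages_ | protected |
Number of missed cleavages.
| Map< BindingSite, CleavageModel > model_data_ | protected |
Holds the cleavage model.
| const std::string NamesOfEnzymes [SIZE_OF_ENZYMES] | static |
Names of the enzymes.
| const std::string NamesOfSpecificity [SIZE_OF_SPECIFICITY] | static |
Names of the specificities.
| Specificity specificity_ | protected |
Specificity of the enzyme.
| bool use_log_model_ | protected |
Use the log model or naive digestion (with missed cleavages).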
| OpenMS / TOPP release 1.11.1 | Documentation generated on Thu Nov 14 2013 11:19:27 using doxygen 1.8.5 |