Protein or nucleic acid pattern search

Go to the tool for creating sequences by permutation

This site searches for patterns (PROSITE syntax) in a set of protein or nucleic acid sequences, either supplied by the user or taken from existing databases.
The search is run with the EMBOSS programs fuzzpro, fuzznuc, or fuzztran.

The result shows the order in which the patterns occur along each sequence.

Protein sequences / protein patterns
Nucleic acid sequences / nucleic acid patterns
Nucleic acid sequences / protein patterns (sequences will be translated)

Sequences to search	Patterns (PROSITE syntax) to look for

Or FASTA file

Show only sequences that contain all submitted patterns
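The "order of patterns" report and the "all submitted patterns" filter can be pictured with a minimal sketch. This is not the site's implementation: the function name `pattern_order`, its parameters, and the use of Python regular expressions in place of fuzzpro/fuzznuc hits are all assumptions made for illustration.

```python
import re

def pattern_order(sequences, regexes, require_all=True):
    """Return {sequence id: pattern names in order of occurrence}.

    sequences: dict mapping a sequence id to its residue string.
    regexes:   dict mapping a pattern name to a regular expression
               (assumed to be already translated from PROSITE syntax).
    require_all: if True, drop sequences missing at least one pattern.
    """
    report = {}
    for seq_id, seq in sequences.items():
        hits = []
        for name, rx in regexes.items():
            for m in re.finditer(rx, seq, re.IGNORECASE):
                # Store 1-based start positions so hits sort along the sequence.
                hits.append((m.start() + 1, name))
        hits.sort()
        found = {name for _, name in hits}
        if require_all and found != set(regexes):
            continue  # at least one submitted pattern is absent
        report[seq_id] = [name for _, name in hits]
    return report
```

With `require_all=False` the sketch reports every sequence that has at least zero hits, so sequences without any match appear with an empty list.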

The PROSITE pattern definition, taken from the PROSITE documentation, follows (protein patterns).
    The standard IUPAC one-letter codes for the amino acids are used.
    The symbol 'x' is used for a position where any amino acid is accepted.
    Ambiguities are indicated by listing the acceptable amino acids for a given position between square brackets '[ ]'. For example: [ALT] stands for Ala or Leu or Thr.
    Ambiguities are also indicated by listing, between a pair of curly brackets '{ }', the amino acids that are not accepted at a given position. For example: {AM} stands for any amino acid except Ala and Met.
    Each element in a pattern is separated from its neighbor by a '-'. (Optional in fuzzpro).
    Repetition of an element of the pattern can be indicated by following that element with a numerical value or a numerical range in parentheses. Examples: x(3) corresponds to x-x-x, x(2,4) corresponds to x-x or x-x-x or x-x-x-x.
    When a pattern is restricted to either the N- or C-terminus of a sequence, that pattern either starts with a '<' symbol or, respectively, ends with a '>' symbol.
    A period ends the pattern. (Optional in fuzzpro).
    No other characters, including spaces, are allowed.

For example, in SWISSPROT entry 100K_RAT you can look for the pattern:

[DE](2)HS{P}X(2)PX(2,4)C

This means: two residues, each either Asp or Glu (in any combination), followed by His, Ser, any residue other than Pro, then two of any residue, followed by Pro, followed by two to four of any residue, followed by Cys.

The search is case-independent, so 'AAA' matches 'aaa'.
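The rules above map closely onto regular-expression syntax. The following is a minimal sketch, not the fuzzpro implementation: the function name `prosite_to_regex` is hypothetical, and the sketch assumes a well-formed pattern (it does not handle mismatches or '>' inside square brackets).

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style protein pattern into a Python regex string."""
    p = pattern.strip().rstrip(".")   # the trailing period is optional
    p = p.replace("-", "")            # the '-' separators are optional
    out = []
    i = 0
    while i < len(p):
        c = p[i]
        if c == "<":                  # anchored at the N-terminus
            out.append("^")
        elif c == ">":                # anchored at the C-terminus
            out.append("$")
        elif c in "xX":               # any residue
            out.append(".")
        elif c == "[":                # accepted residues
            j = p.index("]", i)
            out.append("[" + p[i + 1:j] + "]")
            i = j
        elif c == "{":                # rejected residues
            j = p.index("}", i)
            out.append("[^" + p[i + 1:j] + "]")
            i = j
        elif c == "(":                # repetition count or range
            j = p.index(")", i)
            out.append("{" + p[i + 1:j] + "}")
            i = j
        elif c.isalpha():             # a single literal residue
            out.append(c)
        else:
            raise ValueError(f"unexpected character {c!r} in pattern")
        i += 1
    return "".join(out)

regex = prosite_to_regex("[DE](2)HS{P}X(2)PX(2,4)C")
# regex == "[DE]{2}HS[^P].{2}P.{2,4}C"
```

Passing `re.IGNORECASE` to `re.search(regex, sequence, re.IGNORECASE)` mirrors the case-independent behavior described above.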

The PROSITE pattern definition, adapted to nucleic acid patterns, follows.
    The standard IUPAC one-letter codes for the nucleotides are used.
    The symbol 'n' is used for a position where any nucleotide is accepted.
    Ambiguities are indicated by listing the acceptable nucleotides for a given position between square brackets '[ ]'. For example: [ACG] stands for A or C or G.
    Ambiguities are also indicated by listing, between a pair of curly brackets '{ }', the nucleotides that are not accepted at a given position. For example: {AG} stands for any nucleotide except A and G.
    Each element in a pattern is separated from its neighbor by a '-'. (Optional in fuzznuc).
    Repetition of an element of the pattern can be indicated by following that element with a numerical value or a numerical range in parentheses. Examples: N(3) corresponds to N-N-N, N(2,4) corresponds to N-N or N-N-N or N-N-N-N.
    When a pattern is restricted to either the 5' or 3' end of a sequence, that pattern either starts with a '<' symbol or, respectively, ends with a '>' symbol.
    A period ends the pattern. (Optional in fuzznuc).

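For example:

[CG](5)TG{A}N(1,5)C

This means: five nucleotides, each either C or G, followed by T, G, any nucleotide other than A, then one to five of any nucleotide, followed by C.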
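To make the nucleic example concrete, here is a small sketch that hand-translates [CG](5)TG{A}N(1,5)C into a Python regular expression and reports where it matches. It is an illustration only, not fuzznuc itself: the names `NUC_REGEX` and `find_hits` are hypothetical, 'N' is rendered as '.' (any character), and only the strand given is scanned.

```python
import re

# Hand translation of the PROSITE-style pattern [CG](5)TG{A}N(1,5)C:
#   [CG](5) -> [CG]{5}   five positions, each C or G
#   {A}     -> [^A]      anything but A
#   N(1,5)  -> .{1,5}    one to five of any nucleotide
NUC_REGEX = re.compile(r"[CG]{5}TG[^A].{1,5}C", re.IGNORECASE)

def find_hits(sequence: str):
    """Return (start, end, matched text) for each non-overlapping hit.

    Positions are 1-based and inclusive, the usual convention for
    sequence coordinates.
    """
    return [(m.start() + 1, m.end(), m.group())
            for m in NUC_REGEX.finditer(sequence)]

hits = find_hits("atgcgcgtgtaactt")
# hits == [(3, 13, "gcgcgtgtaac")]
```

Because the search is compiled with `re.IGNORECASE`, the lower-case input matches just as upper-case input would.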