Regex memo

Written by Paulo
Filed under: Bash, Linux, Miscellaneous. Tags: regex

Memo of the 'character classes' usable in vim, sed and probably many other tools. Python's built-in re module does not recognize them.
Note: to use the expressions below, they must be enclosed in a bracket expression [].
e.g. [[:digit:]]

    [:alnum:] Alphanumeric characters: ‘[:alpha:]’ and ‘[:digit:]’; in the ‘C’ locale and ASCII character encoding, this is the same as ‘[0-9A-Za-z]’.

    [:alpha:] Alphabetic characters: ‘[:lower:]’ and ‘[:upper:]’; in the ‘C’ locale and ASCII character encoding, this is the same as ‘[A-Za-z]’.

    [:blank:] Blank characters: space and tab.

    [:cntrl:] Control characters. In ASCII, these characters have octal codes 000 through 037, and 177 (DEL). In other character sets, these are the equivalent characters, if any.

    [:digit:] Digits: 0 1 2 3 4 5 6 7 8 9.

    [:graph:] Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’.

    [:lower:] Lower-case letters; in the ‘C’ locale and ASCII character encoding, this is a b c d e f g h i j k l m n o p q r s t u v w x y z.

    [:print:] Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space.

    [:punct:] Punctuation characters; in the ‘C’ locale and ASCII character encoding, this is ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

    [:space:] Space characters: in the ‘C’ locale, this is tab, newline, vertical tab, form feed, carriage return, and space.

    [:upper:] Upper-case letters: in the ‘C’ locale and ASCII character encoding, this is A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.

    [:xdigit:] Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f.
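
Python's built-in re module does not understand bracket classes such as [[:digit:]], so a pattern ported from sed or vim has to spell them out. The following is a minimal sketch, assuming the 'C' locale and ASCII encoding; POSIX_CLASSES and expand_posix are hypothetical names, not part of any library.

```python
import re

# ASCII / 'C' locale equivalents of the POSIX classes, written as the
# *inside* of a bracket expression (hypothetical helper table).
POSIX_CLASSES = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "blank": " \\t",
    "cntrl": "\\x00-\\x1f\\x7f",
    "digit": "0-9",
    "graph": "\\x21-\\x7e",
    "lower": "a-z",
    "print": "\\x20-\\x7e",
    "punct": "!-/:-@\\[-`{-~",
    "space": " \\t\\n\\v\\f\\r",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}


def expand_posix(pattern):
    """Replace every [:name:] in pattern with its ASCII range."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)


digits = re.compile(expand_posix("[[:digit:]]+"))
print(digits.findall("eth0 up, mtu 1500"))  # ['0', '1500']
```

An unknown class name raises KeyError here, which seems preferable to silently compiling a pattern that means something else.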

source :


Vim, metacharacters:

.     any character except new line
\s    whitespace character
\S    non-whitespace character
\x    hex digit
\X    non-hex digit
\o    octal digit
\O    non-octal digit
\h    head of word character (a,b,c...z,A,B,C...Z and _)
\H    non-head of word character
\p    printable character
\P    like \p, but excluding digits
\w    word character
\W    non-word character
\a    alphabetic character
\A    non-alphabetic character
\l    lowercase character
\L    non-lowercase character
\u    uppercase character
\U    non-uppercase character
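
Assuming the list above describes Vim's shortcuts (\s, \x, \o, \h, \p, \w, \a, \l, \u and their upper-case negations), only \s, \S, \w and \W have a similar meaning in Python's re module, and even those are Unicode-aware there unless re.ASCII is set. Several of the others mean something different in Python: \A anchors the start of the string, \a is the bell character, and \x and \u introduce character escapes. A sketch of ASCII equivalents, where VIM_TO_PYTHON is a hypothetical name:

```python
import re

# ASCII equivalents for Vim shortcuts that mean something else (or nothing)
# in Python's re module. Hypothetical table, for illustration only.
VIM_TO_PYTHON = {
    r"\x": "[0-9A-Fa-f]",   # hex digit
    r"\X": "[^0-9A-Fa-f]",  # non-hex digit
    r"\o": "[0-7]",         # octal digit
    r"\O": "[^0-7]",        # non-octal digit
    r"\h": "[A-Za-z_]",     # head-of-word character
    r"\H": "[^A-Za-z_]",    # non-head-of-word character
    r"\a": "[A-Za-z]",      # alphabetic character
    r"\A": "[^A-Za-z]",     # non-alphabetic character
    r"\l": "[a-z]",         # lowercase character
    r"\L": "[^a-z]",        # non-lowercase character
    r"\u": "[A-Z]",         # uppercase character
    r"\U": "[^A-Z]",        # non-uppercase character
}

# An identifier: one head-of-word character, then word characters.
identifier = re.compile(VIM_TO_PYTHON[r"\h"] + r"\w*", re.ASCII)
print(identifier.findall("2 eth0 _tmp"))  # ['eth0', '_tmp']
```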


Vim, greedy quantifiers:

*        matches 0 or more of the preceding characters, ranges or metacharacters; .* matches everything, including an empty line
\+       matches 1 or more of the preceding characters...
\=       matches 0 or 1 of the preceding characters...
\{n,m}   matches from n to m of the preceding characters...
\{n}     matches exactly n of the preceding characters...
\{,m}    matches at most m (from 0 to m) of the preceding characters...
\{n,}    matches at least n of the preceding characters...
         where n and m are non-negative integers (>= 0)
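
Assuming the quantifiers above are Vim's *, \+, \= and \{n,m} family, the same counts are written *, +, ? and {n,m} in Python. A quick sketch of each form on a made-up sample string:

```python
import re

text = "a ab abb abbb abbbb"

print(re.findall(r"ab*", text))      # 0 or more 'b'
print(re.findall(r"ab+", text))      # 1 or more 'b'
print(re.findall(r"ab?", text))      # 0 or 1 'b'
print(re.findall(r"ab{2,3}", text))  # from 2 to 3 'b'
print(re.findall(r"ab{2}", text))    # exactly 2 'b'
print(re.findall(r"ab{,2}", text))   # at most 2 'b'
print(re.findall(r"ab{3,}", text))   # at least 3 'b'
```

All of these are greedy: each one takes as many 'b' characters as its bounds allow.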

Vim, non-greedy quantifiers:

\{-}     matches 0 or more of the preceding atom, as few as possible
\{-n,m}  matches from n to m of the preceding atom, as few as possible
\{-n,}   matches at least n of the preceding atom, as few as possible
\{-,m}   matches at most m (from 0 to m) of the preceding atom, as few as possible
         where n and m are non-negative integers (>= 0)
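
Assuming the list above is Vim's non-greedy \{-} family, the Python counterpart is a ? appended to the quantifier (*?, +?, {n,m}?). A sketch of the difference on a hypothetical sample line:

```python
import re

line = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.*>", line)  # as many characters as possible
lazy = re.findall(r"<.*?>", line)   # as few characters as possible

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']

# Bounded repeats follow the same rule.
print(re.match(r"a{2,4}", "aaaaa").group())   # aaaa
print(re.match(r"a{2,4}?", "aaaaa").group())  # aa
```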

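Python, working with groups:

import re

# Group definitions: device, tag and alias.
# Only the 'device' part survives in the original; the 'tag' and 'alias'
# sub-patterns below are an assumption (device.tag:alias).
RE_IFACE = re.compile(r'^(?P<device>[^.]+)'
                      r'(?:\.(?P<tag>[^:]+))?'
                      r'(?::(?P<alias>.+))?$')

iface = 'eth0.100:1'  # hypothetical sample value

m = RE_IFACE.match(iface)
if m:
    # Access the content of each group
    print(m.groups())
    device, tag, alias = m.group('device', 'tag', 'alias')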
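
As a complement, groupdict() returns all named groups at once, with None for optional groups that did not take part in the match. This is a sketch under assumptions: the original expression is not fully shown, so the pattern below (device, optional ".tag", optional ":alias") is a guess at its intent, and parse_iface is a hypothetical helper. The device class is tightened to [^.:]+ so that an alias without a tag is still split off.

```python
import re

# Assumed pattern: device, optional ".tag", optional ":alias".
RE_IFACE = re.compile(
    r"^(?P<device>[^.:]+)"    # e.g. eth0
    r"(?:\.(?P<tag>\d+))?"    # e.g. .100 (optional)
    r"(?::(?P<alias>\d+))?$"  # e.g. :1 (optional)
)


def parse_iface(iface):
    """Return a dict of named groups, or None when iface does not match."""
    m = RE_IFACE.match(iface)
    return m.groupdict() if m else None


print(parse_iface("eth0.100:1"))  # {'device': 'eth0', 'tag': '100', 'alias': '1'}
print(parse_iface("eth0"))        # {'device': 'eth0', 'tag': None, 'alias': None}
```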

