spaCy
Advanced NLP with spaCy
Ines Montani
spaCy core developer
The nlp object
# Import the English language class
from spacy.lang.en import English
Hello
world
!
world
world!
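The fragments above read like the tokens of a short text, followed by a single token and a span taken from it. A minimal sketch of how such output might be produced, assuming spaCy is installed and assuming the example text is "Hello world!" (the slides do not state it explicitly). The helper name `tokens_token_span` is hypothetical, and the import is guarded so the snippet does nothing if spaCy is missing.

```python
# Guarded import: spaCy is a third-party package and may not be installed.
try:
    from spacy.lang.en import English
except ImportError:
    English = None


def tokens_token_span(text):
    """Return (all token texts, text of token 1, text of the slice [1:3])."""
    nlp = English()   # blank English pipeline: tokenizer only, no statistical model
    doc = nlp(text)   # calling nlp on a string returns a Doc
    token = doc[1]    # indexing a Doc gives a Token
    span = doc[1:3]   # slicing a Doc gives a Span (end index is exclusive)
    return [t.text for t in doc], token.text, span.text


if English is not None:
    print(tokens_token_span("Hello world!"))
```

Indexing and slicing follow ordinary Python conventions, so `doc[1:3]` covers the tokens at positions 1 and 2.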
Index: [0, 1, 2, 3, 4]
Text: ['It', 'costs', '$', '5', '.']
is_alpha: [True, True, False, False, False]
is_punct: [False, False, False, False, True]
like_num: [False, False, False, True, False]
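To make the three attribute rows above concrete, here is a rough plain-Python approximation that needs no spaCy install. It is not spaCy's implementation: the helpers `is_punct` and `like_num` are simplified stand-ins written for this illustration (spaCy's own `like_num`, for example, also accepts number words such as "ten"), and the token list is copied from the table rather than produced by a tokenizer.

```python
import unicodedata


def is_punct(text):
    """True if every character is in a Unicode punctuation category (P*)."""
    return bool(text) and all(
        unicodedata.category(ch).startswith("P") for ch in text
    )


def like_num(text):
    """Simplified stand-in: digits, optionally with ',' or '.' separators."""
    stripped = text.replace(",", "").replace(".", "")
    return stripped.isdigit()


# Tokens copied from the table above, not produced by a tokenizer.
tokens = ["It", "costs", "$", "5", "."]

print("is_alpha:", [t.isalpha() for t in tokens])
print("is_punct:", [is_punct(t) for t in tokens])
print("like_num:", [like_num(t) for t in tokens])
```

Note that "$" is a currency symbol rather than punctuation in Unicode, which is why it is False in all three rows.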
What are statistical models?
Enable spaCy to predict linguistic attributes in context
Part-of-speech tags
Syntactic dependencies
Named entities
import spacy
nlp = spacy.load('en_core_web_sm')
Binary weights
Vocabulary
# Process a text
doc = nlp("She ate the pizza")
for token in doc:
    print(token.text, token.pos_)
She PRON
ate VERB
the DET
pizza NOUN
# Process a text
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
for ent in doc.ents:
    print(ent.text, ent.label_)
Apple ORG
U.K. GPE
$1 billion MONEY
spacy.explain('GPE')
'Countries, cities, states'
spacy.explain('NNP')
'noun, proper singular'
spacy.explain('dobj')
'direct object'
Why not just regular expressions?
Match on Doc objects, not just strings
iPhone X
loved dogs
love cats
bought a smartphone
buying apps
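As an illustration of matching on tokens rather than raw strings, the sketch below looks for the first phrase above, "iPhone X", with spaCy's rule-based `Matcher`. It assumes spaCy is installed and uses the list-of-patterns form of `Matcher.add` (the spaCy v3 signature; older v2 releases used a different one). The function name `find_iphone_x`, the pattern key `IPHONE_PATTERN`, and the sample sentence are hypothetical.

```python
# Guarded import: spaCy is a third-party package and may not be installed.
try:
    from spacy.lang.en import English
    from spacy.matcher import Matcher
except ImportError:
    English = Matcher = None


def find_iphone_x(text):
    """Return the text of every span matching the two tokens 'iPhone', 'X'."""
    nlp = English()                    # blank pipeline: tokenizer only
    matcher = Matcher(nlp.vocab)       # the matcher shares the pipeline's vocab
    # One dict per token; TEXT requires an exact match on the token text.
    pattern = [{"TEXT": "iPhone"}, {"TEXT": "X"}]
    matcher.add("IPHONE_PATTERN", [pattern])   # assumes the spaCy v3 signature
    doc = nlp(text)
    # Each match is a (match_id, start, end) tuple of token indices.
    return [doc[start:end].text for _match_id, start, end in matcher(doc)]


if English is not None:
    print(find_iphone_x("Upcoming iPhone X release date leaked"))
```

The other phrases in the list ("loved dogs", "love cats", "bought a smartphone", "buying apps") would presumably be matched on lemmas and part-of-speech tags rather than exact text, which needs a pipeline with a statistical model and is not covered by this sketch.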