LINGUIST List 26.3643

Fri Aug 14 2015

Software: English; Portuguese; Spanish; Computational Linguistics; Semantics; Syntax; Text/Corpus Linguistics: SentiLecto 2.4

Editor for this issue: Andrew Lamont

Date: 13-Aug-2015
From: Fernando Balbachan
Subject: English; Portuguese; Spanish; Computational Linguistics; Semantics; Syntax; Text/Corpus Linguistics: SentiLecto 2.4

SentiLecto demo:

SentiLecto is an NLU engine that yields a highly fine-grained representation of complex texts. The pipeline starts by splitting text into sentences and clauses, then maps clauses onto SVO slots, much as a native speaker would understand natural language. SentiLecto relies on rich linguistic features such as passive/active voice transformation, negation scope, anaphora resolution and co-reference chains, modality treatment, semantic features (animacy and others), and accurate verbal frames for all Spanish verbs, including 'se-impersonal' usages ('se mostraron retratos' = 'alguien mostró retratos' = 'somebody showed portraits') and 'se-clitic' usages (for example, the plain action 'mostrar', 'to show something', vs. 'mostrarSE', 'to show oneself', namely to display a certain feeling in the face of a situation).
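The 'se-impersonal' equivalence above can be illustrated with a minimal Python sketch. It is not SentiLecto's implementation: the lemma table, the `SVO` class, and the `map_clause` function are hypothetical names made up for this illustration, and the input is assumed to be an already tokenized, very short clause.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SVO:
    """One clause mapped onto subject-verb-object slots."""
    subject: str
    verb: str            # verb lemma
    obj: Optional[str]   # None when the clause has no object


# Hypothetical toy lemma table; a real engine would use a full morphological lexicon.
LEMMAS = {"mostraron": "mostrar", "mostró": "mostrar"}


def map_clause(tokens: List[str]) -> SVO:
    """Map a tiny, pre-tokenized Spanish clause onto SVO slots (toy rule set)."""
    if len(tokens) < 2:
        raise ValueError("expected at least two tokens")
    words = [t.lower() for t in tokens]
    if words[0] == "se" and len(words) >= 3:
        # 'se' + verb + noun phrase is read here as se-impersonal:
        # the agent is unspecified, so the subject slot is filled with 'alguien'.
        verb = LEMMAS.get(words[1], words[1])
        return SVO("alguien", verb, " ".join(words[2:]))
    # Fallback: assume plain subject-verb-object word order.
    subject, verb, *rest = words
    return SVO(subject, LEMMAS.get(verb, verb), " ".join(rest) or None)


impersonal = map_clause("se mostraron retratos".split())
active = map_clause("alguien mostró retratos".split())
print(impersonal == active)
```

Under these assumptions, both clauses map to the same slot structure, which is the sense in which the two Spanish sentences are treated as equivalent.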

SentiLecto can also flawlessly identify whether or not an utterance states a real fact (fact mining) over which an opinion could span, and it can recognize and classify named entities (NERC) with identity matching.

Finally, SentiLecto is better suited to the entity-based sentiment analysis paradigm. Unlike other approaches, it can deal with polarity shifting within the same sentence ('I like chocolate but I hate strawberry ice-cream'), within embedded clauses ('Norwegians, who are an aggressive people, export the exquisite herring'), or even within a single word ('somebody who wasted a chance to do something' means that the person did something bad with something good). SentiLecto follows the premise that the entities involved in an opinion are syntactically mapped onto SVO (subject-verb-object) slots for sentiment assignment: 'Mary hates John' (two entities, but only the object receives a negative presentation) vs. 'Mary defames John' (the same two entities, but only the subject receives a negative presentation).
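The slot-based sentiment assignment in the 'Mary hates John' vs. 'Mary defames John' contrast can be sketched in Python as follows. This is an illustrative toy under stated assumptions, not SentiLecto's actual method: the `VERB_FRAMES` table, the `Frame` type, and both functions are hypothetical, clauses are assumed to be in plain subject-verb-object order, and clause splitting is reduced to cutting on the word 'but'.

```python
from typing import Dict, List, NamedTuple


class Frame(NamedTuple):
    """Polarity that a verb projects onto each slot: -1, 0, or +1."""
    subject: int
    obj: int


# Hypothetical toy verb frames; a real system would need a full lexicon.
VERB_FRAMES: Dict[str, Frame] = {
    "hates": Frame(subject=0, obj=-1),    # the object is presented negatively
    "hate": Frame(subject=0, obj=-1),
    "defames": Frame(subject=-1, obj=0),  # the subject is presented negatively
    "like": Frame(subject=0, obj=+1),
}


def entity_sentiment(subject: str, verb: str, obj: str) -> Dict[str, int]:
    """Assign a polarity to each entity according to the verb's frame."""
    frame = VERB_FRAMES.get(verb, Frame(0, 0))  # unknown verbs are neutral
    return {subject: frame.subject, obj: frame.obj}


def sentence_sentiment(sentence: str) -> List[Dict[str, int]]:
    """Score each 'but'-separated clause on its own (naive clause splitting)."""
    results = []
    for clause in sentence.split(" but "):
        parts = clause.split(maxsplit=2)  # subject, verb, rest of clause as object
        if len(parts) == 3:
            results.append(entity_sentiment(*parts))
    return results


print(entity_sentiment("Mary", "hates", "John"))
print(entity_sentiment("Mary", "defames", "John"))
print(sentence_sentiment("I like chocolate but I hate strawberry ice-cream"))
```

Because polarity is attached to slots rather than to the sentence as a whole, the same two entities receive different scores depending on the verb, and each clause of a sentence can carry its own polarity.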

SentiLecto is being used to automatically generate this blog, which publishes more than 300 high-quality posts daily, rewriting and enriching content and, more interestingly, merging news items that cover the same facts. This is just a showcase of SentiLecto's NLU capabilities.

SentiLecto currently works only for Spanish, but it will soon be available for Brazilian Portuguese (in 1 month) and English (in 3 months).

We look forward to hearing linguists' feedback.

Dr. Fernando Balbachan, Ph.D.

Linguistic Field(s): Computational Linguistics
                            Text/Corpus Linguistics

Subject Language(s): English (eng)
                            Portuguese (por)
                            Spanish (spa)

Page Updated: 14-Aug-2015