-
Annotazione semantica di delibere comunali
Proof-of-concept project on applying text mining techniques to public administration documents to improve transparency and access to information by... -
Synthetic Datasets for Fine-Grained Fairness Analysis of Abusive Language Det...
Three synthetic datasets covering different types of bias grouped by target, namely sexism, racism and ableism. The reason for distinguishing the records by abuse targets is...-
CSV
-
Articles and comments of major Estonian newspapers
The dataset contains articles and comments from four major Estonian news portals, from the early 2000s to 2016. -
Brexit Twitter User Vote Intent
A list of users for whom vote intent in the UK EU membership referendum has been established. -
Sheffield NERD Tweet Corpus
The dataset contains 794 tweets annotated with named entities disambiguated against DBpedia, and split into equally sized training and test portions. 400 tweets from 2013 come...-
FINF
-
UK General Election Vote Intent
A list of Twitter users for whom party political allegiance/vote intent has been established. -
Introduction to Data Curation
This course is an introduction to data collection, data preparation & transformation and data analysis. It contains the essential concepts for a researcher in order to...-
PDF
-
Twitter social bots
Spambots are automated accounts (i.e., accounts driven by a bot) that repeatedly advertise unsolicited and often harmful content (e.g., malware, URLs to phishing Web sites,... -
Broad Twitter Corpus
The Broad Twitter Corpus is a named entity-annotated dataset of tweets, collected in order to capture temporal, spatial and social diversity. The goal of the corpus is to...-
JSON
-
Twitter fake followers
Fake followers are fake accounts created en masse to follow a target account, and they can be bought from online markets. In other words, their goal is to increase the... -
Measurement Expression Annotator
Annotates numbers and measurement expressions in text. This method recognises many types of measurements including length, temperature, time and speed, and calculates their...-
method-engine
-
SWAT
SWAT is an entity-salience system which identifies on-the-fly the semantic focus of a document, expressed by its Salient Wikipedia Entities. The core of this technology is... -
Twitter Opinion Mining English
This tool recognises opinionated sentences in English tweets and it classifies them as positive or negative. It also indicates emotion type, author and target of the opinion,...-
method-engine
-
Summa Text Summarization (Es)
The SUMMA Text Summarization (ES) uses the SUMMA toolkit developed by Horacio Saggion to provide a generic Spanish document summarizer.-
method-engine
-
GATE Cloud COVID-19 Misinformation Categoriser
A machine learning classifier trained to categorise claims about COVID-19 into 10 categories proposed by the Reuters Institute for the Study of Journalism - Public authority...-
method-engine
-
DecarboNet Environmental Annotator
The DecarboNet environmental annotation service identifies named entities, environmental terms, linguistic features and sentiment in social media texts.-
method-engine
-
WAT
WAT is an entity linker, namely a tool that identifies meaningful substrings (called "spots") in an unstructured English text and links each of them to the unambiguous entity...-
HTML
-
Part Of Speech Tagger For Tweets
This service tags tweets with part-of-speech information, e.g. nouns and verbs.-
method-engine
-
GATE Cloud Rumour Veracity Classifier
User-generated content such as tweets often makes claims that are unsubstantiated and possibly untrue. This service attempts to classify whether a text is discussing a rumour...-
method-engine
-
German Named Entity Recognizer For Tweets
This method analyses German tweets for names of persons, locations and organizations. It also performs normalization of abbreviations and common Twitter slang.-
method-engine