56 items found

Groups: Societal Debates and Misinformation
Tags: Text mining

  • Experiment

    Semantic Annotation of Municipal Resolutions (Annotazione semantica di delibere comunali)

    Proof-of-concept project on applying text mining techniques to public administration documents to improve transparency and access to information by...
    • PDF: 'Annotazione Delibere' (login required)
    • ZIP: 'Codice sorgente' (source code; login required)
  • Dataset

    Synthetic Datasets for Fine-Grained Fairness Analysis of Abusive Language Det...

    Three synthetic datasets covering different types of bias grouped by target, namely sexism, racism and ableism. The reason for distinguishing the records by abuse targets is...
    • CSV: 'Synthetic Datasets for ...' (login required)
  • Dataset

    Articles and comments of major Estonian newspapers

    The dataset contains articles and comments from four major Estonian news portals, from the early 2000s to 2016.
  • Dataset

    Brexit Twitter User Vote Intent

    A list of users for whom vote intent in the UK EU membership referendum has been established.
  • Dataset

    Sheffield NERD Tweet Corpus

    The dataset contains 794 tweets annotated with named entities disambiguated against DBpedia, and split into equally sized training and test portions. 400 tweets from 2013 come...
    • FINF: 'Sheffield NERD Tweet Corpus' (login required)
  • Dataset

    UK General Election Vote Intent

    A list of Twitter users for whom party political allegiance/vote intent has been established.
  • TrainingMaterial

    Introduction to Data Curation

    This course is an introduction to data collection, data preparation & transformation and data analysis. It contains the essential concepts for a researcher in order to...
    • PDF: 'Introduction to Data Curation' (login required)
  • Dataset

    Twitter social bots

    Spambots are automated accounts (i.e., accounts driven by a bot) that repeatedly advertise unsolicited and often harmful content (e.g., malware, URLs to phishing Web sites,...
  • Dataset

    Broad Twitter Corpus

    The Broad Twitter Corpus is a named entity-annotated dataset of tweets, collected in order to capture temporal, spatial and social diversity. The goal of the corpus is to...
    • JSON: 'Broad Twitter Corpus' (login required)
  • Dataset

    Twitter fake followers

    Fake followers are fake accounts created en masse to follow a target account, and they can be bought from online markets. In other words, their goal is to increase the...
  • Method

    Measurement Expression Annotator

    Annotates numbers and measurement expressions in text. This method recognises many types of measurements including length, temperature, time and speed, and calculates their...
    • method-engine: 'Run method' (login required)
  • Application

    SWAT

    SWAT is an entity-salience system that identifies, on the fly, the semantic focus of a document, expressed by its Salient Wikipedia Entities. The core of this technology is...
  • Method

    Twitter Opinion Mining English

    This tool recognises opinionated sentences in English tweets and classifies them as positive or negative. It also indicates emotion type, author and target of the opinion,...
    • method-engine: 'Run method' (login required)
  • Method

    Summa Text Summarization (Es)

    The SUMMA Text Summarization (ES) method uses the SUMMA toolkit, developed by Horacio Saggion, to provide a generic Spanish document summarizer.
    • method-engine: 'Run method' (login required)
  • Method

    GATE Cloud COVID-19 Misinformation Categoriser

    A machine learning classifier trained to categorise claims about COVID-19 into 10 categories proposed by the Reuters Institute for the Study of Journalism: Public authority...
    • method-engine: 'Method Engine' (login required)
  • Method

    DecarboNet Environmental Annotator

    The DecarboNet environmental annotation service identifies named entities, environmental terms, linguistic features and sentiment in social media texts.
    • method-engine: 'Run method' (login required)
  • Application

    WAT

    WAT is an entity linker, namely a tool that identifies meaningful substrings (called "spots") in an unstructured English text and links each of them to the unambiguous entity...
    • HTML: 'Link to the Application' (login required)
  • Method

    Part Of Speech Tagger For Tweets

    This service tags tweets with part-of-speech information, e.g. nouns and verbs.
    • method-engine: 'Run method' (login required)
  • Method

    GATE Cloud Rumour Veracity Classifier

    User-generated content such as tweets often makes claims that are unsubstantiated and possibly untrue. This service attempts to classify whether a text is discussing a rumour...
    • method-engine: 'Method Engine' (login required)
  • Method

    German Named Entity Recognizer For Tweets

    This method analyses German tweets for names of persons, locations and organizations. It also normalizes abbreviations and common Twitter slang.
    • method-engine: 'Run method' (login required)