The HEREDITARY project is launching GutBrainIE, Task #6 of the BioASQ Lab, as part of the CLEF 2026 conference, to be held from September 21 to 24, 2026, at the Friedrich-Schiller-Universität in Jena, Germany.

GutBrainIE is a Natural Language Processing (NLP) challenge focused on advancing information extraction from biomedical literature. In this edition, participants will be asked to develop and benchmark NLP systems capable of extracting structured knowledge from PubMed abstracts related to the gut-brain axis and its associations with Alzheimer’s disease, Parkinson’s disease, Multiple Sclerosis, Amyotrophic Lateral Sclerosis (ALS), and mental health.

Subtasks Overview

The GutBrainIE task is divided into two main parts. In the first, participants identify specific text spans and classify them into predefined categories; in the second, they determine whether a particular relation defined between two categories holds.

These are further divided into four subtasks covering entity recognition, disambiguation, and relation extraction:

  • Subtask 6.1.1 – Named Entity Recognition (NER)

Participants must identify text spans and classify them into one of 13 predefined categories, such as bacteria, chemical or microbiota.

  • Subtask 6.1.2 – Named Entity Recognition and Disambiguation (NERD)

Building on Subtask 6.1.1, entity mentions must be linked to concept identifiers from selected biomedical reference resources.

  • Subtask 6.2.1 – Mention-Level Relation Extraction (M-RE)

Teams must detect relations between specific entity mentions within abstracts.

  • Subtask 6.2.2 – Concept-Level Relation Extraction (C-RE)

This subtask moves relation extraction to the concept level, enabling systems to capture deeper knowledge connections.

Each task requires participants to submit structured tuples following clearly defined formats, with examples available in the official submission guidelines.
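To give a rough sense of what such tuples could look like, here is a minimal Python sketch. It is not the official submission format: every field name, the "disorder" label, the "influences" predicate, and the "CONCEPT:A" / "CONCEPT:B" identifiers are hypothetical placeholders, and the idea that concept-level relations can be obtained by collapsing mention-level ones is an assumption made for illustration only. The official submission guidelines remain the reference.

```python
from typing import NamedTuple


class EntityMention(NamedTuple):
    """NER-style output: a text span and its category (hypothetical fields)."""
    doc_id: str
    start: int   # character offset where the span begins
    end: int     # character offset just past the span
    text: str
    label: str


class LinkedMention(NamedTuple):
    """NERD-style output: a mention plus a concept identifier (placeholder scheme)."""
    mention: EntityMention
    concept_id: str


class MentionRelation(NamedTuple):
    """M-RE-style output: a relation between two specific mentions."""
    doc_id: str
    head: LinkedMention
    predicate: str
    tail: LinkedMention


class ConceptRelation(NamedTuple):
    """C-RE-style output: a relation between two concepts, regardless of position."""
    doc_id: str
    head_concept: str
    predicate: str
    tail_concept: str


def find_mention(doc_id, text, surface, label, start_from=0):
    """Locate `surface` in `text` (raises ValueError if absent) and wrap it as a mention."""
    start = text.index(surface, start_from)
    return EntityMention(doc_id, start, start + len(surface), surface, label)


def to_concept_level(relations):
    """Collapse mention-level relations into unique concept-level tuples (first-seen order)."""
    seen = {}
    for r in relations:
        key = ConceptRelation(r.doc_id, r.head.concept_id, r.predicate, r.tail.concept_id)
        seen.setdefault(key, None)
    return list(seen)


# Invented toy abstract; not taken from the GutBrainIE dataset.
abstract = ("Lactobacillus was reduced in patients with depression. "
            "Lactobacillus may influence depression.")

bac1 = find_mention("doc1", abstract, "Lactobacillus", "bacteria")
bac2 = find_mention("doc1", abstract, "Lactobacillus", "bacteria", bac1.end)
dep1 = find_mention("doc1", abstract, "depression", "disorder")
dep2 = find_mention("doc1", abstract, "depression", "disorder", dep1.end)

mention_relations = [
    MentionRelation("doc1", LinkedMention(bac1, "CONCEPT:A"), "influences",
                    LinkedMention(dep1, "CONCEPT:B")),
    MentionRelation("doc1", LinkedMention(bac2, "CONCEPT:A"), "influences",
                    LinkedMention(dep2, "CONCEPT:B")),
]

concept_relations = to_concept_level(mention_relations)
print(len(mention_relations), "mention-level ->", len(concept_relations), "concept-level")
```

In this toy case, two mention-level relations that connect the same pair of concepts collapse into a single concept-level tuple, which illustrates the difference in granularity between the mention-level and concept-level subtasks.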

Growing International Participation

Interest in GutBrainIE continues to expand. Last year, 17 teams worldwide took part in entity and relation extraction challenges within BioASQ. The 2026 edition significantly extends the scope by introducing:

  • A new entity linking task.
  • One of the largest domain-specific relation extraction collections.
  • Enhanced annotation efforts involving 10+ domain experts.
  • Collaboration with 70+ trained lay annotators.
  • A revised and improved dataset building upon previous editions.

Early registrants receive priority access to the training datasets, making this a valuable opportunity for research groups working on entity extraction, relation extraction, or entity disambiguation in specialized domains.

Registration for CLEF 2026 is open until April 2026!