AI-ready data

Pipeline for processing Greek texts and converting them into ready-to-use datasets for Large Language Models.

Dataset Timeline

Tracking the evolution of our datasets and total token ingestion

All datasets are released under Creative Commons licenses

APR 2026

eellak-articles

25.10MB

8.5M Tokens

APR 2026

opengov-deliberations-v2

357.71MB

111.4M Tokens

APR 2026

e-nautilia

4.61MB

2.7M Tokens

APR 2026

artos-zois

12.20MB

4.0M Tokens

APR 2026

amna-press

1.48GB

158.2M Tokens

APR 2026

ert-press

36.4MB

9.8M Tokens

MAR 2026

modern-greek-dictionary

33MB

4.9M Tokens

MAR 2026

istorima

416.02MB

138.9M Tokens

JAN 2026

openbook.gr

251.63MB

133M Tokens

JAN 2026

Greek PhD Theses Corpus

7.06GB

5.34B Tokens

JUN 2025

eurlex-greek-legislation

2.21GB

604M Tokens

APR 2025

ellinika_dedomena_europaikou_koinovouliou

1.09GB

273M Tokens

APR 2025

Apothetirio_Kallipos

572MB

196M Tokens

MAR 2025

Apothetirio_Pergamos

2.25GB

839M Tokens

JAN 2025

1000_prwta_xronia_ellhnikhs

104MB

33M Tokens

JAN 2025

Ekklisiastika_Keimena

16.7MB

6.5M Tokens

DEC 2024

Wikisource_Greek_texts

116.3MB

38M Tokens

DEC 2024

klasikh_arx_ell_grammateia

63.8MB

20.4M Tokens

DEC 2024

Sxolika_vivlia

31.0MB

10.1M Tokens

NOV 2024

Ellinika_Keimena_Project_Gutenberg

38.9MB

12.3M Tokens

NOV 2024

95k_deigma_ellinikis

28.3MB

2.94M Tokens

NOV 2024

dimodis_logotexnia

384KB

0.1M Tokens

Growth Chart

Cumulative Token Volume

7,952,178,676

TOTAL TOKENS

We've got an entire team dedicated to this project

Want to collaborate or get involved? We love partnerships and new contributors.

Get in touch

Prof. Petros Stefaneas

Scientific Advisor

Petros is the scientific lead of GlossAPI, guiding the development of principled and reliable training material for NLP systems. His leadership ensures that GlossAPI not only processes Greek text with technical precision but also upholds clarity, trustworthiness, and ethical integrity.

Foivos Karounos

Software Engineer

Foivos Karounos studied Computer Science and Psychology and is interested in the development of the technological ecosystem in Greece. He has held roles in business strategy, cryptocurrency performance forecasting, and epistemology research. He is the GlossAPI team's lead Software Engineer (formerly Chief Vibe Coder).

Myrsini Ioannou

Software Engineer

Myrsini Ioannou studied Applied Mathematics and Physical Sciences at NTUA and holds a Master’s in Sound and Music Computing. She joined the GlossAPI team in March 2025 as a Software Engineer, focusing on NLP technologies.

Nikos Tsekos

Software Engineer

Nikos Tsekos is an undergraduate Computer Engineering student and Software Engineer focused on machine learning applications. He currently works with GFOSS (Open Technologies Alliance) on the GlossAPI team, contributing to data pipelines and applied ML workflows.

Dimitris Athanasopoulos

Software Engineer

Dimitrios Athanasopoulos is an undergraduate student of Computer and Informatics Engineering. He joined the GlossAPI team through Google Summer of Code 2025, where he worked on extending the pipeline and extracting new data, and he continues to contribute to the project. He also works in web development and helps develop and maintain this website.

Katerina Spanou

Data Engineer

Katerina Spanou studied Digital Systems at the University of Piraeus. She completed her postgraduate studies at the National and Kapodistrian University of Athens in the program “Digital Media Communication and Interaction Environments,” specializing in Natural Language Processing (NLP) and the training of machine learning models. She has worked as a data analyst and data engineer, focusing on the design and implementation of solutions for the processing, analysis, and correlation of linguistic and quantitative data. She joined the GlossAPI team in February 2026.

Ioanna Moura

Linguist

Ioanna Moura is a linguist and a trainee interpreter in Greek Sign Language (GSL). She completed her undergraduate studies in Greek Philology and her postgraduate studies in Language Technology at the National and Kapodistrian University of Athens.

Dimitrios Vogiatzis

Software Engineer