About Us

We are a group of volunteer researchers dedicated to promoting equal access to multimodal and multilingual AI. Our goal is to build a permissive and open stack for developing multimodal LLMs. This initiative is a collaborative effort led by OntocordAI.

The "M" in Aurora-M2 refers to our focus on multimodal, multilingual, and multidomain mixture-of-experts (MoE) models, each of which we aim to explore and develop through ongoing research.

Building on our previous success, Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code, we are training a family of models aligned with laws, regulations, and policies for controllable AI. The series will include models with 3B, 8B, and 21B parameters, aligned with the comprehensive policy framework of the EU AI Act, specifically Annex III of the Act.

As part of our commitment to openness, we plan to open-source the entire training pipeline and experimental process, including data synthesis and the evolving methodologies we employ in model training. Stay tuned!

Team

Huu Nguyen
Harsh Raj
Ken Tsui
Minh Chien Vu
Diganta Misra
Victory May
Marianna Nezhurina
Christoph Schuhmann
Robert Kaczmarczyk

Contact

Github: Aurora-M2

X: @Ontocord

Email: engage@ontocord.ai

Published on 2025-01-01, updated on 2025-04-14