Third Datathon - NER for Kyrgyz language | Community of AI enthusiasts

Awards

👋🎉 Hello friends!

Our two-day Datathon III on the Kyrgyz language has wrapped up! 📚🌐

🚩 The task was to build a model that extracts named entities from text in the Kyrgyz language. Named Entity Recognition (NER) identifies named entities (individual words or sequences of words) in text and classifies them into predefined categories such as PERSON, ORGANIZATION, LOCATION, and others.
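For anyone new to NER, here is a minimal sketch of what "extracting named entities" can look like in code. It assumes the common BIO tagging scheme (B- starts an entity, I- continues it, O is outside any entity); the datathon's actual label format is not described in this post, and the sample sentence and the `extract_entities` helper are made up for illustration.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) pairs.

    Assumes the BIO scheme; this is an illustrative helper,
    not the datathon's official format or tooling.
    """
    entities = []
    current_tokens, current_label = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one first.
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity currently being collected.
            current_tokens.append(token)
        else:
            # "O" tag or a stray "I-" tag: close any open entity.
            flush()
            current_tokens, current_label = [], None

    flush()
    return entities


# Hypothetical example sentence, tagged by hand.
tokens = ["Чыңгыз", "Айтматов", "Бишкекте", "жашаган"]
tags = ["B-PERSON", "I-PERSON", "B-LOCATION", "O"]
print(extract_entities(tokens, tags))
```

A model for this task predicts the `tags` list for each sentence; the helper above only shows how such tags map back to entity spans.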

Results

46 submissions were uploaded! 📈👏 All results were assessed automatically using the F1 score metric. 📊🏅
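As a rough illustration of the metric, F1 is the harmonic mean of precision and recall. The sketch below assumes entity-level scoring with exact matching, where a prediction counts only if its span and label both match the reference; the post does not say which F1 variant the leaderboard used, and the `(start, end, label)` tuples and the `f1_score` helper are hypothetical.

```python
def f1_score(gold, predicted):
    """Entity-level F1 over sets of (start, end, label) tuples.

    Assumes exact-match scoring; the datathon's actual scoring
    script may have differed.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Also covers empty inputs and avoids division by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and predicted entities for one text.
gold = {(0, 2, "PERSON"), (2, 3, "LOCATION"), (5, 6, "ORGANIZATION")}
predicted = {(0, 2, "PERSON"), (2, 3, "ORGANIZATION")}

# Precision is 1/2, recall is 1/3, so F1 is 0.4.
print(f"{f1_score(gold, predicted):.1%}")
```

Under this kind of scoring, a result like 66.5% F1 means the model balances finding most reference entities with keeping wrong predictions low.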

Datathon winners:

🥇 "Adis Davletov" - 66.5% F1.
🥈 "Ya Mashina" - 65.2% F1.
🥉 "Team 121" - 62.9% F1.

The dataset took 3 months to build: a team of experienced AI researchers worked on it together with 100 volunteers who helped with annotation. All details can be found at the link.

But that's not all!

The datathon also saw the presentation of the first 100-million-word corpus of the Kyrgyz language: tilcorpusu.org.

Thanks to everyone who took part for your energy and your contribution to the development of AI! 🙏💡

🔥 Special thanks to our partners: Compass College, for providing everything needed to hold the datathon, and High Technology Park @htp__kg, our irreplaceable partner that has supported us from the very beginning! 🤝🏢

Thank you for your support in the development of AI in Kyrgyzstan! 💖🌐