Finding of the week

Hey, community!

In this topic, we’ll share new findings about data-centric AI regularly. These can be best practices, great scientific papers, or demos of cool applications.

Feel free to share the content with anyone you think might find it interesting. And also feel free to add your own insights on this or any other topic. Let’s build data-centric AI! :slight_smile:

We’ll start the topic with something many of you are already familiar with, but as it is of such importance, it shouldn’t be missing from any thread about breakthroughs in AI: Transformer models.

The Transformer is one of the more recent deep learning architectures. It learns the context of an input sequence through self-attention, a mechanism that weighs the relationships between the elements of the input sequence, e.g. between the words of a given sentence.
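
To make that a bit more concrete, here is a minimal sketch of scaled dot-product attention, the core operation behind self-attention, in plain PyTorch. The function and variable names as well as the toy input are purely illustrative, not production code:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    # query, key, value: tensors of shape (sequence_length, d_model)
    d_k = query.size(-1)
    # similarity score between every pair of sequence elements
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5
    # softmax turns the scores into attention weights
    weights = F.softmax(scores, dim=-1)
    # each output element is a weighted mix of all value vectors
    return weights @ value

# toy example: a "sentence" of 4 tokens, each embedded in 8 dimensions
x = torch.randn(4, 8)
contextualized = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V = x
print(contextualized.shape)  # torch.Size([4, 8])
```

In a real Transformer this operation is applied with several attention heads in parallel and stacked across many layers, but the idea stays the same: every element of the sequence gets updated based on how strongly it relates to every other element.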

They were introduced only five years ago in a paper by Google called “Attention Is All You Need”, which has already accumulated over 34,000 citations. Nowadays, transformers are used in almost every domain, from protein structure prediction to computer vision and audio classification. But they are especially useful in language-related tasks, the domain they originated in.

They are the magic tech behind translation services, question-answering bots, image-captioning services, and sentiment analysis. We use transformers to embed your text data into a machine-readable format that preserves important contextual information.
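
As an illustration of what that embedding step can look like, here is a minimal sketch using the open-source sentence-transformers library with a public model; the model name and example sentences are just for demonstration, not necessarily the exact setup we use internally:

```python
from sentence_transformers import SentenceTransformer, util

# public general-purpose embedding model; swap in any model that fits your data
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The delivery arrived two weeks late.",
    "Shipping took much longer than promised.",
    "The user interface is beautiful.",
]

# each sentence becomes a fixed-size vector that captures its meaning in context
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384) for this model

# semantically similar sentences end up close together in embedding space
print(util.cos_sim(embeddings[0], embeddings[1]))  # high similarity
print(util.cos_sim(embeddings[0], embeddings[2]))  # lower similarity
```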

Transformers power our Neural Search, Active Learners, and Zero-Shot Classifiers. To get started with transformers in practice, we recommend Hugging Face’s beginner resources. If you want to cover the theoretical background first, you can take a look at the original 2017 paper.
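
If you want to try one of these building blocks right away, the Hugging Face pipeline API makes zero-shot classification a few lines of code. This is a minimal sketch with a public model and an illustrative example, not our exact configuration:

```python
from transformers import pipeline

# public NLI model commonly used for zero-shot classification
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "My package never arrived and support is not responding.",
    candidate_labels=["shipping issue", "product quality", "billing"],
)
print(result["labels"][0])  # most likely label, e.g. "shipping issue"
```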

There is also an annotated version of that paper with modern line-by-line PyTorch implementations of the presented concepts and some additional comments.
