Is your feature request related to a problem? Please describe.
Embeddings can currently only be created through the app, using Hugging Face models. There is no way to upload or integrate existing, precomputed embeddings.
Describe the solution you’d like
I want to upload embeddings I have created myself (e.g. through an upload option similar to the record upload). They should be usable in the app, and the app should offer a visualization for them (e.g. with koaning/bulk; for the link, see Additional Context).
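To make the request more concrete, here is a minimal sketch of what validating an uploaded embeddings file could look like. Everything in it is an assumption for illustration: the CSV layout (`record_id` followed by the vector values), the function name `parse_embedding_upload`, and the `expected_dim` parameter are hypothetical and not part of the app's existing API.

```python
import csv
import io
import math


def parse_embedding_upload(csv_text, expected_dim=None):
    """Parse rows of 'record_id,v1,v2,...' into {record_id: [floats]}.

    Hypothetical format: one row per record, no header line.
    If expected_dim is None, the first row fixes the dimensionality.
    """
    embeddings = {}
    dim = expected_dim
    for line_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if not row:
            continue  # skip blank lines
        record_id, *values = row
        if not values:
            raise ValueError(f"line {line_no}: no vector values found")
        vector = [float(v) for v in values]  # raises ValueError on non-numeric input
        if dim is None:
            dim = len(vector)
        if len(vector) != dim:
            raise ValueError(
                f"line {line_no}: expected {dim} values, got {len(vector)}"
            )
        if not all(math.isfinite(v) for v in vector):
            raise ValueError(f"line {line_no}: NaN or infinite value")
        if record_id in embeddings:
            raise ValueError(f"line {line_no}: duplicate record id {record_id!r}")
        embeddings[record_id] = vector
    return embeddings


# Example: a 2D upload, e.g. coordinates produced by a dimensionality reduction step
sample = "rec-1,0.12,-1.5\nrec-2,3.0,4.25\n"
points = parse_embedding_upload(sample, expected_dim=2)
```

Fixing the dimensionality per upload (for example `expected_dim=2` for a scatter plot) would sidestep the problem of vectors with varying sizes, since every row is checked against the same length before anything is stored.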
Describe alternatives you’ve considered
Requested by @GeorgePearse on Discord
One last thought before I call it a day. I know you said the variation in dimensionality is the problem with an upload-embeddings feature, but I actually only want to upload 2D ‘embeddings’, e.g. the output of UMAP, so that they can be usefully visualized in the same way that koaning/bulk and phurwicz/hover allow. This covers quite a lot of use cases. Admittedly, 2D would not be much good for ‘get similar’ with Qdrant, but it would be great for a quick summary; the two may simply be completely different features.
In this space (super quick visualization and labelling) there are a few tools, but none are set up neatly enough to actually manage a project. As for the production-grade tools (yourselves, Rubrix, and a few others), none of you seem to have this feature, so it might be a nice way to distinguish yourselves a little.
The demoability of a 2D scatter plot (with meaningful embeddings) to senior management is 10/10 when you’re trying to argue that your team should adopt a tool or, in my case, that you should use NLP at all.
https://projector.tensorflow.org/ actually had some interesting ideas: if you go to ‘custom’ on the bottom left, you can create axes of similarity to different examples. They just got some of the levels of abstraction wrong, which makes it a real pain to work with. It also doesn’t work for any meaningfully sized text.