Live Blog: Hacking Gemini Embeddings

Reverse Engineering

An experimental study reproducing the vec2vec research paper by attempting to translate and align Gemini and MxbAI embedding spaces using unsupervised methods.

Can we translate text embeddings from one AI model to another without any paired data? A recent research paper claims they all share a universal geometry, meaning we can map them to each other, or even reverse-engineer them.

To test this, I set up an experiment comparing Google's Gemini embeddings with mxbai-embed-large-v1, an open-source model from Mixedbread AI. In the first round, the models had different dimensions. Interestingly, translating from the higher-dimensional Gemini space to the lower-dimensional Mixedbread space was highly accurate. But going the other way, from low to high, completely failed.

For the next attempt, I relied on Matryoshka Representation Learning, which both models support, to bring the embeddings to the same dimensionality. I trained a translation model to align the two spaces, but the mapping barely moved the needle. Document retrieval was no better than random guessing. Even when I scaled up to a much larger dataset, the translation quality remained very low.

While the theory of a universal embedding geometry is fascinating, these tests show that practical translation is incredibly difficult. The quality depends heavily on the direction of translation, the size of the dataset, and how we handle different dimensionalities.

Prompted by Darwin Santos on the 22nd of May and a few days later by Dan Hickley, I had no choice but to jump on this experiment. It’s just too fun to skip, especially now that I’m aware of the Gemini embedding model.

The objective is to reproduce the claims of this research paper, which argues that all embeddings share a common geometry in multi-dimensional space and can therefore be mapped to each other, or even reverse engineered. I’m a little skeptical at this stage but happy to give it a try.

Harnessing the Universal Geometry of Embeddings

Rishi Jha, Collin Zhang, Vitaly Shmatikov, John X. Morris

We introduce the first method for translating text embeddings from one vector space to another without any paired data, encoders, or predefined sets of matches. Our unsupervised approach translates any embedding to and from a universal latent representation (i.e., a universal semantic structure conjectured by the Platonic Representation Hypothesis). Our translations achieve high cosine similarity across model pairs with different architectures, parameter counts, and training datasets.
The ability to translate unknown embeddings into a different space while preserving their geometry has serious implications for the security of vector databases. An adversary with access only to embedding vectors can extract sensitive information about the underlying documents, sufficient for classification and attribute inference.

I’ll be live blogging as I do things so keep an eye on this post as things develop.

Testing Gemini model embedding generation. Done.

Observation: The gemini-embedding-exp-03-07 model produces 3,072-dimensional vectors.

Defining the scrape list based on post and page sitemaps.

clean_url_list Download
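
For anyone following along, here is a minimal sketch of how a scrape list could be built from sitemap XML. The post does not show the actual script, so the function names and the inline sample sitemap are hypothetical; only the sitemap namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> URLs listed in one sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def build_scrape_list(sitemaps: list[str]) -> list[str]:
    """Merge several sitemaps into one de-duplicated, order-preserving list."""
    seen: dict[str, None] = {}
    for xml_text in sitemaps:
        for url in urls_from_sitemap(xml_text):
            seen.setdefault(url, None)
    return list(seen)


# Hypothetical sample standing in for a downloaded post or page sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://dejan.ai/blog/gemini-system-prompt/</loc></url>
  <url><loc>https://dejan.ai/blog/ilo/</loc></url>
</urlset>"""

scrape_list = build_scrape_list([SAMPLE_SITEMAP, SAMPLE_SITEMAP])
```

Passing the post sitemap and the page sitemap together gives a single list with duplicates dropped, which is roughly what a clean URL list needs to be.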

Scraping the site:

content.csv Download

Generating Gemini embeddings (API calls):

gemin.csv Download

Generating mxbai-embed-large-v1 embeddings (locally):

mxbai.csv Download

Several AI agents are on the task:

Manus

Codex

Jules

Vec2Vec Reproducibility Study Results

Experiment Setup

MxbAI Embedding Dimension: 1024
Gemini Embedding Dimension: 3072
Number of Samples: 39

Translation Results

MxbAI -> Gemini

Mean Cosine Similarity: 0.8103
Top-1 Accuracy: 0.0256
Mean Rank: 13.1795

Gemini -> MxbAI

Mean Cosine Similarity: 0.8573
Top-1 Accuracy: 0.9744
Mean Rank: 1.0513
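
The three metrics above can be computed along these lines. This is a sketch, not the agent's actual evaluation code: it assumes row i of the translated matrix and row i of the target matrix belong to the same document, and it ranks by cosine similarity.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def translation_metrics(translated: np.ndarray, target: np.ndarray) -> dict:
    """Mean cosine similarity, top-1 accuracy and mean rank for paired rows."""
    t = l2_normalize(translated)
    g = l2_normalize(target)
    sims = t @ g.T                  # sims[i, j] = cos(translated_i, target_j)
    correct = np.diag(sims)         # similarity to the true counterpart
    # Rank of the true counterpart: 1 + number of targets scoring strictly higher.
    ranks = 1 + (sims > correct[:, None]).sum(axis=1)
    return {
        "mean_cos_sim": float(correct.mean()),
        "top1_acc": float((ranks == 1).mean()),
        "mean_rank": float(ranks.mean()),
    }


# Sanity check on placeholder data: a perfect translation ranks every document first.
rng = np.random.default_rng(0)
fake_targets = rng.normal(size=(39, 1024))
perfect = translation_metrics(fake_targets, fake_targets)
```

With 39 samples, a top-1 accuracy of 0.9744 corresponds to 38 of 39 documents matched correctly, and 0.0256 to 1 of 39.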

Comparison with Paper Results

The original vec2vec paper reported the following metrics for different model pairs:

Mean Cosine Similarity: 0.92 to 0.98 (depending on model pair)
Perfect matching on over 8000 shuffled embeddings

Our cosine similarities (0.81 to 0.86) fall below the 0.92 to 0.98 range reported in the paper, so the results show only moderate alignment with its findings.

Additional Analysis and Visualizations

Embedding Space Visualizations

To better understand the structure of each embedding space, we’ve created PCA visualizations that project the high-dimensional embeddings into 2D space:

embedding_spaces_visualization.png: Shows the distribution of embeddings in each space
embedding_spaces_with_labels.png: Includes URL labels for a subset of points to identify specific content
joint_embedding_space.png: Visualizes both embedding spaces in a common reduced space
joint_embedding_space_labeled.png: Includes labels in the joint space visualization
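
The plots themselves are not reproduced here, but the projection step behind a PCA visualization can be sketched as follows. This assumes plain PCA via SVD on each space separately; the random arrays are placeholders, not the real embeddings, and how the agent built the joint space (which needs a common dimensionality first) is not stated.

```python
import numpy as np


def pca_2d(x: np.ndarray) -> np.ndarray:
    """Project the rows of x onto their first two principal components."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
mxbai_placeholder = rng.normal(size=(39, 1024))    # stand-in for MxbAI vectors
gemini_placeholder = rng.normal(size=(39, 3072))   # stand-in for Gemini vectors

mxbai_2d = pca_2d(mxbai_placeholder)
gemini_2d = pca_2d(gemini_placeholder)
```

Each result is a 39 x 2 array that can be handed to any scatter plot routine.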

Similarity Analysis

We’ve also analyzed the similarity relationships within and between embedding spaces:

similarity_heatmaps.png: Shows the cosine similarity matrices for each embedding space
cross_similarity_heatmap.png: Shows the cross-space similarities between dimensionality-reduced MxbAI and Gemini embeddings

Dimension Analysis

The significant difference in embedding dimensions (MxbAI: 1024 vs Gemini: 3072) suggests that:

Gemini embeddings may capture more fine-grained semantic information due to higher dimensionality
The translation task from lower to higher dimensions (MxbAI → Gemini) is inherently more challenging than the reverse
The higher Top-1 Accuracy in the Gemini → MxbAI direction (0.9744) compared to MxbAI → Gemini direction (0.0256) strongly supports this hypothesis

Asymmetric Translation Performance

The stark difference in translation performance between directions is particularly noteworthy:

Gemini → MxbAI: Excellent performance (Top-1 Accuracy: 0.9744, Mean Rank: 1.0513)
MxbAI → Gemini: Poor performance (Top-1 Accuracy: 0.0256, Mean Rank: 13.1795)

This asymmetry suggests that:

Information compression (from 3072D to 1024D) is easier than information expansion (from 1024D to 3072D)
The higher-dimensional Gemini space may contain redundant information that can be effectively compressed
The lower-dimensional MxbAI space may lack information needed to accurately reconstruct the higher-dimensional Gemini space
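
A toy illustration of the compression-versus-expansion argument. It rests on a strong assumption that the real models need not satisfy: the low-dimensional space is a linear summary of the high-dimensional one. It uses small stand-in sizes and a least-squares linear map rather than the translation network from the study, so it shows the shape of the argument and does not explain the measured asymmetry.

```python
import numpy as np

rng = np.random.default_rng(0)
n, high_dim, low_dim = 500, 24, 8   # toy sizes standing in for 3072 and 1024

high = rng.normal(size=(n, high_dim))                 # "rich" space
# Toy assumption: the low-dimensional space is a linear summary of the rich one.
low = high @ rng.normal(size=(high_dim, low_dim))


def relative_error(src: np.ndarray, dst: np.ndarray) -> float:
    """Fit the best linear map src -> dst and report its relative residual."""
    w, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return float(np.linalg.norm(src @ w - dst) / np.linalg.norm(dst))


err_compress = relative_error(high, low)   # high -> low: recoverable
err_expand = relative_error(low, high)     # low -> high: information is missing
```

Under this assumption the compressing direction is fit almost exactly, while the expanding direction leaves a large residual, because a linear map out of an 8-dimensional space can only reach an 8-dimensional subspace of the 24-dimensional one.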

Implications for vec2vec

The vec2vec paper demonstrated that embedding spaces from different models can be aligned through learned translation functions. Our results suggest that some alignment is achievable, at least in one direction, even when:

Working with embedding spaces of significantly different dimensions
Using a small dataset (39 samples vs. thousands in the original paper)
Comparing proprietary (Gemini) and open-source (MxbAI) embedding models

However, our results also highlight an important limitation: the translation quality is highly dependent on the direction of translation when embedding spaces have significantly different dimensionalities.

Limitations and Future Work

Sample Size: Our study used only 39 samples, which is significantly smaller than the thousands used in the original paper. A larger dataset would provide more robust results.
Content Domain: All samples came from a single website (dejan.ai), limiting the diversity of content. Future work should include a broader range of content types and domains.
Model Selection: We compared only two models (MxbAI and Gemini). Testing with additional models would provide a more comprehensive understanding of embedding space relationships.
Hyperparameter Tuning: We used default hyperparameters for the translation network. Optimizing these could potentially improve translation quality.
Dimensionality Handling: Future work should explore more sophisticated methods for handling the translation between embedding spaces of different dimensionalities, such as:

Using autoencoders to learn more effective dimensionality reduction
Exploring non-linear translation functions
Investigating the impact of different dimensionality reduction techniques on translation quality

Take 2 – Equal Dimensionality

Both mixedbread-ai/mxbai-embed-large-v1 and gemini-embedding-exp-03-07 support MRL (Matryoshka Representation Learning) dimensionality reduction, so the feature extraction was adjusted and both models now produce 1,024-dimensional embeddings.
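
For context, MRL-trained embeddings are usually shortened by keeping the leading coordinates and normalizing again. The sketch below shows that step on placeholder data; how the reduction was actually performed for this post is not stated, and the function name is hypothetical.

```python
import numpy as np


def truncate_mrl(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates of each row, then L2-normalize again."""
    if dim > embeddings.shape[1]:
        raise ValueError("dim exceeds the embedding width")
    cut = embeddings[:, :dim]
    return cut / np.linalg.norm(cut, axis=1, keepdims=True)


rng = np.random.default_rng(0)
gemini_full = rng.normal(size=(39, 3072))   # placeholder for 3,072-d vectors
gemini_1024 = truncate_mrl(gemini_full, 1024)
```

Re-normalizing matters because a truncated unit vector is no longer unit length, and cosine-based code often assumes it is.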

gemini Download

mxbai Download

Training

This script implements Vec2Vec, an unsupervised embedding translation model inspired by the paper “Harnessing the Universal Geometry of Embeddings”. It learns to map embeddings from two different vector spaces (e.g., Gemini and MxbAI) into a shared latent space using deep residual networks, without any labeled alignment. The architecture includes input/output adapters, a shared backbone, and adversarial discriminators to align both original and latent distributions. Training optimizes reconstruction, cycle-consistency, vector space preservation, and GAN losses. The trainer includes evaluation utilities and checkpointing, making the framework modular and extensible for cross-domain embedding alignment.
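
To make the loss names in the log easier to read, here is a simplified sketch of the non-adversarial terms. The exact forms and weights used by the script are not shown in this post, so the mean-squared-error formulation, the function names and the toy rotation "translator" below are all assumptions; the GAN terms are left out entirely.

```python
import numpy as np


def round_trip_mse(x, forward, backward) -> float:
    """Error after mapping x forward and back again.

    Reconstruction loss: forward = into the latent space, backward = back out
    to the same embedding space.
    Cycle-consistency loss: forward = translate to the other model's space,
    backward = translate back.
    """
    return float(np.mean((backward(forward(x)) - x) ** 2))


def vsp_loss(x: np.ndarray, translated: np.ndarray) -> float:
    """Vector space preservation: pairwise dot products should survive translation."""
    return float(np.mean((x @ x.T - translated @ translated.T) ** 2))


# Toy translator: a random rotation, which preserves geometry exactly.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
q, _ = np.linalg.qr(rng.normal(size=(8, 8)))   # orthogonal matrix


def to_other(v: np.ndarray) -> np.ndarray:
    return v @ q


def back(v: np.ndarray) -> np.ndarray:
    return v @ q.T


cc = round_trip_mse(x, to_other, back)
vsp = vsp_loss(x, to_other(x))
```

For a perfect rotation both values are numerically zero. A large vsp_loss relative to the other terms, as in the log below, would indicate that pairwise geometry is not being preserved by the learned translation.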

Epoch 100: 100%|█████████████████████| 1/1 [00:00<00:00, 17.18it/s, g_loss=13103.4541, rec_loss=3.8386, cc_loss=3.6557]
INFO:__main__:Epoch 100 - d_loss: 0.0816 - g_loss: 13103.4541 - g_loss_adv: 18.3283 - rec_loss: 3.8386 - cc_loss: 3.6557 - vsp_loss: 13010.1836
INFO:__main__:Evaluation - cos_sim_1to2: 0.0039 - cos_sim_2to1: 0.0020 - top1_1to2: 0.0513 - top1_2to1: 0.0513 - rank_1to2: 16.7179 - rank_2to1: 21.3077
INFO:__main__:Saved checkpoint at epoch 100
INFO:__main__:Training completed!
INFO:__main__:Final evaluation...
INFO:__main__:Final metrics - cos_sim_1to2: -0.0058 - cos_sim_2to1: 0.0058 - top1_1to2: 0.0513 - top1_2to1: 0.0513 - rank_1to2: 22.4103 - rank_2to1: 18.5128

Download the trained model here.

Vec2Vec GEMINI | Mixed Bread

PS C:\projects\gemini\analysis> python vec2vec_quickstart.py --compare
INFO:vec2vec_implementation:Loaded 39 embeddings of dimension 1024 from gemini.csv
INFO:vec2vec_implementation:Loaded 39 embeddings of dimension 1024 from mxbai.csv

Comparing Embedding Spaces

Cosine similarity between same documents in different spaces:
Mean: -0.0068
Std: 0.0213
Min: -0.0535
Max: 0.0465

Translation Quality Metrics:

mean_cos_sim_1to2 .......... -0.0006
mean_cos_sim_2to1 .......... 0.0049
std_cos_sim_1to2 ........... 0.0270
std_cos_sim_2to1 ........... 0.0299
top1_acc_1to2 .............. 0.0000
top1_acc_2to1 .............. 0.0000
top5_acc_1to2 .............. 0.1795
top5_acc_2to1 .............. 0.1795
top10_acc_1to2 ............. 0.3077
top10_acc_2to1 ............. 0.2821
mean_rank_1to2 ............. 18.4103
mean_rank_2to1 ............. 18.4615
cycle_error_1 .............. 1.5673
cycle_error_2 .............. 2.0849
INFO:vec2vec_evaluation:Computing latent alignment…

Mean cosine similarity:
Input space: -0.0068 ± 0.0213
Latent space: 0.0346 ± 0.0455
INFO:vec2vec_evaluation:Visualizing latent space…
INFO:vec2vec_evaluation:Plotting similarity heatmaps…
INFO:vec2vec_evaluation:Saving translated embeddings…
INFO:vec2vec_evaluation:Saved translated embeddings to translated_embeddings
INFO:vec2vec_evaluation:
Demonstration: Finding similar documents across spaces

Gemini document 0 (https://dejan.ai/blog/gemini-system-prompt/):
Top 5 similar MxbAI documents after translation:

https://dejan.ai/blog/the-next-chapter-of-search-get-ready-to-influence-the-robots/ (similarity: 0.0451)
https://dejan.ai/blog/ilo/ (similarity: 0.0319)
https://dejan.ai/blog/alexnet-the-deep-learning-breakthrough-that-reshaped-googles-ai-strategy/ (similarity: 0.0301)
https://dejan.ai/blog/generate-then-ground/ (similarity: 0.0288)
https://dejan.ai/blog/hacking-gemini/ (similarity: 0.0211)

Gemini document 1 (https://dejan.ai/blog/how-gemini-selects-results/):
Top 5 similar MxbAI documents after translation:

https://dejan.ai/blog/ai-content-detection/ (similarity: 0.0642)
https://dejan.ai/blog/probability-threshold-for-top-p-nucleus-sampling/ (similarity: 0.0207)
https://dejan.ai/blog/search-query-quality-classifier/ (similarity: 0.0188)
https://dejan.ai/blog/advanced-interpretability-techniques-for-tracing-llm-activations/ (similarity: 0.0097)
https://dejan.ai/blog/chrome-ai-models/ (similarity: 0.0084)

Gemini document 2 (https://dejan.ai/blog/search-query-quality-classifier/):
Top 5 similar MxbAI documents after translation:

https://dejan.ai/blog/gemini-grounding/ (similarity: 0.0857)
https://dejan.ai/blog/googles-new-url-context-tool/ (similarity: 0.0842)
https://dejan.ai/blog/gemini-system-prompt/ (similarity: 0.0549)
https://dejan.ai/blog/ilo/ (similarity: 0.0380)
https://dejan.ai/blog/content-substance-classification/ (similarity: 0.0374)

Gemini document 3 (https://dejan.ai/blog/query-intent-via-retrieval-augmentation-and-model-distillation/):
Top 5 similar MxbAI documents after translation:

https://dejan.ai/blog/resource-efficient-binary-vector-embeddings-with-matryoshka-representation-learning/ (similarity: 0.0811)
https://dejan.ai/blog/why-deep-learning-works/ (similarity: 0.0700)
https://dejan.ai/blog/introducing-veczip-embedding-compression-algorithm/ (similarity: 0.0559)
https://dejan.ai/blog/live-blog-hacking-gemini-embeddings/ (similarity: 0.0449)
https://dejan.ai/blog/chromes-new-embedding-model/ (similarity: 0.0429)

Gemini document 4 (https://dejan.ai/blog/resource-efficient-binary-vector-embeddings-with-matryoshka-representation-learning/):
Top 5 similar MxbAI documents after translation:

https://dejan.ai/blog/search-query-quality-classifier/ (similarity: 0.0269)
https://dejan.ai/blog/ai-content-detection/ (similarity: 0.0246)
https://dejan.ai/blog/temperature-parameter-for-controlling-ai-randomness/ (similarity: 0.0184)
https://dejan.ai/blog/llm-search-volume/ (similarity: 0.0110)
https://dejan.ai/blog/gemini-system-prompt/ (similarity: 0.0017)

Assessment

The pipeline ran end-to-end, but the learned mapping barely moved the needle:

Pipeline:
- --compare: mean cos sim across spaces = -0.0068 ± 0.0213
- After training: latent-space mean cos sim = 0.0346 ± 0.0455 (up from -0.0068)
Retrieval (39 docs):
- top-1 acc = 0% (worse than random ≈2.6%)
- top-5 acc ≈ 18%
- top-10 acc ≈ 30%
- mean rank ≈ 18
Code works and logs all metrics.
Mapping yields only a small positive shift (mean Δ≈+0.04) and retrieval remains at chance.
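
To put "at chance" in numbers: if the correct document were ranked uniformly at random among N candidates, top-k accuracy would be k/N and the expected rank (N+1)/2. A small worked example for the 39-document set (the helper names are just for illustration):

```python
def chance_top_k(n_docs: int, k: int) -> float:
    """Probability that a uniformly random ranking puts the match in the top k."""
    return min(k, n_docs) / n_docs


def chance_mean_rank(n_docs: int) -> float:
    """Expected rank of the match under a uniformly random ranking."""
    return (n_docs + 1) / 2


N = 39
baseline = {f"top{k}": round(chance_top_k(N, k), 4) for k in (1, 5, 10)}
baseline["mean_rank"] = chance_mean_rank(N)
# baseline == {"top1": 0.0256, "top5": 0.1282, "top10": 0.2564, "mean_rank": 20.0}
```

The observed top-5 (≈18%), top-10 (≈30%) and mean rank (≈18) sit only slightly on the better side of these baselines, which on 39 documents amounts to a difference of a couple of documents.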

Take 3: Large Dataset

In progress…

Translation Quality Metrics:

mean_cos_sim_1to2 .......... 0.1613
mean_cos_sim_2to1 .......... 0.0324
std_cos_sim_1to2 ........... 0.0307
std_cos_sim_2to1 ........... 0.0230
top1_acc_1to2 .............. 0.0200
top1_acc_2to1 .............. 0.0100
top5_acc_1to2 .............. 0.0900
top5_acc_2to1 .............. 0.0400
top10_acc_1to2 ............. 0.1500
top10_acc_2to1 ............. 0.0800
mean_rank_1to2 ............. 47.1500
mean_rank_2to1 ............. 48.3100
cycle_error_1 .............. 0.1456
cycle_error_2 .............. 0.2661
INFO:vec2vec_evaluation:Computing latent alignment…

Mean cosine similarity:
Input space: 0.0031 ± 0.0313
Latent space: 0.1729 ± 0.2319

Gemini document 0 (https://www.engadget.com/products/sony/bravia/kdl-46hx800/):
Top 5 similar MxbAI documents after translation:

https://www.engadget.com/2014-05-13-bad-app-reviews-plague-inc.html (similarity: 0.2234)
https://www.engadget.com/2012-07-25-daily-iphone-app-party-wave-is-final-fantasy-creators-first-io.html (similarity: 0.2219)
https://www.engadget.com/2014-11-14-wildcard-uses-beautiful-card-interface-for-news-and-shopping.html (similarity: 0.2212)
https://www.engadget.com/2012-12-20-shizen-oceanscapes-for-ios-is-relaxing-and-free.html (similarity: 0.2163)
https://www.engadget.com/2009-04-16-a-really-bad-approach-to-reversi-on-the-iphone.html (similarity: 0.2154)

Gemini document 1 (https://www.engadget.com/2010-07-13-book-review-you-are-not-a-gadget.html):
Top 5 similar MxbAI documents after translation:

https://www.engadget.com/products/desktops/ (similarity: 0.1920)
https://www.engadget.com/2012-05-23-mint-adds-split-transactions-and.html (similarity: 0.1873)
https://www.engadget.com/2013-02-18-daily-ipad-app-versu-lets-you-play-the-role-of-a-character-in-a.html (similarity: 0.1846)
https://www.engadget.com/2010-03-12-quizarium-the-multiplayer-trivia-app-is-nearly-ready-for-prime-t.html (similarity: 0.1844)
https://www.engadget.com/2009-04-16-a-really-bad-approach-to-reversi-on-the-iphone.html (similarity: 0.1824)

Gemini document 2 (https://www.engadget.com/products/garmin/nuvi/1250/):
Top 5 similar MxbAI documents after translation:

https://www.engadget.com/2014-11-14-wildcard-uses-beautiful-card-interface-for-news-and-shopping.html (similarity: 0.2489)
https://www.engadget.com/2010-04-04-ipad-apps-games-that-stand-out.html (similarity: 0.2266)
https://www.engadget.com/2009-08-12-wolfenstein-rpg-out-now-on-iphone-and-ipod-touch.html (similarity: 0.2251)
https://www.engadget.com/products/razer/blade-17/ (similarity: 0.2247)
https://www.engadget.com/products/razer/nabu/ (similarity: 0.2247)

Gemini document 3 (https://www.engadget.com/products/nikon/coolpix/s3100/):
Top 5 similar MxbAI documents after translation:

https://www.engadget.com/2014-11-14-wildcard-uses-beautiful-card-interface-for-news-and-shopping.html (similarity: 0.2335)
https://www.engadget.com/2012-02-14-happy-owl-studios-beautiful-apple-accessories.html (similarity: 0.2318)
https://www.engadget.com/2010-11-24-multitasking-on-your-ipad-a-quick-guide.html (similarity: 0.2311)
https://www.engadget.com/2013-06-06-vesper-simply-collects-and-organizes-your-thoughts.html (similarity: 0.2294)
https://www.engadget.com/2015-01-27-alfred-remote-is-here-and-its-interesting.html (similarity: 0.2273)

Gemini document 4 (https://www.engadget.com/sony-a-7-c-review-smart-small-clumsy-153031933.html):
Top 5 similar MxbAI documents after translation:

https://www.engadget.com/2009-08-12-wolfenstein-rpg-out-now-on-iphone-and-ipod-touch.html (similarity: 0.2598)
https://www.engadget.com/2010-11-24-multitasking-on-your-ipad-a-quick-guide.html (similarity: 0.2547)
https://www.engadget.com/2014-08-18-fall-under-the-spell-of-spellfall.html (similarity: 0.2532)
https://www.engadget.com/2014-11-14-wildcard-uses-beautiful-card-interface-for-news-and-shopping.html (similarity: 0.2497)
https://www.engadget.com/2012-07-25-daily-iphone-app-party-wave-is-final-fantasy-creators-first-io.html (similarity: 0.2472)

Dan Petrovic · May 24, 11:39