Wikipedia Equation Embeddings


12 Jan 2019

Summary


The Wikipedia Equation Embeddings dataset comprises two datasets, each containing word2vec embeddings generated from LaTeX equations extracted from Wikipedia articles (one set from statistics articles, one from math articles), along with metadata for the articles themselves. See the README.md in each of the math and statistics subfolders for details on how the data was generated and how to use it.
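Word2vec embeddings are usually compared by cosine similarity, so a first look at the data might resemble the minimal sketch below. The helper names (`cosine_similarity`, `most_similar`) and the toy vectors are hypothetical and not part of the dataset; the actual file names and formats are described in each subfolder's README.md.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings):
    """Return the key (other than `query`) whose vector is closest to query's."""
    target = embeddings[query]
    candidates = (key for key in embeddings if key != query)
    return max(candidates, key=lambda key: cosine_similarity(target, embeddings[key]))


# Made-up 3-dimensional vectors for illustration only; these are NOT values
# from the dataset, and the keys are placeholder names.
toy = {
    "token_a": [1.0, 0.0, 0.0],
    "token_b": [0.9, 0.1, 0.0],
    "token_c": [0.0, 1.0, 0.0],
}

print(most_similar("token_a", toy))
```

With real data, the `toy` dictionary would be replaced by the vectors loaded from one of the two datasets.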

Download

Both datasets are provided via the GitHub repository. Clone the repository, or download the 0.0.1 release as a zip or tar.gz archive:


git clone https://www.github.com/vsoch/wikipedia-equations
wget https://github.com/vsoch/wikipedia-equations/archive/0.0.1.zip
wget https://github.com/vsoch/wikipedia-equations/archive/0.0.1.tar.gz
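Once the repository is cloned, the per-dataset README.md files are a good starting point. The sketch below locates them, assuming the clone lives in a local `wikipedia-equations` directory and that the subfolders are literally named `math` and `statistics`; both the function name `find_readmes` and those folder names are assumptions to check against the actual repository layout.

```python
from pathlib import Path


def find_readmes(repo_root, subfolders=("math", "statistics")):
    """Return the README.md paths that exist under the given subfolders.

    The default subfolder names are an assumption about the repository
    layout; pass different names if the layout differs.
    """
    root = Path(repo_root)
    found = []
    for name in subfolders:
        readme = root / name / "README.md"
        if readme.is_file():
            found.append(readme)
    return found


if __name__ == "__main__":
    # "wikipedia-equations" is the directory `git clone` creates by default.
    for path in find_readmes("wikipedia-equations"):
        print(path)
```

If the function returns an empty list, the clone is probably in a different location or the subfolders use different names.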

Questions

Here are some interesting questions these datasets might help answer:

Other questions?

If you have other questions or want help with your project, please don’t hesitate to open an issue. If you use either of the datasets in your work, please remember to include the DOI.