Universal Calculus et al: Word2vec

Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a high-dimensional space (typically of several hundred dimensions), with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.^[1]
Word2vec was created by a team of researchers led by Tomas Mikolov at Google. The algorithm has subsequently been analysed and explained by other researchers,^[2]^[3] and a Bayesian version of the algorithm has also been proposed.^[4] Embedding vectors created using the Word2vec algorithm have many advantages compared to those produced by earlier algorithms such as Latent Semantic Analysis.
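
To make the idea of "close proximity" in the vector space concrete, here is a minimal sketch in Python. It assumes closeness is measured by cosine similarity, which is a common choice but is not stated in the text above. The three-dimensional vectors are invented for illustration and are not the output of any trained Word2vec model; the helper names cosine_similarity and nearest are likewise hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (invented values, not trained vectors).
# Real Word2vec vectors typically have several hundred dimensions.
vectors = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "banana": [0.10, 0.05, 0.90],
}


def nearest(word, vectors):
    """Return the other word whose vector is most similar to `word`."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )


if __name__ == "__main__":
    print(nearest("king", vectors))
```

With these made-up values, "king" and "queen" point in nearly the same direction while "banana" does not, so the lookup for "king" picks "queen". A trained model would supply the vectors; the lookup itself would work the same way.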

https://en.wikipedia.org/wiki/Word2vec

https://code.google.com/archive/p/word2vec/

Portal

https://github.com/dav/word2vec


Saturday, October 15, 2016
