This page contains the guided laboratory on the Embedding Spaces topic for the Deep Learning course of the Master in Artificial Intelligence at the Universitat Politècnica de Catalunya.

The slides used in the class are the following:




WARNING: What follows is content from old versions of this course. It is kept here because it contains interesting links.

The code introduced in this guided lab can be found here.

Some necessary resources are stored on GPFS and must be copied to your local .keras/models folder. They are located at: /gpfs/scratch/bsc28/hpai/storage/data/dl-labs/
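The copy step can be scripted. The following is a minimal sketch, assuming the resources are regular files sitting directly inside the GPFS folder (not in nested subfolders) and that Keras uses its default cache folder in your home directory. The helper name `copy_resources` is hypothetical and only used for illustration.

```python
import shutil
from pathlib import Path

# Location given on this page; only reachable from inside the cluster.
SOURCE_DIR = Path("/gpfs/scratch/bsc28/hpai/storage/data/dl-labs/")
# Assumed default Keras cache folder in the user's home directory.
TARGET_DIR = Path.home() / ".keras" / "models"


def copy_resources(source_dir, target_dir):
    """Copy every regular file in source_dir into target_dir.

    Subfolders are skipped (assumption: the resources are flat files).
    Returns the sorted list of copied file names.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(source_dir.iterdir()):
        if item.is_file():
            shutil.copy2(item, target_dir / item.name)
            copied.append(item.name)
    return copied

# Example call, to be run on the cluster:
# copy_resources(SOURCE_DIR, TARGET_DIR)
```

If the resources turn out to be organised in subfolders, `shutil.copytree` would be the more appropriate call.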

Other sources for experimentation

Beyond the code explained in class, there are other online resources of interest that may be used for experimentation.

Word2vec

The original code of word2vec, as released by its authors, can be found here.

Kaggle word2vec tutorial

Bag of Words Meets Bags of Popcorn is a tutorial for understanding and operating with the word2vec model.

McCormick inspect word2vec

This repository uses gensim in Python to load a pre-trained word2vec model and inspect some details of its vocabulary.
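To give an idea of what inspecting an embedding vocabulary involves, here is a small NumPy sketch of a nearest-neighbour query by cosine similarity. It is not the code of the repository above: with gensim, one would typically load the model with `KeyedVectors.load_word2vec_format` and query it with `most_similar`. The vocabulary and the 3-dimensional vectors below are invented for illustration only.

```python
import numpy as np


def most_similar(word, vocab, vectors, topn=3):
    """Return the topn (word, cosine similarity) pairs closest to `word`."""
    index = {w: i for i, w in enumerate(vocab)}
    # Normalise rows so that a dot product equals the cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit[index[word]]
    # Sort by decreasing similarity and drop the query word itself.
    order = [i for i in np.argsort(-sims) if i != index[word]]
    return [(vocab[i], float(sims[i])) for i in order[:topn]]


# Toy embeddings (hypothetical values, not taken from any trained model).
vocab = ["king", "queen", "apple", "pear"]
vectors = np.array([
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.1],
    [0.1, 0.2, 0.9],
    [0.2, 0.1, 0.9],
])

neighbours = most_similar("king", vocab, vectors, topn=2)
print(neighbours)
```

With these toy vectors, "queen" ranks first for the query "king", since the two vectors point in almost the same direction.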

word2vec tutorial in TensorFlow

A word2vec tutorial using TensorFlow: Vector Representations of Words.

Word2vec in Keras

Using Gensim Word2Vec Embeddings in Keras: a short post and script, with example code, on using Gensim Word2Vec embeddings in Keras.

Using pre-trained word embeddings in a Keras model (official).

CBOW implementation in Keras without dependencies

StackOverflow details
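A step that resources like the ones listed above share is turning a set of pre-trained word vectors into a matrix whose rows follow the model's word indices. The sketch below illustrates that step with NumPy only. The function name `build_embedding_matrix`, the toy vectors and the word index are hypothetical, and reserving row 0 for padding is an assumption about how the text was tokenised.

```python
import numpy as np


def build_embedding_matrix(word_index, word_vectors, dim):
    """Build a (vocabulary size + 1, dim) matrix of pre-trained vectors.

    Row i holds the vector of the word whose index is i. Row 0 is left
    for padding, and words without a pre-trained vector stay at zero.
    """
    matrix = np.zeros((len(word_index) + 1, dim))
    for word, i in word_index.items():
        vector = word_vectors.get(word)
        if vector is not None:
            matrix[i] = vector
    return matrix


# Toy pre-trained vectors and word index (invented for illustration).
word_vectors = {"cat": np.array([0.1, 0.2]), "dog": np.array([0.3, 0.4])}
word_index = {"cat": 1, "dog": 2, "axolotl": 3}

embedding_matrix = build_embedding_matrix(word_index, word_vectors, dim=2)
print(embedding_matrix)
```

In Keras, a matrix like this is usually handed to an `Embedding` layer as its initial weights, for example through a constant initializer, with `trainable=False` if the vectors should stay frozen during training.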

Gensim

Official gensim tutorials

Multimodal embeddings

GitHub of Jamie Ryan Kiros:

visual-semantic-embedding: Implementation of the image-sentence embedding method described in “Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models”.

multimodal-neural-language-models: A bare-bones NumPy implementation of “Multimodal Neural Language Models” (Kiros et al., ICML 2014).

skip-thoughts: Sent2Vec encoder and training code from the paper “Skip-Thought Vectors”.

neural-storyteller: A recurrent neural network for generating little stories about images.

Pinterest Multimodal Dataset ToolBox: a toolbox to download and manage the released part of the Pinterest40M multimodal dataset introduced in the paper “Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images”. More information can be found on the Project Page.