Recurrent Neural Networks (Autonomous Laboratory)

This assignment is to be solved individually.

The autonomous laboratory is open for your exploration. Using the basic code provided in the guided laboratory, you should modify it to test different settings, and to understand the effect of the various components.

You could use any dataset you wish as long as it is a problem that involves sequential data (e.g. time series, music, trajectories, text…).

You will have to inform me of the dataset you chosed as soon as possible, no repeated datasets are allowed. Send me an email with your choice (bejar@cs.upc.edu)

At the end of this page you have a list of possible datasets and datasources. Using one of those is not a requirement, but don’t use a dataset that is too simple or that has been used on tensorflow, keras, pytorch, CNTK demos.

You can also use any framework you wish, train model you can find, or online dataset.

In the end you must upload to the raco a report around 10-12 pages long as a PDF document (references not included) with your work. Minimum font size of 11.

In general, it is better to focus in one aspect and to study it thoroughly, than to do small experiments on different topics without going any deep.

In this document there is a list of suggested questions you may consider to explore. It is not necessary to answer any or all these questions. You may choose to perform different experiments, for exploring different aspects of recurrent neural networks. Follow your own curiosity.

Possible questions

Does the type of recurrent unit cell have an impact on the results (Vanilla RNN, LSTM, GRU)
Does the use of multiple layers help?
How many neurons per layer are enough to overfit?
When overfitting, what is the result of applying various regularization techniques? Is the behaviour different for different types of recurrent units?
Does the optimization algorithm have an impact on the speed the network converges
Does the batch size have an impact on the results?
If the data has sequences of different length, does have an impact if the padding is at the begginnig or the end?
If the sequences are reversed, does it improve the result?
If the dataset is a continuous sequence and it is a regression problem, using longer windows impoves the accuracy?
If other methods have been already applied to the dataset, do recurrent networks obtain better results?

Tips

Follows a list of tips for your report

Properly introduce the data you work with
Define the problem to solve (classification or regression) and the adequate loss function
Properly split the data into train, test and validation (if needed)
Don’t use datasets for which achieving a 95% of accuracy is straight-forward
Use and show loss and/or accuracy plots in your discussion
Apply techniques (e.g., regularization, dropout…) for a reason and properly motivate it. Dont do it just for the sake of doing stuff.
Use a number of neurons per layer which is a power of 2
Don’t try to use all possible methods and techniques. Focus on certain aspects, try to understand and interpret their behavior
Do not show only positive results, failures are as important as successes
Use statistics and graphics to support your explanations
When using a plot, stare at it for a while. Try to reason what can be understood from it. Make yourself related questions and act in consequence.
To properly analyze and understand a model, you must train it to overfitting

This is a list of possible datasets and data sources: