On this page you can find all the projects available for Friday!
Project 1 - Social Network Analysis
This project is about using Python to visualise and analyse network data. In this case we will focus on social network data (e.g. Twitter, Facebook, etc.). The idea is to build networks that show how people in multidisciplinary fields are connected on social networks (i.e. who follows whom). This will allow us to identify the key individuals who connect the independent groups that make up a multidisciplinary subject (e.g. the biologists, geologists, marine experts, policy makers, etc. who are involved with climate change).
- Load data from static files/an API
- Build and visualise a network
- Perform simple analysis on this network
Publicly available social network data from Twitter/Facebook/etc.
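As a rough illustration of the "who follows whom" idea, here is a minimal sketch using only the standard library. The account names, the field labels and the `bridging_score` helper are all made up for this example, and the score itself is a crude stand-in for a proper network measure, not a recommended method.

```python
from collections import defaultdict

# Hypothetical follower edges: (follower, followed).
follows = [
    ("alice", "bob"), ("bob", "alice"),
    ("carol", "dave"), ("dave", "carol"),
    ("erin", "alice"), ("erin", "carol"),
    ("bob", "erin"), ("dave", "erin"),
]

# Hypothetical field of each account.
field = {
    "alice": "biology", "bob": "biology",
    "carol": "geology", "dave": "geology",
    "erin": "policy",
}

# Treat a follow in either direction as a connection.
neighbours = defaultdict(set)
for follower, followed in follows:
    neighbours[follower].add(followed)
    neighbours[followed].add(follower)


def bridging_score(account):
    """Count the distinct fields, other than the account's own, among its neighbours."""
    return len({field[n] for n in neighbours[account]} - {field[account]})


scores = {account: bridging_score(account) for account in field}
top = max(scores, key=scores.get)
print(top, scores[top])
```

In a real project a dedicated graph library would probably replace the dictionary of sets, and the made-up edges would come from the static files or API mentioned above.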
Project 2 - Analysing Metadata of Publications
This project is about analysing the metadata of academic publications.
- Get the publications related to a journal or topic of interest
- Perform some sort of metadata analysis on the publications. For instance, we could:
  - analyse keyword occurrence within a certain category/topic of interest
  - analyse keyword co-occurrence within a category
  - construct a graph to visualise the similarity between publications of a given category
This data is available via APIs from certain publishers (e.g. Elsevier).
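Keyword occurrence and co-occurrence can be counted in a few lines once the metadata is loaded. The sketch below assumes each publication is a dictionary with a `keywords` list; that layout and the sample records are invented for illustration and do not reflect any particular publisher's API.

```python
from collections import Counter
from itertools import combinations

# Hypothetical metadata records; real ones would come from a publisher's API.
publications = [
    {"title": "Paper A", "keywords": ["climate", "ocean", "policy"]},
    {"title": "Paper B", "keywords": ["climate", "ocean"]},
    {"title": "Paper C", "keywords": ["policy", "climate"]},
]

keyword_counts = Counter()
pair_counts = Counter()
for pub in publications:
    # Sort and deduplicate so each pair is counted once per paper,
    # always in the same order.
    keywords = sorted(set(pub["keywords"]))
    keyword_counts.update(keywords)
    pair_counts.update(combinations(keywords, 2))

print(keyword_counts.most_common())
print(pair_counts.most_common())
```

The pair counts can also serve as edge weights if the keywords are later drawn as a graph.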
Project 3 - Optimisation in Python
This project will focus on learning to perform convex optimisation in Python. We will mainly be using the cvxopt library to solve problems defined by an objective function and a set of constraints. We could then use this to perform some experiments on toy datasets.
- Learn to use the cvxopt library (or another) to perform convex optimisation in Python
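To make "an objective function and a set of constraints" concrete, here is a toy problem solved with projected gradient descent in plain Python. This is deliberately not cvxopt: it is a hand-rolled sketch, and the objective, the box constraints, the step size and the function names are all assumptions chosen for the example.

```python
def objective(x, y):
    """Convex objective: squared distance from the point (3, -1)."""
    return (x - 3) ** 2 + (y + 1) ** 2


def gradient(x, y):
    """Gradient of the objective."""
    return 2 * (x - 3), 2 * (y + 1)


def project(value, low=0.0, high=2.0):
    """Project a coordinate back onto the box constraint low <= value <= high."""
    return min(max(value, low), high)


x, y = 1.0, 1.0  # feasible starting point
step = 0.1       # fixed step size, assumed small enough for this problem
for _ in range(200):
    gx, gy = gradient(x, y)
    x = project(x - step * gx)
    y = project(y - step * gy)

print(x, y, objective(x, y))
```

The unconstrained minimum (3, -1) lies outside the box, so the iterates end up on its boundary. A dedicated solver takes the same two ingredients, an objective and constraints, and handles far more general problems more reliably.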
Project 4 - Influenza Outbreak Analysis
This project aims to analyse influenza outbreak profiles. Specifically, we will learn to use different data analysis and clustering methods to group countries based on their outbreak profiles. One of the challenges is then to visualise this high-dimensional data in Python.
- Load data from static files
- Learn to perform clustering in Python
- Visualise high-dimensional data
WHO database of influenza outbreaks worldwide for the past 20 years
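One possible way to cluster countries is k-means on their outbreak profiles. The sketch below is a small pure-Python version with a deterministic initialisation; the country names and profile values are invented toy data, not WHO figures, and the `kmeans` helper is written for this example only.

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means; returns one cluster label per point."""
    # Deterministic initialisation: the first point, then repeatedly the
    # point farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical normalised outbreak profiles (one value per time period).
profiles = {
    "Country A": [0.1, 0.6, 0.9, 0.3, 0.1, 0.0],
    "Country B": [0.2, 0.7, 0.8, 0.2, 0.1, 0.0],
    "Country C": [0.0, 0.1, 0.2, 0.7, 0.9, 0.4],
    "Country D": [0.0, 0.0, 0.3, 0.8, 0.8, 0.3],
}

labels = kmeans(list(profiles.values()), k=2)
clusters = dict(zip(profiles, labels))
print(clusters)
```

Here the two early-peaking profiles land in one cluster and the two late-peaking ones in the other. Real profiles would need the same length and scale before being compared this way.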
Project 5 - Building a spam filter
Using a public database, build a classifier that can identify junk emails.
- Import data from SpamAssassin public corpus
- Select features
- Build a binary classifier to decide if an email is spam or not
- Uses numpy, regular expressions and nltk
SpamAssassin public corpus
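The steps above can be sketched with a tiny naive Bayes classifier over word counts. This version uses only the standard library (regular expressions for tokenising); the training messages are invented, and the `tokenize`, `train` and `classify` helpers are hypothetical names, not part of nltk or the SpamAssassin corpus tooling.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the higher naive Bayes log-score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / sum(doc_counts.values()))
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages standing in for the real corpus.
examples = [
    ("win a free prize now", "spam"),
    ("cheap pills, buy now", "spam"),
    ("meeting agenda for Friday", "ham"),
    ("please review the project notes", "ham"),
]
word_counts, doc_counts = train(examples)

print(classify("free prize, buy now", word_counts, doc_counts))
print(classify("project meeting on Friday", word_counts, doc_counts))
```

With the real corpus, feature selection would matter much more than it does in this toy version, for example dropping very common words or adding header-based features.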
Project 6 - Visualising Molecule Data
In this project, we will learn how to load data from text files into Python, create visualisations of the data and finally combine these images to create a video. In reaction engineering problems we can get text files in which the time and the number of molecules of each species are stored. If we have 3 or 4 species in our system and give each one a distinct colour, it would be very interesting to create videos showing the increase and decrease of those species over time. The most common way to describe the production or depletion of a population is to plot the number of molecules against time. However, it would also be fascinating to describe that process through videos.
- Load data from text files into Python
- Find a way to visualise the data
- Combine the visualisations from multiple text files to create a video
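The loading step might look like the sketch below. It assumes a whitespace-separated layout with the time in the first column and one molecule count per species after it, plus optional `#` comment lines; the actual file format is not specified here, so treat the layout, the sample values and the `load_counts` helper as assumptions.

```python
import io

# Hypothetical file contents: time, then the count of each species.
sample = """# time A B C
0.0 100 0 0
1.0 80 15 5
2.0 60 25 15
"""


def load_counts(handle):
    """Read times and per-species counts from an open text file."""
    times, counts = [], []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        times.append(float(fields[0]))
        counts.append([int(value) for value in fields[1:]])
    return times, counts


# io.StringIO stands in for open("some_file.txt") in this example.
times, counts = load_counts(io.StringIO(sample))
print(times)
print(counts)
```

Each row could then become one frame, for example a bar chart with one coloured bar per species, before the frames are joined into a video with a plotting or video tool.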
Project 7 - Proteomic Interaction Map
I propose to create a proteomic interaction map. Using protein interaction data available online, the project would be:
- to find a way to read this dataset with Python; and
- to write a tool that can be used to visualise the interaction between individual proteins.
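Reading the dataset could start from something like this sketch, which assumes the interactions are stored as tab-separated pairs of protein identifiers, one pair per line. That format and the identifiers `P1` to `P4` are placeholders, since the actual data source is not named here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-separated interaction pairs, one per line.
sample = "P1\tP2\nP1\tP3\nP2\tP3\nP4\tP1\n"

# Interactions are treated as undirected: each protein lists its partners.
partners = defaultdict(set)
for protein_a, protein_b in csv.reader(io.StringIO(sample), delimiter="\t"):
    partners[protein_a].add(protein_b)
    partners[protein_b].add(protein_a)

# The protein with the most partners is a natural first thing to highlight.
hub = max(partners, key=lambda protein: len(partners[protein]))
print(hub, sorted(partners[hub]))
```

A visualisation tool could then draw each protein as a node and each pair as an edge, with the partner sets above as its input.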