Projects

On this page you can find all the projects available for Friday!

Project 1 - Social Network Analysis

Problem Statement

This project is about using Python to visualise and analyse network data. In this case we will focus on loading and analysing social network data (e.g. Twitter, Facebook, etc.). The idea is to build networks that show how people in multidisciplinary fields are connected on social networks (i.e. who follows whom). This will allow us to identify the key individuals who connect the otherwise independent groups that make up a multidisciplinary subject (e.g. the biologists, geologists, marine experts, policy makers, etc. who are involved with climate change).

Project goals

Load data from static files/an API
Build and visualise a network
Perform simple analysis on this network

Datasets

Publicly available social network data from Twitter/Facebook/etc.
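
As a rough illustration of the "key individuals" idea in the problem statement, the sketch below builds a small network from follow pairs and ranks users by betweenness centrality (how many shortest paths between other users pass through them). The usernames and follow pairs are made up, follows are treated as undirected ties for simplicity, and betweenness is only one possible measure; a library such as networkx could replace the hand-written function.

```python
from collections import deque

# Hypothetical "who follows whom" pairs; real data would come from files or an API.
follows = [
    ("ana", "ben"), ("ben", "cat"), ("cat", "ana"),  # e.g. a group of biologists
    ("dan", "eve"), ("eve", "fay"), ("fay", "dan"),  # e.g. a group of geologists
    ("gus", "ana"), ("gus", "dan"),                  # gus is tied to both groups
]

def build_graph(pairs):
    """Return an undirected adjacency dict {user: set of neighbours}."""
    adj = {}
    for follower, followed in pairs:
        adj.setdefault(follower, set()).add(followed)
        adj.setdefault(followed, set()).add(follower)
    return adj

def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {v: [] for v in adj}
        n_paths = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted from both ends, so halve the totals.
    return {v: s / 2 for v, s in score.items()}

graph = build_graph(follows)
scores = betweenness(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[:3])
```

In this toy network the user who links the two groups comes out on top, which is the kind of person the project sets out to find.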

Project 2 - Analysing Metadata of Publications

Problem Statement

This project is about analysing the metadata of academic publications.

Project goals

Get the publications related to a journal or topic of interest
Perform a metadata analysis on the publications. For instance, we could:
- analyse keyword occurrence within a certain category/topic of interest
- analyse keyword co-occurrence within a category
- construct a graph to visualise the similarity between publications of a given category

Datasets

This data is available via APIs from certain publishers (e.g. Elsevier).
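
The keyword analyses listed in the goals could be sketched as follows. The publication records here are invented placeholders, and the field names ("title", "keywords") are assumptions; records returned by a real publisher API will have their own structure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical metadata records, standing in for what an API might return.
publications = [
    {"title": "Paper A", "keywords": ["climate", "ocean", "policy"]},
    {"title": "Paper B", "keywords": ["climate", "ocean"]},
    {"title": "Paper C", "keywords": ["policy", "climate"]},
]

occurrence = Counter()    # how many publications mention each keyword
cooccurrence = Counter()  # how many publications mention each keyword pair

for pub in publications:
    # Sort and de-duplicate so each pair is counted once, in a fixed order.
    keywords = sorted(set(pub["keywords"]))
    occurrence.update(keywords)
    cooccurrence.update(combinations(keywords, 2))

print(occurrence.most_common(3))
print(cooccurrence.most_common(3))
```

The co-occurrence counts can double as edge weights if we later build a keyword or publication similarity graph.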

Project 3 - Optimisation in Python

Problem Statement

This project will focus on learning to perform convex optimisation in Python. We will mainly be using the cvxopt library to perform convex optimisation given an objective function and a set of constraints. We could then use this to perform some experiments on toy datasets.

Project goals

Learn to use the cvxopt library (or another one) to perform convex optimisation in Python

Project 4 - Influenza Outbreak Analysis

Problem Statement

This project aims to analyse influenza outbreak profiles. Specifically, we will learn to use different data analysis and clustering methods to group countries based on their outbreak profiles. Given these groups, one of the challenges is then to visualise this high-dimensional data in Python.

Project goals

Load data from static files
Learn to perform clustering in Python
Visualise high-dimensional data

Dataset

WHO database of influenza outbreaks worldwide for the past 20 years
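
One possible shape for the clustering and visualisation steps is sketched below using only numpy. The profiles are invented numbers, not WHO data, and the choice of k-means followed by a PCA projection to two dimensions is an assumption; other clustering or dimensionality-reduction methods may suit the real data better.

```python
import numpy as np

# Made-up outbreak profiles: one row per country, one column per two-month bin.
countries = ["country_a", "country_b", "country_c",
             "country_d", "country_e", "country_f"]
profiles = np.array([
    [9, 4, 1, 1, 3, 8],
    [8, 5, 1, 0, 2, 9],
    [9, 3, 2, 1, 4, 7],
    [1, 3, 8, 9, 4, 1],
    [0, 2, 9, 8, 5, 1],
    [1, 4, 7, 9, 3, 2],
], dtype=float)

def farthest_point_init(X, k):
    """Deterministic start: the first row, then the row farthest from those chosen."""
    chosen = [0]
    while len(chosen) < k:
        d = np.linalg.norm(X[:, None, :] - X[chosen][None, :, :], axis=2).min(axis=1)
        chosen.append(int(d.argmax()))
    return X[chosen].copy()

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations: assign rows to the nearest centroid, then update."""
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    centred = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T

labels, centroids = kmeans(profiles, k=2)
coords = pca_2d(profiles)
for name, label, (x, y) in zip(countries, labels, coords):
    print(f"{name}: cluster {label}, position ({x:.2f}, {y:.2f})")
```

The two-dimensional coordinates can then be passed to any plotting library, for example as a scatter plot coloured by cluster label.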

Project 5 - Building a spam filter

Problem Statement

Using a public database, build a classifier that can identify junk emails.

Project goals

Import data from SpamAssassin public corpus
Select features
Build a binary classifier to decide if an email is spam or not
Use numpy, regular expressions and nltk

Datasets

SpamAssassin public corpus
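
To make the feature-selection and classification goals more concrete, here is a minimal sketch of a word-count naive Bayes classifier. It is only one possible approach, it uses the standard library rather than numpy or nltk, and the four training messages are invented stand-ins for emails parsed from the corpus.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")

def tokenize(text):
    """Lower-case the text and pull out word-like tokens with a regular expression."""
    return TOKEN.findall(text.lower())

def train(examples):
    """Count words per class; examples is a list of (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the higher log-probability (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # ignore words never seen in training
                count = word_counts[label][token] + 1
                score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented training messages; the real project would read the corpus files.
examples = [
    ("Win a free prize now", "spam"),
    ("Free money, click now", "spam"),
    ("Meeting agenda for Monday", "ham"),
    ("Please review the project report", "ham"),
]
model = train(examples)
print(classify("Claim your free prize", model))
print(classify("Project meeting on Monday", model))
```

With real data we would hold back part of the corpus to measure how often the classifier is wrong, and experiment with which tokens to keep as features.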

Project 6 - Visualising Molecule Data

Problem Statement

In this project, we will learn how to load data from text files into Python, create visualisations of the data and finally combine these images to create a video. In problems based on reaction engineering, we often get text files that store the time and the amount of molecules of each species. If we have 3 or 4 species in our system and give each one a distinct colour, we can create videos showing how the amount of each species increases and decreases over time. The most common way to describe the production and depletion of a population is to plot the amount of molecules against time. Describing the same process through a video offers a complementary view.

Project goals

Load data from text files into Python
Find a way to visualise the data
Combine the visualisations from multiple text files to create a video
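
The three goals might fit together roughly as in the sketch below. The file layout (one row per time point, with the time first and then one count per species) is an assumption, as are the species names and the output file name. The animation part uses matplotlib's FuncAnimation and writes a GIF through PillowWriter, which needs matplotlib and Pillow to be installed; it is imported lazily so the loader can be tried on its own.

```python
import io

def load_species(handle):
    """Parse whitespace-separated rows: a time value, then one count per species."""
    times, counts = [], []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        values = [float(v) for v in line.split()]
        times.append(values[0])
        counts.append(values[1:])
    return times, counts

def make_video(times, counts, names, out_path="species.gif"):
    """Animate one coloured bar per species and save the frames as a GIF."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no window needed
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation, PillowWriter

    colours = ["tab:red", "tab:blue", "tab:green", "tab:orange"][:len(names)]
    fig, ax = plt.subplots()
    bars = ax.bar(names, counts[0], color=colours)
    ax.set_ylim(0, max(max(row) for row in counts))
    ax.set_ylabel("number of molecules")

    def update(frame):
        for bar, height in zip(bars, counts[frame]):
            bar.set_height(height)
        ax.set_title(f"t = {times[frame]:g}")
        return bars

    animation = FuncAnimation(fig, update, frames=len(times), interval=200)
    animation.save(out_path, writer=PillowWriter(fps=5))
    plt.close(fig)

# A tiny in-memory stand-in for a data file (values are made up).
sample = io.StringIO("# time A B C\n0 10 0 0\n1 7 3 0\n2 4 4 2\n")
times, counts = load_species(sample)
print(times, counts)
# make_video(times, counts, names=["A", "B", "C"])  # uncomment to write species.gif
```

If the data is spread over several text files, the rows from each file could be loaded with the same function and concatenated before the frames are generated.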

Project 7 - Proteomic Interaction Map

Problem Statement

I propose to create a proteomic interaction map using protein interaction data available online.

Project goals

Find a way to read this dataset with Python
Write a tool that can be used to visualise the interactions between individual proteins

Datasets

Available online.

Projects

Project 1 - Social Network Analysis
Project 2 - Analysing Metadata of Publications
Project 3 - Optimisation in Python
Project 4 - Influenza Outbreak Analysis
Project 5 - Building a spam filter
Project 6 - Visualising Molecule Data
Project 7 - Proteomic Interaction Map