Ian Goodfellow: Generative Adversarial Networks

Jessica Yung | Data Science, Highlights, Talk Reviews

The second talk I went to at AI WithTheBest 2016 was Ian Goodfellow’s talk on Generative Adversarial Networks (GANs), which he invented. Ian is a researcher at OpenAI. GANs are generative models based on supervised learning and game theory. They learn to generate realistic samples and have mostly been used to generate images. For example, you can feed it images … Read More

Dennis Mortensen, Part 1: Frameworks for looking at the AI Market

Jessica Yung | Data Science, Talk Reviews

Last weekend (24th-25th Sept) was the second Ai.WithTheBest conference – an online conference about artificial intelligence. Over two days, speakers gave talks (often from their homes) with live Q&A. Dennis Mortensen of x.ai kicked off this year’s conference with three frameworks for looking at AI products. He gave two frameworks for looking at the AI market and one framework for … Read More

Dennis Mortensen, Part 2: Horizontal vs Vertical AI

Jessica Yung | Data Science, Talk Reviews

In this three-part series we discuss three frameworks for viewing the AI market presented by Dennis Mortensen from x.ai at the 2016 AI WithTheBest conference. (Links to Part 1, Part 3) This time we will discuss the second framework for looking at the AI market: Horizontal vs Vertical AI. The natural question was then how one would differentiate between a … Read More

Questions to ask when deciding how to approach predictive problems

Jessica Yung | Data Science

Is the situation stochastic or deterministic? Is it time-inhomogeneous? (Different across time?) How much data do you have available? What limitations are there with respect to computational cost (compute and time), both for training and predicting? Do you need to try actions to learn about situations? (If so, consider Reinforcement Learning.) Do your actions have an impact on the environment? … Read More

Converting Jupyter Notebooks to PDFs (debugging pdflatex errors)

Jessica Yung | Data Science

I had problems converting my .ipynb file to PDF with Jupyter’s method, which uses pdflatex (see below for why it didn’t work), so I used an alternative method: I exported the file to HTML (File -> Download as -> HTML) and then converted the HTML file to a PDF on http://html2pdf.com/. It is free and worked well. Watch out … Read More

Two ways CodeWars trumps HackerRank

Jessica Yung | Uncategorized

Disclaimer: I am a fan of HackerRank’s contests and wide range of problems across languages and domains. Here I discuss two ways CodeWars is better than HackerRank. (Both are sites where you can do programming problems competitively.) Theme: CodeWars is better for beginner-to-intermediate learning. CodeWars emphasises testing throughout. While you’re working on the problem, there is a code … Read More

Code Wars: Consecutive Strings Programming Problem

Jessica Yung | Programming

Woe is me, I am a terrible programmer. Spoiler alert: Scroll down for terrible code followed by elegant code. Question: You are given an array strarr of strings and an integer k. Your task is to return the first longest string consisting of k consecutive strings taken from the array. Example: longest_consec(["zone", "abigail", "theta", "form", "libe", "zas", "theta", "abigail"], 2) … Read More
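
To make the problem statement concrete, here is a minimal Python sketch of one way to solve it. It is not the code from the post. The handling of invalid inputs (returning an empty string when k is not positive or exceeds the array length) is an assumption, since the excerpt does not specify it.

```python
def longest_consec(strarr, k):
    """Return the first longest string made of k consecutive strings."""
    n = len(strarr)
    # Assumption: when no window of k strings exists, return "".
    if k <= 0 or k > n:
        return ""

    best = ""
    for start in range(n - k + 1):
        # Join the k strings starting at this position.
        candidate = "".join(strarr[start:start + k])
        # Strict '>' keeps the earliest window when lengths tie.
        if len(candidate) > len(best):
            best = candidate
    return best


words = ["zone", "abigail", "theta", "form", "libe", "zas", "theta", "abigail"]
print(longest_consec(words, 2))  # prints "abigailtheta"
```

With this sketch, both "abigailtheta" and "thetaabigail" have length 12, and the strict comparison is what makes the earlier one win.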

Udacity Connect Review (London)

Jessica Yung | Data Science

Udacity Connect is a series of in-person meet-ups that supplement Udacity’s Nanodegrees, which are online certifications consisting of a series of courses and graded projects. (Udacity is an online educational organisation that offers technology-centred online courses.) After piloting in the US over the summer, Udacity Connect launched in London last week. In this post I describe what happened at the second (my first) Machine Learning … Read More

What data does Facebook.com load?

Jessica Yung | Data Science, Highlights

Today we’re going to look at your Facebook homepage’s source code. This is interesting because it gives you an idea of the data Facebook is using every time you load your Newsfeed and accompanying ticker and chat windows, what exactly this data is, and how this data is stored and formatted. Here’s an example: (Scroll down for step-by-step instructions on … Read More

Removing Outliers from your Data

Jessica Yung | Data Science

Hastily compiled from Udacity’s Intro to Machine Learning videos. Here’s a general recipe for removing outliers from your data: 1. Train with all data. 2. Remove ~10% of data (points with highest residual error). 3. Train again. Obviously don’t remove outliers blindly – sometimes they are important and you should pay attention to them. But outliers that are results of … Read More
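
The three-step recipe can be sketched in plain Python. This is an illustration under assumptions, not code from the course or the post: the model is taken to be a simple least-squares line, and the names fit_line and remove_outliers_and_refit are hypothetical. Only the train / drop ~10% / retrain structure comes from the recipe.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept (assumed model)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def remove_outliers_and_refit(points, fraction=0.10):
    """Fit, drop the worst-fitting fraction of points, then fit again."""
    # 1. Train with all data.
    slope, intercept = fit_line(points)

    # 2. Remove the points with the highest residual error.
    def residual(point):
        x, y = point
        return abs(y - (slope * x + intercept))

    n_drop = round(len(points) * fraction)
    kept = sorted(points, key=residual)[:len(points) - n_drop]

    # 3. Train again on the remaining points.
    return fit_line(kept), kept


# Ten points on y = 2x + 1, with one corrupted measurement at x = 5.
data = [(x, 2 * x + 1) for x in range(10)]
data[5] = (5, 100)
(slope, intercept), kept = remove_outliers_and_refit(data)
print(len(kept), round(slope, 6), round(intercept, 6))
```

In this toy data set the corrupted point has the largest residual after the first fit, so it is the one dropped, and the second fit recovers a slope near 2 and an intercept near 1. On real data the dropped points should still be inspected, since the recipe removes the worst-fitting 10% whether or not they are genuine errors.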