How Python implements dictionaries

Jessica Yung | Programming

Python Dictionaries: Not even a space-time tradeoff. If you could choose to store things that you’d want to look up later in a Python dictionary or in a Python list, which would you choose? It turns out that looking up items in a Python dictionary is much faster than looking up items in a Python list. If you search for …

Numpy Views vs Copies: Avoiding Costly Mistakes

Jessica Yung | Data Science, Programming

In this post we will talk about the differences between views and copies. It’s important to be aware of the difference between the two; otherwise you might run into problems like accidentally modifying arrays. What are views and copies? With a view, it’s like you are viewing the original (base) array. The view is actually part of the original array even though …
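A minimal sketch of the distinction, assuming NumPy is installed. The array contents and the names `base`, `view`, and `copy` are illustrative, not taken from the post:

```python
import numpy as np

base = np.arange(6)          # array that owns its data: 0, 1, 2, 3, 4, 5
view = base[1:4]             # basic slicing returns a view onto base's memory
copy = base[1:4].copy()      # .copy() allocates separate memory

view[0] = 100                # writes through to base[1]
copy[1] = -1                 # changes only the copy

print(base.tolist())                  # [0, 100, 2, 3, 4, 5]
print(view.base is base)              # True: the view refers back to base
print(np.shares_memory(copy, base))   # False: the copy is independent
```

Checking `.base` or calling `np.shares_memory` is one way to tell whether an operation handed back a view or a copy before writing to it.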

LSTMs for Time Series in PyTorch

Jessica Yung | Machine Learning, Uncategorized

I can’t believe how long it took me to get an LSTM to work in PyTorch! There are many ways it can fail. Sometimes you get a network that predicts values way too close to zero. In this post, we’re going to walk through implementing an LSTM for time series prediction in PyTorch. We’re going to use PyTorch’s nn module …

Generating Autoregressive data for experiments

Jessica Yung | Data Science, Machine Learning

In this post, we will go through how to generate autoregressive data in Python, which is useful for debugging models for sequential prediction like recurrent neural networks. When you’re building a machine learning model, it’s often helpful to check that it works on simple problems before moving on to complicated ones. I’ve found this is especially useful for debugging neural …
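As a rough illustration of what generating autoregressive data can look like, here is a sketch using only the standard library. The function name `generate_ar_series`, its parameters, and the coefficients are hypothetical choices for this example, not the post's own code. Each value is a weighted sum of the previous values plus Gaussian noise:

```python
import random

def generate_ar_series(coeffs, n_samples, noise_std=1.0, seed=None):
    """Generate an AR(p) series: x_t = sum_i coeffs[i] * x_{t-1-i} + noise.

    Assumes coeffs is non-empty; coeffs[0] multiplies the most recent value.
    """
    rng = random.Random(seed)
    order = len(coeffs)
    # Seed the recursion with `order` pure-noise values.
    series = [rng.gauss(0.0, noise_std) for _ in range(order)]
    for _ in range(n_samples - order):
        recent = series[-order:][::-1]  # most recent value first
        value = sum(c * x for c, x in zip(coeffs, recent))
        series.append(value + rng.gauss(0.0, noise_std))
    return series[:n_samples]

# Illustrative AR(2) coefficients, chosen to give a stationary process.
series = generate_ar_series([0.6, -0.2], n_samples=200, seed=0)
print(len(series))  # 200
```

Because the true coefficients are known, a sequence model trained on this data can be checked against them before it is applied to harder problems.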

What makes Numpy Arrays Fast: Memory and Strides

Jessica Yung | Machine Learning, Programming

How is NumPy so fast? In this post we find out how NumPy’s ndarray is stored and how it is usually manipulated by NumPy functions using strides. Getting to know the ndarray: a NumPy ndarray is an N-dimensional array. You can create one like this:
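A minimal sketch of creating an ndarray and inspecting its memory layout, assuming NumPy is installed. The values and the `int64` dtype are illustrative assumptions, not taken from the post:

```python
import numpy as np

# A 2x3 array of 8-byte integers, stored row by row (C order).
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(arr.shape)      # (2, 3)
print(arr.itemsize)   # 8 bytes per item
print(arr.strides)    # (24, 8): 24 bytes to the next row, 8 to the next column

# Transposing swaps the strides instead of moving any data.
print(arr.T.strides)                # (8, 24)
print(np.shares_memory(arr, arr.T)) # True
```

Operations such as the transpose return a new array object that reinterprets the same buffer with different strides, which is why they are cheap.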

These arrays are homogeneous arrays of fixed-size items. That is, all the items in …

MSE as Maximum Likelihood

Jessica Yung | Machine Learning

MSE is a commonly used error metric. But is it justified in principle? In this post we show that minimising the mean-squared error (MSE) is not just something vaguely intuitive, but emerges from maximising the likelihood under a linear Gaussian model. Defining the terms. Linear Gaussian model: assume the data is described by the linear model , where . Assume is …
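The argument can be sketched as follows. The notation here is an assumption, since the excerpt's own symbols are not shown: inputs $x_i$, targets $y_i$, weights $w$, and independent Gaussian noise with fixed variance $\sigma^2$.

```latex
% Assumed notation (not from the excerpt): x_i inputs, y_i targets, w weights, \sigma^2 fixed noise variance.
\begin{align*}
y_i &= w^\top x_i + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2) \\
\log p(y_{1:n} \mid x_{1:n}, w) &= -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2 \\
\arg\max_w \, \log p(y_{1:n} \mid x_{1:n}, w) &= \arg\min_w \, \frac{1}{n}\sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2
\end{align*}
```

The first term of the log-likelihood does not depend on $w$, and the second is a negative multiple of the sum of squared errors, so maximising the likelihood over $w$ is the same as minimising the MSE.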

Maximum Likelihood as minimising KL Divergence

Jessica Yung | Machine Learning

Sometimes you come across connections that are simple and beautiful. Here’s one of them! What the terms mean: maximum likelihood is a common approach to estimating parameters of a model. An example of model parameters could be the coefficients in a linear regression model , where is Gaussian noise (i.e. it’s random). Here we choose parameter values that maximise the …
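A sketch of the connection, using assumed notation that does not appear in the excerpt: $p_{\text{data}}$ for the data distribution, $p_\theta$ for the model, and $x_1, \dots, x_n$ for samples drawn from $p_{\text{data}}$.

```latex
% Assumed notation: p_data is the data distribution, p_\theta the model, x_i samples from p_data.
\begin{align*}
D_{\mathrm{KL}}(p_{\text{data}} \,\|\, p_\theta)
  &= \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_{\text{data}}(x)\right]
   - \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_\theta(x)\right] \\
\arg\min_\theta \, D_{\mathrm{KL}}(p_{\text{data}} \,\|\, p_\theta)
  &= \arg\max_\theta \, \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_\theta(x)\right] \\
  &\approx \arg\max_\theta \, \frac{1}{n}\sum_{i=1}^{n} \log p_\theta(x_i)
\end{align*}
```

The first expectation does not depend on $\theta$, so minimising the KL divergence amounts to maximising the expected log-likelihood, which the sample average approximates.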

Python Lists vs Dictionaries: The space-time tradeoff

Jessica Yung | Programming, Python

If you had to write a script to check whether a person had registered for an event, what Python data structure would you use? It turns out that looking up items in a Python dictionary is much faster than looking up items in a Python list. How much faster? Suppose you want to check if 1000 items (needles) are in …
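One way to measure the gap yourself is a small timing sketch like the one below. The 1000 needles follow the excerpt; the haystack size of 10,000 and all names are assumptions made for this example. Absolute timings vary by machine, so none are quoted here.

```python
import random
import timeit

random.seed(0)
haystack_list = list(range(10_000))            # assumed haystack size
haystack_dict = dict.fromkeys(haystack_list)   # same keys, hash-based lookup
needles = [random.randrange(20_000) for _ in range(1000)]

def count_hits(needles, haystack):
    """Count how many needles are present in the haystack."""
    return sum(1 for needle in needles if needle in haystack)

# `in` scans a list item by item, but hashes straight to the slot in a dict.
list_time = timeit.timeit(lambda: count_hits(needles, haystack_list), number=1)
dict_time = timeit.timeit(lambda: count_hits(needles, haystack_dict), number=1)

list_hits = count_hits(needles, haystack_list)
dict_hits = count_hits(needles, haystack_dict)
print(f"list: {list_time:.4f}s, dict: {dict_time:.6f}s, hits: {list_hits}")
```

Both structures report the same hits; only the time to find them differs, and the gap widens as the haystack grows because list membership is linear in its length.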