Paper review: [What’s in a Name? Reducing Bias in Bios Without Access to Protected Attributes](https://arxiv.org/abs/1904.05233)
Overall, I have a negative opinion of this paper. It introduces a novel method of bias mitigation, but provides little theoretical justification for why such an approach is useful, and does not compare against any previous work. Moreover, I am extremely skeptical of the paper’s entire premise, which is that we do not have access to protected attributes and so should find a way of providing fairness without looking at them.
[Read More]
On Papers
As a follow-up to my recent post on research tips, I would like to mention that confidence and story-telling are key. Many papers have some flaws and substantial overlap with existing papers, which becomes especially apparent once you get to know the field. The key is presenting a convincing narrative that steers the discussion away from focusing too much on those flaws and similarities. In particular, when you are writing the introduction, you want to build and hammer home a single progression of thought, which makes it OBVIOUS that YOUR approach is the most logical next step forward.
[Read More]
Pure Storage Coding Challenge
Just finished the Pure Storage Coding Challenge! It consists of some coding questions, and then some CS theory questions. Overall, it was a very fair coding challenge, and actually had interesting questions that test your knowledge of CS beyond just Leetcode :)
Great test!
Research Tips
Research is hard. Here are some tips I’ve picked up over the years:
1. Try to present your work to as many people as will listen. It forces you to really understand the content yourself, and serves as a reality check.
2. Make dedicated time for research. As a grad student, it is easy to put research to the side, but solving any non-trivial problem requires constant, structured attention.
3.
[Read More]
Variational Auto Encoder
VAEs are autoencoders with some added randomness. The encoder network (also called the recognition or inference network) outputs the parameters of a probability distribution for each data point. Then, for each data point, we sample from this parametrized distribution and feed the SAMPLE to the decoder network, which is tasked with reconstructing the input. The “variational” part of the name comes from the training objective: computing p(x) exactly requires an integral over the latent variable, which is normally intractable, so we turn the problem into optimization instead. In particular, all we do is maximize the ELBO, a lower bound on log p(x), which pushes up p(x) without ever evaluating that integral!
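To make the sample-then-decode step and the ELBO concrete, here is a minimal sketch in plain Python. It rests on assumptions that are not stated in the post: a diagonal Gaussian encoder output (`mu`, `log_var`), a standard normal prior, and a unit-variance Gaussian decoder. The function names and the `decode` callable are hypothetical, chosen only for illustration.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def elbo_estimate(x, mu, log_var, decode, rng):
    """Single-sample ELBO estimate: E_q[log p(x|z)] - KL(q(z|x) || p(z)).

    Assumes p(x|z) = N(decode(z), I); this is an illustrative choice.
    """
    z = sample_latent(mu, log_var, rng)      # feed the SAMPLE to the decoder
    x_hat = decode(z)                        # decoder's reconstruction of x
    log_lik = sum(-0.5 * (xi - xh) ** 2 - 0.5 * math.log(2.0 * math.pi)
                  for xi, xh in zip(x, x_hat))
    return log_lik - gaussian_kl(mu, log_var)
```

In a real VAE, `mu`, `log_var`, and `decode` would come from neural networks, and the ELBO would be maximized with a gradient-based optimizer; the sketch only shows how one term of the objective is assembled.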
[Read More]
Upcoming
So without further ado, here are the posts I plan to make:
- Exotic Python data structures and where to find them (heapq, OrderedDict, deque, bisect)
- Onsite interview with Microsoft
- Reparameterization tricks: Gaussian vs Gumbel
- The REINFORCE gradient estimator
- Probability notations
- Research itinerary
- Birthday post
- ABD, L1, and Deep learning Symposium follow-up
- Scale AI follow-up
- The supremum IS big-O notation (usually we want the TIGHTEST worst case bound)
- Model vs Data Parallelism
- [Some more reinforcement learning][1]

(Note that the markdown in preview does not exactly work with some more exotic features, like citations and references)
[Read More]
Roadmap
What exactly is the value prop of this blog? Well, I like to think that I have some interesting things going on in my life:
Machine learning, deep learning, linear algebra, statistics. I am a student in deep learning, and am lucky enough to be exposed daily to the inspirational and amazing work being conducted in the U of T ML group, at Vector Institute. In particular, I want to talk about the awesome papers and concepts I learn about in CSC2547 and CSC2541.
My research: to appear in ACL 2020 (fingers crossed!
[Read More]
How to make a Hugo Subsite
You may have noticed that this blog exists under my main GitHub personal website. How did I get it to work?
GitHub exposes your entire virtual directory by default. (This is why you can just put raw assets in the repo, and they will be served when queried directly.) So, I just made Hugo build the entire blog subsite to a folder in my main repo!
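The build-into-a-subfolder idea can be sketched as a small Python helper that assembles the Hugo command line. This is an illustration under assumptions, not necessarily the setup used for this blog: the directory names, the `blog` subfolder, and the helper name `hugo_build_command` are hypothetical. The `--source`, `--destination`, and `--baseURL` flags are standard Hugo CLI options.

```python
from pathlib import Path


def hugo_build_command(blog_source, site_repo, subdir="blog", base_url=None):
    """Build the argument list for a Hugo run that writes into site_repo/subdir.

    All paths here are placeholders; adjust them to your own layout.
    """
    # Use an absolute destination so the output location does not depend
    # on the directory Hugo is launched from.
    destination = Path(site_repo).resolve() / subdir
    cmd = ["hugo",
           "--source", str(blog_source),
           "--destination", str(destination)]
    if base_url is not None:
        # The base URL should match where the subsite is served from,
        # so that generated links point inside the subfolder.
        cmd += ["--baseURL", base_url]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; after that, committing the generated folder to the main site repo is what makes it reachable under the main site's URL.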
In short:
[Read More]
Hello World!
I’ve finally set up the Hugo blog subsite on my personal GitHub site, so here is the obligatory “Hello, world!”
Current thoughts about Hugo: Neat, but probably would have been better served by a more established/feature-rich system like Jekyll or perhaps Gatsby!
In particular: I have concerns about dating/naming of posts, especially as this list grows (hopefully) long!