Joshua Zhanson

Email: joshuazhansonspam at yahoo dot com

I currently work as a Data and Applied Scientist at Microsoft in Mountain View. I graduated with a Master of Language Technologies from Carnegie Mellon University in August 2022 after working on computer vision, robotics, and NLP. I earned my B.S. in Computer Science with a minor in Machine Learning from Carnegie Mellon University in May 2020.

I'm broadly interested in leveraging structure to tackle tough problems, such as learning visual representations from embodied interaction, robotic control in deep reinforcement learning, and learning control from language.

I'm passionate about teaching and undergraduate research. I have fun organizing events. I also occasionally write instructive blog posts.

In my spare time, I love hiking, biking, running, and the great outdoors. I also play board games and tabletop RPGs.

Research

On Proximal Policy Optimization's Heavy-tailed Gradients

Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

arxiv, ICML page

ICML 2021, Science and Engineering of Deep Learning (SEDL) Workshop at ICLR 2021, CMU SCS Honors Senior Thesis 2020

Proximal Policy Optimization, a policy gradient optimization algorithm commonly used in deep reinforcement learning, relies on an arsenal of heuristics to offset significant heavy-tailedness in its gradients. We can elide these heuristics with Geometric Median-of-Means, a high-dimensional estimator from robust statistics.

Proprioceptive Spatial Representations for Generalized Locomotion

Joshua Zhanson, Emilio Parisotto, Ruslan Salakhutdinov

pdf, video, website, code (coming soon)

Workshop on Structure and Priors in Reinforcement Learning (SPiRL) at ICLR 2019

A body-space state representation maps sensor readings onto a spatial grid over the robot's body, implicitly encoding relative positional information. This representation facilitates learning a general locomotion policy that transfers across many randomized body configurations with different balances and requiring different walking gaits.

Teaching

the cover of Writing for Computer Science Third Edition by Zobel

15-300 Research and Innovation in Computer Science

TA, Fall 2020

the cover of Computer Systems: A Programmer's Perspective Third Edition by Bryant and O'Hallaron

15-213 Introduction to Computer Systems

TA, Summer 2020

I led the development of new lecture activities and the porting of existing lecture activities to a virtual format. I also proposed and implemented a remote active learning lecture format and schedule, and coordinated staff attendance at virtual active learning lectures.

the cover of Reinforcement Learning An Introduction by Sutton

10-703 Deep Reinforcement Learning and Control

TA, Fall 2019

I updated and released homeworks on policy gradient methods and other topics, held office hours, and wrote quiz and exam questions.

15-300 Research and Innovation in Computer Science

TA, Fall 2019

I helped equip students with the tools needed to get involved in and effectively carry out undergraduate research. I also provided mentorship and support during undergraduate research projects, as well as coaching on topics such as academic writing and presenting research to academic and non-academic audiences.

15-213 Introduction to Computer Systems

TA, Summer 2019

I developed active learning lecture activities on system-level I/O and network protocols, and participated in developing activities on bit-level representations, machine programming, and exceptional control flow (processes and signals). I also scaled and benchmarked memory access traces to evaluate student dynamic memory allocator submissions.

Projects

15-410 Pebbles Kernel "jOSh"

With Joshua Kalapos.

For 15-410 Operating System Design and Implementation, Spring 2020

We built a Unix-like operating system kernel to run on real x86 IA-32 Intel hardware, with support for tasks and threads, paging and multiple virtual memory address spaces, scheduling and sleeping, and context switching and preemptive multitasking.

a diagram of the four steps of Monte Carlo Tree Search

Parallel Monte Carlo Tree Search for Tak

With Eric Nie.

Final project for 15-418 Parallel Computer Architecture and Programming, Spring 2019

We implemented and benchmarked leaf-parallelized, root-parallelized, and tree-parallelized Monte Carlo Tree Search agents to play the fantasy board game Tak.

a screenshot from the Atari game Breakout

breakout-demo

I implemented and trained an Advantage Actor-Critic agent to reach an average reward of 200+ per episode over 100 episodes and a max reward of 428 on the OpenAI Gym Breakout-v0 environment. I also wrote a blog post about it.

a black-and-white state representation for an agent in the slither.io game

Slither-703

With Shreyan Bakshi and Eric Nie.

Final project for 10-703 Deep Reinforcement Learning and Control, Spring 2018

We implemented Snake in Python as a single-agent and multi-agent OpenAI Gym environment and successfully trained DDQN and A2C on both environments. Then, we patched up the abandoned OpenAI Universe and trained A2C on the online multiplayer game slither.io against real humans.

Involvement

ScottyLabs

Former Director of Events

It's my fourth year organizing CMU's largest annual hackathon, TartanHacks. I gave a tech talk on Git and using the Bash terminal at the tech talk event Crash Course. Later, I organized and mentored other speakers for the tech talk events Web Dev Weekend and Crash Course.

Salmon Days

Previously, I volunteered as a general-purpose can-do problem solver on the legendary "Ofishal XStream Team" at Issaquah, WA's annual Salmon Days festival, which celebrates the return of almost two tons of salmon.