Big thanks to WeWork for hosting.
Want to be an official sponsor? Check out
Notes copied directly from the hackpad!
Availability of the US Government
Looking at the last year of the availability of the US government. What is the lifetime availability for the 230 years; what is the uptime? How many nines of availability have we had? Three!
If the shutdown continues, we will lose the third nine of availability on Monday. The government would have to remain open for over three more years in order to regain that nine that was lost.
Prior to 1980, even though Congress would sometimes be late with appropriations bills, but until the Anti-Deficiency Act was interpreted strangely.
I learned, wrote, published, and promoted! About Gaussian Processes!
At a hackathon and attendees were given random data. One team did machine learning – but you can’t learn anything from machine learning on random data!
But they used Gaussian processes – and … wait, he didn’t know enough on how to critique that. So he looked at Gaussian Processes for Machine Learning from MIT Press, and more articles, and the implementation in scikit-learn.
And ended up writing a blog post about how Gaussian Processes are not so fancy – using kernels, which defines how close two things are. It’s exactly like linear regression on a particular transformation. Turns out gaussian processes are useful after all, especially if you don’t have a lot of data, like for hyperparameter optimization.
Ben Ortiz Ulloa
Florida Shuffle: A Toy Model of The Opioid Rehab Cycle
In May 2018, John Oliver had a special about rehabs and spoke about the Florida shuffle: when addicts go to rehab, they leave rehab, and they’re not exactly cured, and often relapse. The government also does not regulate the efficacy of rehab facilities.
Agent-based models are rule-based. What do these agents do?
Addict: an addict is either unstable (relapsed and is currently using); stable (not using) Rehab: a facility where an unstable addict would go Post-Rehab: a facility where an addict would go after time in a rehab facility despite their status. With agent-based models, you try to replicate reality and then add interventions to see if anything changed.
In this game: the intervention is removing the worst-performing rehab centers at time t.
Subreddit Classification via Natural Language Processing
Creating a natural language processing model that could distinguish between two subreddits (communism & socialism).
- Query PushShift API to retrieve submissions
- Clean & pre-process text
- Vectorize/tokenize text
- Gridsearch to optimize hyperparameters across two classification algorithms
- Removed extraneous tags, removed post, moderator posts, links, short posts
Lots of similarities between socialism & communism, so that’ll make it a little more difficult. Used 75K vectors, removed/edited stopwords, used N-gram range of 1,2
Ran two models: logistic regression & random forest.
Logistic regression performed better!
Removing more stop words would improve predictability of this model.
Sentiment analysis tool using Natural Language Processing on selected Reddit data
Like Ashley’s project before, created a Natural Language Processing on selected Reddit data.
Inspired by how his week tends go to: spiking between happy and panicked!
What is the overall tone of this text?
https://goo.gl/wrxGWq and enter some text
Under the hood, what’s going on?
Gather the data from Reddit via the PushShift API, transform and improving the data, and vectorizing. Reduced the feature set down to ~10k words. Didn’t use stemming or lemmatization. Removed some stop words but not all. Using n-grams, you can be more certain that removing stop words isn’t going to change the meaning of a phrase.
Training score ia ~89% or so.
What’s new with Metro Map Maker: January 2019
Metro Map Maker was inspired by art!
Metro Map Maker started great, and it’s been improved! Usage has increased! Hundreds of maps! Thousands of maps! But how can we track how many maps? ON PAPER??? That is not technology of the twenty-first century! Naturally, we must not lose the speed of the user experience! Previously, there were 14 steps to get a map into the gallery! Wow! So many steps! Very diligent! But so much work! THERE WERE REGULAR EXPRESSIONS!?!?!!? Ugh
NOW: Just four steps! Review, determine where, step three, step four! Done! Key to the process: How to automate the generation of the image? Have you seen the data-url for a 1600x1600 canvas? THAT IS BIG! BUT: What if we do a tiny little canvas secretly? In the background? Optimize those images! Down to just 2-5KB. AND YOU NEED THAT if you want it to be dope with 500 images in the gallery! HACK! To change the color depth of the HTML canvas! Thanks StackOverFlow!
So, everyone should make lots of beautiful maps, with beautiful terrain!
Inspired by art. Metro map of your dreams! What if we actually invested in public transit?
stretchMap() on the console!
less cards is a new type of card that makes it simple to share contact details with others without ever having to worry about running out of business cards.
Created a “less card” that has a QR code with your details on it. (Also has an NFC chip). How can you give your contact information without actually losing the business card you have?
On Android: just bring the card within range of the phone and it’ll bring up the contact right in the phone. On iPhone, the camera app natively can scan a QR code and open it.
Synthetic Countenance, a personalized GAN
GANs == “generative adversarial network”
Two networks that try to fake each other out and continually get better.
Used “Progressive Growing of GANs” & the Celebrity training set.
Created a bunch of images of fake people that look very real trained on real celebrities!