Coding for a New House

I don’t want to read about it, just take me to your super cool interactive map.

We are moving to the greater New York City area this summer where Tim joins the leadership team of Colgate-Palmolive. As tempting as it is to spend all our time on Trulia, Estately or Zillow looking at specific houses, we knew that our focus was best spent on understanding the different areas and the trade-offs they presented. I’m an analyst at heart, and always try to do the analysis at the right level of detail. At this stage, this means a map that incorporates (in order) schools, commute times, and lifestyle options. As an advisor to spatial.ai, Tim’s been inspired to create insightful mashups. Maps are pretty much the most excellent example of useful data where one can quickly do analysis without any voicetrack. The right map can serve as a common medium for discussion with friends and realtors, and for our own understanding as we try to home in on the right area. With a good contextualized spatial understanding, we can be ready to make the quick decisions that house-hunting presents.

This is why a large number of sites display helpful data geared towards house-hunters. As we started looking at different map-based real estate search engines online, we found different merits to each one, but no site gave us the commute, school information and lifestyle options we care about in one interface. Estately was the most promising. The product side was clearly driven by developers, with clean URL lookups and clever metrics like walkability. Trulia is the most full-featured, with some really cool features like price heatmaps that would be useful if they didn’t have so many blank regions. I enjoy Trulia the most, but it doesn’t have the latest listings.

Trulia Heat Map

Zillow has an awesome API but legally can’t provide anything that can be called "bulk data". Redfin’s killer feature is the ability to search by school district. This is pretty much critical, since the school district often doesn’t match the town name, and we started falling in love with houses online that we had to give up once we found out they weren’t in a school district we were ok with.

Schools

In Alexandria, we love our house, elementary school, church and community. In order to find the best school system possible, we relied heavily on the rankings assigned by njmonthly.com. Their ranking was a composite of school environment, student performance and student outcomes. These scores were based entirely on data reported by the schools to the state Department of Education (DOE) and published in the School Performance Reports section of the DOE website. You can read more about their methodology at njmonthly.com. We also looked at Great Schools to crosscheck the list. Tim used Python, the Google Geocoding API and Google Sheets to get geocoordinates for each school. He was then able to pull these into Google Maps builder and assign a color corresponding to each school’s rank. While there is a lot more work in the future to better understand the potential at each school, the map below was very helpful for us.

Commute Time

Ok, this is the fun part where Tim gets to use his ninja programming skillz. Tim is going to be traveling a lot, but when he is home he will often be in Piscataway, NJ and Manhattan. Nothing online would show the average, maximum or minimum commute times for multiple locations. Additionally, we wanted to combine different traffic patterns and the optimal route found by comparing public transit and driving. In order to build this, Tim built a Python script that used the Google Directions API and the associated Python library to provide transportation times. He then used matplotlib and basemap to put a grid across the region of interest and then used the contour features to generate contour lines for regions that were 20, 30, 40, 50, 60, and 70 minutes away. This produced lots of plots that helped get a feel of the major transportation routes and how traffic varied by time of day.

Of course, Tim did excursions over time of day and built maps that looked at optimistic- and worst-case scenarios. In the end, it worked best to make each excursion a map layer and to bring in different data sets as we had questions. The most helpful map presented the contour lines made from averaging the best commute from each grid point (in this case a 15 x 15 grid):
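The averaging step described above can be sketched in a few lines of Python. This is not the script linked under Resources, just a minimal illustration under stated assumptions: `make_grid`, `best_commute` and `average_best_commute` are hypothetical names, the destination coordinates are rough approximations, and `straight_line_minutes` is a crude stand-in for a real Directions API request (it only converts straight-line distance into minutes at a made-up speed per mode).

```python
import math
from statistics import mean

# Rough, approximate coordinates for the two destinations (assumptions).
DESTINATIONS = [(40.55, -74.46),   # Piscataway, NJ
                (40.75, -73.99)]   # Manhattan

def make_grid(south, west, north, east, n=15):
    """Return n*n (lat, lon) points evenly spaced over a bounding box."""
    lats = [south + (north - south) * r / (n - 1) for r in range(n)]
    lons = [west + (east - west) * c / (n - 1) for c in range(n)]
    return [(lat, lon) for lat in lats for lon in lons]

def straight_line_minutes(origin, destination, mode):
    """Stand-in for a routing service: straight-line km at a fixed speed."""
    dlat = origin[0] - destination[0]
    dlon = (origin[1] - destination[1]) * math.cos(math.radians(origin[0]))
    km = 111.0 * math.hypot(dlat, dlon)
    speed_kmh = {"driving": 50.0, "transit": 35.0}[mode]  # made-up speeds
    return 60.0 * km / speed_kmh

def best_commute(origin, destination, travel_minutes,
                 modes=("driving", "transit")):
    """Shortest time across modes; None when no mode has a route."""
    times = [travel_minutes(origin, destination, mode) for mode in modes]
    times = [t for t in times if t is not None]
    return min(times) if times else None

def average_best_commute(grid, destinations, travel_minutes):
    """Map each grid point to the mean of its best commute per destination."""
    surface = {}
    for point in grid:
        bests = [best_commute(point, d, travel_minutes) for d in destinations]
        surface[point] = None if None in bests else mean(bests)
    return surface
```

Calling `average_best_commute(make_grid(40.2, -75.0, 41.1, -73.8), DESTINATIONS, straight_line_minutes)` gives one value per grid point; reshaped into a 15 x 15 array, that surface is what a contour routine would draw at the 20 to 70 minute levels. Swapping in a function that queries real routing data is the only change needed to move beyond the toy distance model.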

How much does commute vary?

The sparkline in each row below shows the commute time for 12 times between 4:30am and 10am each morning. Transit options weren’t possible to Colgate’s technology center, but they generally were to NYC. Commute times below are in minutes. I was expecting to see more variance in the commute time. This is either an error in my code or Google really isn’t forecasting commute times based on historical traffic.


| Location | Colgate, driving (mean commute) | NYC, driving (mean commute) | NYC, transit (mean commute) |
|---|---|---|---|
| Westfield, Union | 31 | 46 | 93 |
| livingston | 40 | 48 | 118 |
| Chatham, Morris | 40 | 45 | 122 |
| West Windsor-Plainsboro South, Mercer | 34 | 70 | 132 |
| Holmdel, Monmouth | 34 | 59 | |
| West Windsor-Plainsboro North, Mercer | 34 | 71 | 195 |
| Livingston, Essex | 44 | 43 | 106 |
| Montgomery, Somerset | 34 | 78 | |
| Haddonfield Memorial, Camden | 64 | 98 | 223 |
| Princeton, Mercer | 33 | 71 | 170 |
| short hills | 36 | 39 | 137 |
| New Providence, Union | 35 | 46 | 138 |
| Ridge (Basking Ridge), Somerset | 25 | 53 | 131 |
| westfield | 30 | 46 | 85 |
| Watchung Hills Regional (Warren), Somerset | 26 | 48 | |
| Millburn, Essex | 37 | 40 | |
| Glen Rock, Bergen | 55 | 35 | 105 |
| Kinnelon, Morris | 52 | 50 | 128 |


Lifestyle

Our social structure revolves around our church, then around fitness (CrossFit and rock climbing gyms) and other town-centered options (like shopping at Whole Foods, or a charming downtown). We wanted to live as close to the city as possible, while still being able to find a nice home with the right school. The most helpful way to incorporate this information was to build several lists and then use the Google Geocoding API to get the geocoordinates. From here, it was simple to export to CSV and upload into the mashup. This produced this insanely cool composite map.

Results: Potential Locations

Princeton, Montgomery, West Windsor

We love the downtown, schools and academic atmosphere of Princeton. It is close to cool companies like SRI and major research centers. It also has nice neighborhoods and is very walkable. It has a train to NYC and has good church options. It is much farther from the city than we want to be and the house prices are much higher in Princeton proper when compared with the local area.

Westfield, Millburn, Short Hills, Livingston, Montclair

There was another cluster much closer to the city. We also like the option of attending Redeemer Church of Montclair. However, we hate to give up the university town and high tech feel of Princeton.

Summary

In all, we now look forward to visiting in person and getting a feel for these neighborhoods. I feel like we have a good map that we can update as we get to know the area better. Hats off to Google for making so much accessible through APIs and for making such nice interfaces to view everything with. Open standards just plain rock.

We put this post together to collect our thoughts, share our code and methodology, but also to help the dialogue with our friends. If you have any thoughts on the above, please email us at tim_and_chrissy@www.theboohers.org.

Resources

Please feel free to use, modify and enjoy the code we wrote for this. Feel free to see and edit our spreadsheet

Links

Code

Code to build commute maps

Code to build commute table

DIY Roulette Wheel and Probability for Kids

My daughter had to make a “probability game” for her fifth grade math class. She has been learning JavaScript and digital design, so we worked together to design and build a roulette wheel for her class.

First, she drew out a series of sketches and we talked it over. She wanted a 0.5 inch thick, 2 foot diameter plywood wheel, but after some quick calculations we decided on a $1/4$ inch thick wheel with an 18″ diameter. I had to talk her into adding pegs and a skateboard ball bearing, which I bought from Amazon for \$2. The inner diameter is $1/4$ inch so I also bought a package of dowels for \$5 to make the pegs. I also bought a 1/2 sheet of plywood (that I used about 1/3 of) and some hardware from Home Depot.

She wanted 10 sections with combinations of the following outcomes: Small, Large and Tiny prizes as well as two event outcomes: Spin Again and Lose a Spin. Each student would have at most three turns. We had the following frequencies (out of 10):

| Outcome | Frequency |
|---|---|
| Small | 3 |
| Large | 2 |
| Tiny | 1 |
| Lose a Spin | 3 |
| Spin Again | 1 |

This led to a ~~fun~~ (frustrating) discussion of Monte Carlo code, conditional probabilities and cumulative probabilities. Good job teacher! We got to answer questions like:

  • What is the probability of getting a large prize in a game (three spins)?
  • What is the probability you get no prize?
  • What is the expected number of spins?

She really threw the math for a loop with the Spin Again and Lose a Spin options. We had to talk about systems with a random number of trials. My favorite part was exposing her to true randomness. She was convinced the wheel was biased because she got three larges in a row. I had to teach her that true random behavior was more unbalanced than her intuition might lead her to believe.

In order to understand a problem like this, it is all about the state space. There are four possible outcomes: three different prizes or no prize. To explain the effect the spin skips have on the outcomes, I had to make the diagram below. Each column represents one of the three spins, each circle represents a terminal outcome and each rectangle represents a result of a spin.

Drawing1

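From this, we can compute the probabilities for each of the 17 outcomes. In the table, L is Lose a Spin, S is Spin Again, and $P_L$, $P_S$ and $P_T$ are the large, small and tiny prizes:

| # | Spin 1 | Spin 2 | Spin 3 | Prob |
|---|---|---|---|---|
| 1 | $P_L$ | | | 0.200 |
| 2 | $P_S$ | | | 0.300 |
| 3 | $P_T$ | | | 0.100 |
| 4 | L | $P_L$ | | 0.060 |
| 5 | L | $P_S$ | | 0.090 |
| 6 | L | $P_T$ | | 0.030 |
| 7 | L | L | | 0.090 |
| 8 | L | S | | 0.030 |
| 9 | S | $P_L$ | | 0.020 |
| 10 | S | $P_S$ | | 0.030 |
| 11 | S | $P_T$ | | 0.010 |
| 12 | S | L | | 0.030 |
| 13 | S | S | $P_L$ | 0.002 |
| 14 | S | S | $P_S$ | 0.003 |
| 15 | S | S | $P_T$ | 0.001 |
| 16 | S | S | L | 0.003 |
| 17 | S | S | S | 0.001 |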

I would love to find a more elegant solution, but the strange movements of the state-space left me with little structure I could exploit.

And we can add these to get the event probabilities and (her homework) to generate the expected values of prizes she needs to bring when 20 students are going to play the game:

| Outcome | Probability | Expected Value | Ceiling |
|---|---|---|---|
| $P_L$ | 28% | 5.64 | 6 |
| $P_S$ | 42% | 8.46 | 9 |
| $P_T$ | 14% | 2.82 | 3 |
| NP | 15% | 3.08 | 4 |

We can also get the probabilities for the number of spins:

| Count | Probability |
|---|---|
| One spin | 0.600 |
| Two | 0.390 |
| Three | 0.010 |
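The outcome probabilities and spin counts can also be checked by walking the state space recursively with exact fractions. This is a minimal sketch, not the original working: it assumes the rules implied by the 17 outcomes, namely that a prize ends the game, Spin Again uses up one of the three spins, and Lose a Spin uses up two. The names `enumerate_game` and `summarize` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Per-spin probabilities from the frequency table (out of 10 sections).
SPIN = {
    "large": Fraction(2, 10),
    "small": Fraction(3, 10),
    "tiny": Fraction(1, 10),
    "lose": Fraction(3, 10),
    "again": Fraction(1, 10),
}
PRIZES = {"large", "small", "tiny"}

def enumerate_game(spins_left=3, spins_taken=0, prob=Fraction(1)):
    """Yield (result, spins_taken, probability) for every terminal path."""
    if spins_left <= 0:
        yield "none", spins_taken, prob
        return
    for outcome, p in SPIN.items():
        if outcome in PRIZES:
            yield outcome, spins_taken + 1, prob * p
        else:
            # Assumption: "lose" costs this spin plus one more, "again" costs one.
            cost = 2 if outcome == "lose" else 1
            yield from enumerate_game(spins_left - cost,
                                      spins_taken + 1, prob * p)

def summarize():
    """Return (probability by result, probability by number of spins)."""
    by_result = defaultdict(Fraction)
    by_spins = defaultdict(Fraction)
    for result, spins, prob in enumerate_game():
        by_result[result] += prob
        by_spins[spins] += prob
    return dict(by_result), dict(by_spins)
```

Under these assumptions the recursion visits exactly 17 terminal paths, and multiplying each result probability by 20 gives the number of prizes to bring for a class of 20 students.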

Simulation

When the probability gets hard . . . simulate, and let the law of large numbers work this out.
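The original simulation code is not reproduced in this text, so here is a minimal Monte Carlo sketch of the same idea. It assumes the rules implied by the outcome table (a prize ends the game, Spin Again uses one of the three spins, Lose a Spin uses two); `play_game` and `simulate` are hypothetical names.

```python
import random
from collections import Counter

# Ten wheel sections, using the frequencies from the outcome table.
WHEEL = ["small"] * 3 + ["large"] * 2 + ["tiny"] + ["lose"] * 3 + ["again"]
PRIZES = {"small", "large", "tiny"}

def play_game(rng, max_spins=3):
    """Play one game; return (result, spins_taken). Result is a prize or "none"."""
    spins_left = max_spins
    spins_taken = 0
    while spins_left > 0:
        outcome = rng.choice(WHEEL)
        spins_left -= 1
        spins_taken += 1
        if outcome in PRIZES:
            return outcome, spins_taken
        if outcome == "lose":
            spins_left -= 1  # assumption: Lose a Spin forfeits one more spin
        # "again" just continues with whatever spins remain
    return "none", spins_taken

def simulate(games=100_000, seed=0):
    """Return (relative frequency of each result, mean spins per game)."""
    rng = random.Random(seed)
    results = Counter()
    total_spins = 0
    for _ in range(games):
        result, spins = play_game(rng)
        results[result] += 1
        total_spins += spins
    freqs = {name: count / games for name, count in results.items()}
    return freqs, total_spins / games
```

Running `simulate()` with more games tightens the estimates, which is the law of large numbers at work; multiplying each frequency by 20 gives the expected prize counts for the class.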

This demonstrated the probability of getting a prize was:

| Outcome | Probability | Expected Value | Ceiling |
|---|---|---|---|
| $P_L$ | 28% | 5.64 | 6 |
| $P_S$ | 42% | 8.46 | 9 |
| $P_T$ | 14% | 2.82 | 3 |
| NP | 15% | 3.08 | 4 |

Design

So I took her designs and helped her write the following code to draw the wheel in Adobe Illustrator. This didn’t take long to write, because I had written similar code to make a series of clocks for my 5 year old to teach him how to tell time. The code was important to auto-generate the designs, because we must have tried 10 different iterations of the game.

Which produced this Adobe Illustrator file that I could laser-cut:

spinner2

From here, I designed a basic structure in Fusion 360. I cut the base and frame from $1/2$ inch birch plywood with a $1/4$ inch downcut endmill on a ShopBot.

A render:

the wheel v19

And a design:

2016-06-09 (1)

If you want the fusion file, request in comments and I’ll post.

Please let me know if you have any questions and I’ll share my design. Next up? We are going to print a new wheel to decide who washes the dishes! Kids get double the frequency.

Kids Lego table: Case study in Automation for Design

Motivation

I had to upgrade the Lego table I made when my kids were much smaller. It needed to be higher and include storage options. Since I’m short on time, I used several existing automation tools to both teach my daughter the power of programming and explore our decision space. The goals were to stay low-cost and make the table as functional as possible in the shortest time possible.

Lauren and I had fun drawing the new design in SketchUp. I then went to the Arlington TechShop and built the frame easily enough from a set of 2x4s. In order to be low-cost and quick, we decided to use the IKEA TROFAST storage bins. We were inspired by lots of designs online such as this one:

lego-table-example

However, the table I designed was much bigger and built with simple right angles and a nice dado angle bracket to hold the legs on.

table_with_bracket

The hard part was figuring out the right arrangement to place the bins underneath the table. Since my background is in optimization, I was thinking about setting up a two-dimensional knapsack problem but decided to do brute-force enumeration since the state-space was really small. I built two scripts: one in Python to enumerate the state space and sort the results, and one in JavaScript, or ExtendScript, to automate Adobe Illustrator to give me a good way to visually consider the options. (ExtendScript just looks like an old, ES3, version of JavaScript to me.)

So what are the options?

There are two TROFAST bins I found online. One costs \$3 and the other \$2. Sweet. You can see their dimensions below.

options

They both are the same height, so we just need to determine how to make the row work. We could arrange each TROFAST bin on the short or long dimension so we have 4 different options for the two bins:

| Bin | Short side (cm) | Long side (cm) |
|---|---|---|
| Orange | 20 | 30 |
| Green | 30 | 42 |

First, Lauren made a set of scale drawings of the designs she liked, which allowed us to think about options. Her top left drawing ended up being our final design.

lauren designs

I liked her designs, but they got me thinking about what all the feasible designs would look like, and we decided to tackle this since she is learning JavaScript.

Automation

If we ignore the depth and height, we then have only three options $[20,30,42]$ with the null option of $0$ length. With these lengths we can find the maximum number of bins if the max length is $112.4 \text{cm}$. Projects like this always have me wondering how to best combine automation with intuition. I’m skeptical of technology and aware that it can be a distraction and inhibit intuition. It would have been fun to cut out the options at scale or just to make sketches and we ended up doing those as well. Because I’m a recreational programmer, it was fairly straightforward to enumerate and explore feasible options and fun to show my daughter some programming concepts.

$$ \left\lfloor
\frac{112.4}{20}
\right\rfloor = 5 $$

So there are $4^5$ or $1,024$ total options from a Cartesian product. A brute-force enumeration grows exponentially, as $O(k^n)$ for $k$ options in $n$ slots, but with only $1,024$ cases that is manageable, and we have $\text{itertools.product}$ in Python, so we can get all our possible options easily in one command:

itertools.product([0,20,30,42], repeat=5)

and we can restrict results to feasible combinations and even solutions that don’t waste more than 15 cm. To glue Python and Illustrator together, I use JSON to store the data which I can then open in Illustrator Extendscript and print out the feasible results.
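The filtering step can be sketched as follows. This is not the script from the Code section, only an illustration under stated assumptions: the function name `feasible_layouts` and the JSON field names are hypothetical, layouts that differ only in where the zero-length slots fall are collapsed into one (my choice, not necessarily the original's), and the divider thickness between bins is ignored.

```python
import itertools
import json

OPTIONS = [0, 20, 30, 42]   # bin widths in cm; 0 means "no bin in this slot"
MAX_LENGTH = 112.4          # available width under the table, in cm
MAX_WASTE = 15              # discard layouts leaving more than this unused
SLOTS = 5                   # floor(112.4 / 20)

def feasible_layouts(options=OPTIONS, slots=SLOTS,
                     max_length=MAX_LENGTH, max_waste=MAX_WASTE):
    """Enumerate bin rows that fit and waste at most max_waste cm."""
    seen = set()
    layouts = []
    for combo in itertools.product(options, repeat=slots):
        widths = tuple(w for w in combo if w)      # drop the null slots
        total = sum(widths)
        waste = max_length - total
        if waste < 0 or waste > max_waste or widths in seen:
            continue
        seen.add(widths)
        layouts.append({"widths": list(widths), "total": total,
                        "waste": round(waste, 1)})
    # Least wasted space first, so the best candidates are drawn first.
    return sorted(layouts, key=lambda layout: layout["waste"])

def layouts_as_json():
    """Serialize the feasible layouts so a drawing script can read them."""
    return json.dumps(feasible_layouts(), indent=2)
```

Writing the output of `layouts_as_json()` to a file is enough to hand the results to a separate drawing script, which only needs to parse the JSON and draw one row of rectangles per layout.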

results

Later, I added some colors for clarity and picked the two options I liked:

options

These both minimized the style of bins, were symmetric and used the space well. I took these designs forward into the final design. Now to build it.

final_design

Real Math

But wait, rote enumeration? Sorry, yes, I didn’t have much time when we did this, but there are much better ways to do this. Here are two approaches:

Generating Functions

If your options are 20, 30, and 40, then what you do is compute the coefficients of the infinite series

$$(1 + x^{20} + x^{40} + x^{60} + \cdots)(1 + x^{30} + x^{60} + x^{90} + \cdots)(1 + x^{40} + x^{80} + x^{120} + \cdots)$$

I always find it amazing that polynomials happen to have the right structure for the kind of enumeration we want to do: the powers of x keep track of our length requirement, and the coefficients count the number of ways to get a given length. When we multiply out the product above we get

$$1 + x^{20} + x^{30} + 2 x^{40} + x^{50} + 3 x^{60} + 2 x^{70} + 4 x^{80} + 3 x^{90} + 5 x^{100} + \cdots$$

This polynomial lays out the answers we want “on a clothesline”. E.g., the last term tells us there are 5 configurations with length exactly 100. If we add up the coefficients above (or just plug in “x = 1”) we have 23 configurations with length less than 110.

If you also want to know what the configurations are, then you can put in labels: say $v$, $t$, and $f$ for twenty, thirty, and forty, respectively. A compact way to write $1 + x^{20} + x^{40} + x^{60} + \cdots$ is $1/(1 - x^{20})$. The labelled version is $1/(1 - v x^{20})$. Okay, so now we compute

$$1/((1 - v x^{20})(1 - t x^{30})(1 - f x^{40}))$$

truncating after the $x^{100}$ term. In Mathematica the command to do this is

Normal@Series[1/((1 - v x^20) (1 - t x^30) (1 - f x^40)), {x, 0, 100}]

with the result

$$1 + v x^{20} + t x^{30} + (f + v^2) x^{40} + t v x^{50} + (t^2 + f v + v^3) x^{60} + (f t + t v^2) x^{70} + (f^2 + t^2 v + f v^2 + v^4) x^{80} + (t^3 + f t v + t v^3) x^{90} + (f t^2 + f^2 v + t^2 v^2 + f v^3 + v^5) x^{100}$$

Not pretty, but when we look at the coefficient of $x^{100}$, for example, we see that the 5 configurations are ftt, ffv, ttvv, fvvv, and vvvvv.
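For readers without Mathematica, the same coefficients can be computed in plain Python. This is a minimal sketch with hypothetical function names: `count_configurations` multiplies in one factor $1/(1 - x^p)$ at a time, truncated at a maximum power, and `list_configurations` plays the role of the labelled series by spelling out each configuration with the labels $v$, $t$ and $f$.

```python
def count_configurations(parts, max_length):
    """Coefficients of the product of 1/(1 - x^p) over parts, up to x^max_length."""
    coeffs = [0] * (max_length + 1)
    coeffs[0] = 1
    for p in parts:
        # Multiplying by 1/(1 - x^p) adds the coefficient p steps back.
        for n in range(p, max_length + 1):
            coeffs[n] += coeffs[n - p]
    return coeffs

def list_configurations(labelled_parts, target):
    """List label strings (largest part first) whose lengths sum to target."""
    items = sorted(labelled_parts.items(), key=lambda kv: -kv[1])
    found = []

    def extend(start, remaining, prefix):
        if remaining == 0:
            found.append(prefix)
            return
        # Only reuse the current part or move to smaller ones, so each
        # multiset of parts is generated exactly once.
        for idx in range(start, len(items)):
            label, length = items[idx]
            if length <= remaining:
                extend(idx, remaining - length, prefix + label)

    extend(0, target, "")
    return found
```

With `parts = [20, 30, 40]`, the entry `count_configurations(parts, 100)[100]` is the coefficient of $x^{100}$, and `list_configurations({"v": 20, "t": 30, "f": 40}, 100)` spells out the configurations behind it.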

Time to build it

Now it is time to figure out how to build this. I figured out I had to use $1/2$ inch plywood. Since I do woodworking in metric, this is a dimension of 0.472 in or 1.19888 cm.

\$31.95 each: Sande Plywood (Common: 1/2 in. x 4 ft. x 8 ft.; Actual: 0.472 in. x 48 in. x 96 in.)

or at this link

So the dimensions of this are the side thickness $s$ and interior thickness $i$ with shelf width $k$. Each shelf is $k = (20 - 0.5 \times 2) \text{ cm} = 19 \text{ cm}$ wide. All together, we know:

$$w = 2\,s+5\,k+4\,i $$

and the board thickness is $t$ where $t < \min(s, i)$.

which gives us:

| Symbol | Width (cm) |
|---|---|
| s | 1.20 |
| i | 3.75 |
| k | 19.00 |
| w | 112.40 |
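As a check on these values, the interior thickness follows from solving the width equation for $i$ (a worked step, assuming $w = 112.4$, $s = 1.20$ and $k = 19$, all in cm):

$$ i = \frac{w - 2\,s - 5\,k}{4} = \frac{112.4 - 2(1.20) - 5(19)}{4} = \frac{15}{4} = 3.75 \text{ cm} $$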

Code

The code I used is below:

References

Work Life Balance for a Dad and Husband — who loves His Job

In a meeting last week, I had a moment of clarity that put a question directly in front of me that I’ve been dreading to answer: How much of my heart should I put into my work and at what cost to my family and other work interests?

You see, the meeting ended abruptly at 5pm because it was time for that particular office to “lock up”, meaning that they had boundaries and were used to going home to their families. I’m used to meetings at the end of the day being extra-long because there is almost a contest in the Pentagon to see who stays the latest, and therefore works the hardest and, we might assume, cares the most, is the smartest and has the overall highest worth to society. I stretch this a bit, but only slightly. Throughout my working life there has always been a tension to put in more hours, give the most of your heart and life to the system. This seems to be a concrete way to distinguish yourself as a top-tier worker.

Part of this is because the military is a large bureaucracy where everyone is compared to their peers at a local level and the system allows for abstract feedback. But working harder always demands some form of recognition. This is not all bad. I am convinced it is a good thing to want to do good work and devote oneself to making a difference, even if there are high personal costs. Looking at some of my heroes off the top of my head (Jesus, Dietrich Bonhoeffer, Martin Luther King, CS Lewis, Cicero and the Apostle Paul), I don’t see a 9 to 5 life. These individuals probably didn’t coach little league (or their equivalent). All had strange family lives and suffered terribly.

This is the start of the conundrum: how much do our heroes mislead us in who we are supposed to be? I mean, the suffering of Paul and his desire to spread the Gospel were a singular focus, totally out of sync with my desire to optimize my “wellbeing” in several spheres: relational, physical, spiritual, financial and intellectual. Did Paul check the air in his tires, fund his 401(k), apply fertilizer in the spring, always remember to write thank you notes, and read challenging books? Oh, and did he remember that what interests his boss should fascinate and consume him?

No, he lived for the Gospel. Which is what I should be doing. Now, before this devolves into a discussion of life focus and God’s will, I want to bring this back to the main point: How much of my heart goes into my work? I have two main thoughts on this.

First, we should serve our work and put our full heart into it. I don’t jog well. I kind of putz around and get tired. However, I can run fast. When I really put my heart into something, I can hold around a 6 min/mile pace for marathon distances and am willing to really take my effort into a pretty extreme place. The same goes for my work. I can perform, but I have to really focus and really push myself to do something hard. I feel I should be sprinting at work, giving my employer, who happens to be the US citizen, the very best I can. In giving my heart to my occupation and seeking to make a difference, I leave the legacy to my family of hard work and societal contribution. In working hard, I serve my son in ways that working only 8 hours a day might never provide him.

However, I feel that I need to be grounded in the Gospel. My heart must be grounded in the Gospel. My day must start and end with devotions and prayer. My risk tolerance must be calibrated by eternal consequences and supported by my knowledge that my self worth is provided by the Gospel. My hunger, motivation and passion must be centered in the Gospel. My sole metric is the commandment of Christ: am I loving the Lord my God above all else and am I loving my neighbor as myself?

Perhaps there is a tension here, and this tension is where I feel called to be. What it means practically, is that I am here to serve: to serve my family, to serve my country and to serve my God. But! The Gospel tackles fear head on. The Gospel tells me to forget all that and to trade fear for love.

Thoughts greatly appreciated.