The wealth of property

Note: This post was revised at 10:30 AM to remove vacant lots from the data and map. All calculations have been adjusted. (Thanks, Justin!)

Property tax records are a frequently overlooked resource for regional demographic data. In particular, they can provide a window into distribution of wealth, something census data does a poor job of illustrating. In Tyler, as in most places, property tax data is publicly accessible. Unfortunately, it’s not easily available in bulk. I’ve long put off writing a script to extract the data from the Smith County Appraisal District (SCAD) website, despite knowing it would be a rich source of information. So I was particularly pleased when a friend of a friend at Tyler-based TaxNetUSA offered me the complete Smith County property tax rolls, already cleaned and ready to be analyzed. Thanks to this new, rich data source I can make things like this:

Tyler home values, 2011

I find this map fascinating for any number of reasons, but first a few caveats about how it was created. It might seem that this would be as simple as selecting properties labeled residential and putting them on a map. In fact, there is no single way of identifying a property as being residential. In order to estimate what constitutes a “home” I used the following process:

  • Connect each property tax record to the tax parcel documenting its shape (the parcel shapes are what is actually mapped).
  • Connect each tax parcel to the zoning area it is in.
  • Filter out all tax parcels not in a residential zone.
  • Filter out all parcels with zero “improvement value” to remove vacant lots.
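The steps above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the map: the field names (`parcel_id`, `improvement_value`), the zone codes, the "residential zones start with R" rule and the sample records are all hypothetical, and the parcel-to-zone lookup is assumed to come from a spatial join done in GIS beforehand.

```python
# Hypothetical records; the real tax roll uses different field names.
tax_records = [
    {"parcel_id": "A1", "owner": "SMITH JOHN", "improvement_value": 82000},
    {"parcel_id": "B2", "owner": "JONES MARY", "improvement_value": 0},
    {"parcel_id": "C3", "owner": "ACME HOLDINGS LLC", "improvement_value": 410000},
    {"parcel_id": "D4", "owner": "DOE JANE", "improvement_value": 120000},
]

# Parcel -> zoning code, assumed to come from a spatial join done elsewhere.
parcel_zones = {"A1": "R-1A", "B2": "R-1A", "C3": "C-2"}

# Assumption for this sketch: residential zone codes start with "R".
RESIDENTIAL_PREFIX = "R"


def candidate_homes(records, zones):
    """Keep records that sit on a mapped, residentially zoned, improved parcel."""
    homes = []
    for rec in records:
        zone = zones.get(rec["parcel_id"])
        if zone is None:
            continue  # no matching parcel shape, so nothing to map
        if not zone.startswith(RESIDENTIAL_PREFIX):
            continue  # not in a residential zone
        if rec["improvement_value"] <= 0:
            continue  # zero improvement value: treat as a vacant lot
        homes.append(rec)
    return homes


homes = candidate_homes(tax_records, parcel_zones)
```

In this toy data only parcel A1 survives all three filters.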

Unfortunately, for reasons that are unclear, this doesn’t come even close to filtering out all businesses, churches, state lands, etc. (Apparently the zoning in Tyler is a total mess.) To remove them I had to go over the data and generate a long list of rules about what not to include, based on the name of the “owner”. So, for instance, I got rid of anything that ended in “LLC”, “PENTECOSTAL” or “FOUNDATION”. After a few hours of this, I had a set of properties that I believe more or less correspond to individually and jointly owned homes. It should not be assumed to be perfect, but any errors should be evenly distributed throughout the dataset.
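A rule list of this kind might look like the sketch below. Only the three endings named above come from the actual rules. The function name, the sample owners and the choice to compare the final word of the name are assumptions made for illustration.

```python
# Endings mentioned in the post; the full rule list was much longer.
EXCLUDED_LAST_WORDS = {"LLC", "PENTECOSTAL", "FOUNDATION"}


def looks_like_home_owner(owner_name):
    """Return False when the last word of the owner name is an excluded term."""
    words = owner_name.upper().split()
    if not words:
        return True  # blank owner: nothing to match, so keep the record
    return words[-1] not in EXCLUDED_LAST_WORDS


# Made-up owner names for illustration.
owners = ["SMITH JOHN & MARY", "PINEY WOODS LLC", "FIRST PENTECOSTAL", "DOE JANE"]
kept = [name for name in owners if looks_like_home_owner(name)]
```

Matching on the whole last word rather than a raw suffix avoids accidentally dropping a surname that merely happens to end in the same letters.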

My first intuition was to shade this map by value increments of $100,000. This had the intriguing property of very clearly illustrating the “one percent” of Occupy Wall Street fame. They have homes valued at more than $500,000 and mostly live around Hollytree Country Club. However, this approach also lumped more than half of all properties (52.5%) into the “less than $100,000” group. When I observed that the median value was only $95,572 it became clear that this approach was obscuring a lot of what was interesting about the underlying data.

Instead of equal intervals, I’ve used quantiles to represent the data, that is, five groups where each group accounts for 20% of the properties. Thus each color corresponds to approximately 4,723 of the 23,616 mapped properties. This has the much more valuable effect of illustrating where there is poverty (the 20% of homes worth less than $45,841) and prosperity (homes worth more than $173,640).
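To make the difference between the two classing schemes concrete, here is a minimal sketch using Python’s standard library (`statistics.quantiles` requires Python 3.8 or later). The ten home values are made up, and the map itself may well have been classed in GIS software rather than like this.

```python
import bisect
import statistics

# Made-up appraised values, for illustration only.
values = [31000, 45000, 52000, 78000, 95000,
          99000, 120000, 160000, 240000, 610000]

# Equal $100,000 intervals: count how many homes land in the lowest class.
under_100k = sum(v < 100000 for v in values)

# Quintiles: four cut points that split the data into five equal-sized groups.
breaks = statistics.quantiles(values, n=5)


def quintile(value):
    """Return the class index (0 = lowest fifth, 4 = highest fifth)."""
    return bisect.bisect_right(breaks, value)


counts = [0] * 5
for v in values:
    counts[quintile(v)] += 1
```

With equal intervals, six of the ten sample homes fall into the lowest class. With quintiles, each class holds two, so the variation among the lower-valued homes stays visible.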

Even more interestingly, this approach seems to adequately distinguish the lower-middle, middle, and upper-middle classes relative to Tyler norms, though one should bear in mind that apartments are not illustrated at all. As low-income groups tend to cluster in multi-family units one should imagine that population being significantly larger than the map illustrates. Despite these limitations, I feel that this approach adequately demonstrates the reality of the economic divisions in the city. (I’m pleased to note that my neighborhood, Charnwood, is, as in all things, a melting pot.)

I didn’t create this map to invoke the specter of class warfare, though frankly I don’t think we can be reminded too frequently that many Americans can’t afford to eat properly. I did create it in order to demonstrate how geographically and racially aggravated these class divisions are. Tyler north of Front Street is poor and Hispanic. (See the race map I made last year.) South of Front Street is wealthy and white. The predominantly African American communities in west and northwest Tyler are marginally better off than the Hispanic areas. However, North Tyler is also growing much more rapidly, thanks to a 55% expansion in Tyler’s Hispanic population over the last ten years. These demographic forces are going to have an unprecedented impact on the city during the next decade.

Hopefully this map will encourage individuals to carefully consider Tyler’s class stratification, especially as it impacts efforts to support minority communities, revitalize downtown and prevent economic stagnation. I’ve only just scratched the surface of this property tax data and I expect to do several more blog posts using it. In the meantime, here are a few more facts about the data presented on this map.

  • Total value of all properties on the map: $2,942,661,243.
  • City property taxes collected on this amount: $6,147,219.
  • Total land area: 63,438 acres.
  • Median year of home construction: 1960.

Thanks for reading.

What I will not be doing next

In January I had an opportunity to turn Hack Tyler into something more than it is today. It was a chance to elevate the effort to a higher profile, pursue larger projects and even gain some modest financial support. I planned extensively for this chance, wrote a careful announcement letter to the city council and solicited advice from many friends and colleagues.

I have decided not to pursue this opportunity.

Though my decisions are beholden to no one (more on this in a moment), I feel I should justify this choice, in part because I believe if I had pursued this opportunity the scope and impact of Hack Tyler projects could have expanded tremendously. In short, I’m walking away from the chance to do more good.

Hack Tyler is, and I intend always will be, a rabbit hole for me. It is an opportunity to experiment and scratch my own itches. When I built Tyler Transit I knew it could be useful to many people, but that was not why I did it. I did it because I wanted it. I was my own user.

Were I to formalize Hack Tyler it would cease to be an outlet for my creative whims and intuitions. I would be making things because they are needed and not because I am passionate about them. This is often a minor distinction, but the cumulative result is that the effort would begin to seem more and more like work until eventually I lost motivation.

I know this because it’s already happened. I’ve forced myself to work on things, not because I was excited about them, but rather because I believed they were what Tyler needed most. This has dulled my interest in them and ultimately caused me to put them aside. This is entirely my own fault. In my enthusiasm for a good idea I began to view myself as a revolutionary instead of a tinkerer. I was wrong and the result is that I ceased to be personally invested in the projects. This isn’t good for me and it would almost certainly be fatal to any organized entity I might endeavor to form around the project.

I want to create great things for Tyler and I hope others are equally eager about the possibilities. However, I won’t turn Hack Tyler into something that I must manage. If I am to be excited about it, it must be for my own reasons—the real ones, not the ones I made up.

I can’t tell you how frequently I will be updating the site in the future, but I can guarantee that when I do it will be because I’m excited to share something I’m passionate about, not just something I believe is objectively important.

As an example, today I’d like to share with you a new, albeit minor, project:

2010 Census: Age diversity in Smith County

This is a map I made of Smith County age demographics using census data. It is also an experiment in designing a dot-density map to be color-blind friendly.

In addition, this map was a response to interest from Glory Development Corporation, a Tyler non-profit dedicated to building and rehabilitating affordable housing. This map was built to help them identify areas where there are concentrations of elderly residents. (I also made a separate, less busy version more specific to their needs.) They have big dreams and good ideas and I’m helping them because their passions overlap with mine. That, I believe, is where I am at my best and it is where I shall engage my efforts from this point on.

2010 Census: Racial diversity in Smith County (map!)

Note: It has been nearly six weeks since I last wrote on Hack Tyler. I expected some of this for the reasons outlined in my last post. However, the delay was extended by a two-week period during which I thought I might not be moving to Tyler after all. This has turned out not to be the case. The exact dates are still undetermined, but I will be moving in the fall.


Since starting Hack Tyler I’ve wanted to collaborate with locals who know the place better than I. For this post I invited Mike Rogers, a native of Tyler and recent graduate of the University of Richmond, to publish in tandem with me. Read his thoughtful reflections on race in Tyler at his blog, Highways and Hallowed Halls.

In the last decade the population of Tyler grew 15.8% to 96,900, not quite keeping pace with the growth of Texas or Smith County, both of which topped 20%. Over the same time period Tyler’s Hispanic population grew 55% to 20,511—the city’s most significant demographic shift of the decade. These and numerous other insights can be gleaned from the Summary File 1 (SF1) census release, which was made available for Texas on Thursday.

The SF1 is what is most commonly thought of as the “big” census release. It contains very granular population counts summarized by race, family status, age, sex, housing status and a variety of other subjects. This is the data that is commonly used by newspapers, city planners, and demographers to make informative maps, plan services, and project population trends, respectively. I’ve spent much of the last six months analyzing census data for my work at the Chicago Tribune, which last week culminated in the release of detailed maps of same-sex relationships and children less than five years old for the Chicagoland area.

Over the last few evenings I’ve taken advantage of my access to the embargoed census data to use these same techniques to prepare a race map for Tyler and wider Smith County. Many thanks to my fellow news applications hackers for allowing me to recycle our source code for generating map tiles and presenting them online. Click the screenshot to view the map. (Then come back and keep reading!)

2010 Census: Racial diversity in Smith County

Tyler, like most American cities, is visibly segregated along racial lines. Blacks and Hispanics occupy the areas north and west of downtown, though those two groups are themselves more integrated than I would have expected. (Chicago’s extreme segregation has hyper-sensitized me to trends such as these.) The eastern and southern parts of Tyler are predominantly white, though some areas are more racially integrated.

A few things to look for on the map:

  • Dense clusters of mixed race frequently indicate group quarters, such as the Smith County jail, or student housing on the UT Tyler campus. Others mark apartment buildings and developments of townhouses.
  • The racial trends continue beyond the city, with most Hispanics living to the north of the city and most whites living to the south.
  • Whites are both more spread out and more populous in the rural areas of Smith County, accounting for over 62% of the county’s total population (Tyler included).
  • White residents particularly cluster in the lakefront communities around Lake Tyler and Saline Bay.

Though it’s less visible on the map, it’s also worth noting that the Asian population in Tyler spiked by over 125% to 1,807. Though this represents only 1.9% of Tyler’s 2010 population, it outpaces an already dramatic 71% surge in the total Asian population of Texas.

Want more census data? Be sure to check out the Texas Tribune’s excellent statewide coverage. If you want even more detail, a good place to start is census.ire.org, a public project of Investigative Reporters and Editors created to make working with census data easier. Follow this link to jump straight to Tyler and a subset of tables that informed the map and this post.

If there is a particular aspect of Tyler’s demography you’re interested in, please leave a comment. There are many more maps to be made.

Hacking Tyler Transit

Why bus schedules? In my first post I named them at the top of my list of datasets I would like to build on. I also mentioned that I intended to avoid buying a car once I moved, a statement that provoked significant eye-rolling. I’ve been told that no one rides the bus in Tyler or that only poor people do. A fellow hacker who grew up in Tyler told me he didn’t even know they had a bus system. This isn’t really a surprise: Tyler has low population density (1,982 people per square mile, according to Wolfram Alpha) and a food desert in its urban core. I was stunned to discover that a transit system even existed. So why do I think it’s a good idea to digitize the bus schedule? Five reasons:

  1. I need it. It’s not just that I don’t want to drive. It’s that I suck at driving. Having access to public transit is an immediately useful thing for me.
  2. Tyler has several colleges, but none of them even mention the bus system on their websites. If building this app means one student takes the bus instead of driving then it will be a success.
  3. It’s easy. (Mostly, more on this below.)
  4. It’s an excellent pilot project. The data is available (albeit in a terrible format) and the shape of the application I will build is relatively straightforward.
  5. Financial freedom, green living, world peace, etc.

The first thing I needed in order to build this app was to get data for routes, schedules and stop locations. The Tyler Transit agency publishes a route map as a PDF, though it only includes a very small number of stops. They publish schedule data for weekdays and Saturdays as PDFs. These PDFs only include estimated arrival times for five stops per route, fewer than ten percent of the total number of stops. Stop location data isn’t available anywhere online, so I emailed Tyler Transit and asked for a complete list. I requested an Excel document; they sent me a PDF of a scan of a printout of a web application.

I don’t raise these data quality issues as an affront to Tyler Transit. Through my own experiences and those of my many friends in the open government community I’ve learned that this is the state of public data in much of the US. I want to help change that, but right now I’m not trying to open governments, I’m just trying to build a transit app, so I did what a pragmatic geek has to do sometimes:

I keyed them.

Lacking an obvious way to extract the data I needed from a scanned PDF I took two hours and re-keyed the spreadsheet. This also gave me the opportunity to correct numerous typos in street names that would have foiled any geocoder. The results:

Short version: Tyler has 236 bus stops serving virtually all significant public and private institutions, including both large colleges: UT Tyler and Tyler Junior College.

Using the route map, the street centerline GIS data available from Smith County, QGIS and a lot of patience I was able to construct what is possibly the only digital map of Tyler’s bus routes. I then geocoded the above bus stops list and put those over the top, yielding:

The black outline is the Tyler city limits, the thin gray lines are streets, and the thick colored lines are the bus routes. The bus icons are individual stops.

Fun fact: A simple buffer computation on the stops will tell you that over 70% of all streets in the city of Tyler are within a half mile of a bus stop. (That’s less than the distance I walk to and from the L every day.)
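For readers without GIS software, the idea behind that buffer computation can be approximated in plain Python. This is a rough sketch and not how the figure above was produced: it assumes planar coordinates in feet, measures the share of street length (rather than the count of streets) near a stop, and stands in for a true buffer-and-intersect by sampling points along each street segment. The coordinates are made up.

```python
import math

HALF_MILE_FT = 2640.0


def point_near_stop(x, y, stops, radius):
    """True if (x, y) lies within `radius` of at least one stop."""
    return any(math.hypot(x - sx, y - sy) <= radius for sx, sy in stops)


def share_of_street_length_near_stops(segments, stops,
                                      radius=HALF_MILE_FT, step=50.0):
    """Approximate the fraction of street length within `radius` of a stop.

    Each segment is cut into pieces no longer than `step`; a piece counts as
    "near" when its midpoint is within `radius` of any stop.
    """
    near = 0.0
    total = 0.0
    for (x1, y1), (x2, y2) in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        pieces = max(1, math.ceil(length / step))
        piece_len = length / pieces
        for i in range(pieces):
            t = (i + 0.5) / pieces  # position of this piece's midpoint
            x = x1 + t * (x2 - x1)
            y = y1 + t * (y2 - y1)
            total += piece_len
            if point_near_stop(x, y, stops, radius):
                near += piece_len
    return near / total if total else 0.0


# Made-up example: one stop, a one-mile street that starts at the stop,
# and a second one-mile street far from any stop.
stops = [(0.0, 0.0)]
segments = [((0.0, 0.0), (5280.0, 0.0)),
            ((10000.0, 10000.0), (10000.0, 15280.0))]
share = share_of_street_length_near_stops(segments, stops)
```

Here half of the first street and none of the second are within a half mile of the stop, so `share` comes out at roughly one quarter.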

This is good progress, but it’s far from perfect. The geocodes for the bus stops are not their actual location, but rather that of the next intersection following the stop. Worse, many of them didn’t geocode at all, forcing me into an arduous process of trying to manually locate them using Google Maps and Google Street View. Even then I wasn’t able to determine even an approximate location for some of the stops.

I have long-term plans for dealing with this and the other data quality issues. Better stop locations can be crowd-sourced by users. The arrival times present a more audacious challenge as I have to compute estimated times for all the stops which don’t have times in the official timetable. Fortunately, the street centerline data provides me with both distance and speed limit, so I should be able to make sound estimates and fine-tune those with user feedback.
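One way such estimates could work is sketched below. It is an assumption about the approach, not finished code from the project: take two consecutive stops that do have official times, compute the travel time at the speed limit for each street leg between them, and spread the official time difference across the untimed stops in proportion. The function name and the sample legs are hypothetical.

```python
def estimate_arrivals(start_min, end_min, legs):
    """Estimate arrival times (minutes after midnight) between two timed stops.

    `legs` is a list of (distance_miles, speed_limit_mph) tuples, one per leg
    between consecutive stops, running from the first timed stop to the next.
    The official time difference is shared out in proportion to each leg's
    travel time at the speed limit.
    """
    travel_hours = [distance / speed for distance, speed in legs]
    total = sum(travel_hours)
    if total <= 0:
        raise ValueError("legs must cover a positive distance")
    times = [float(start_min)]
    elapsed = 0.0
    for hours in travel_hours:
        elapsed += hours
        times.append(start_min + (end_min - start_min) * elapsed / total)
    return times


# Made-up example: timed stops at 8:00 (480) and 8:12 (492) with two
# untimed stops between them.
legs = [(0.5, 30), (1.0, 30), (0.5, 30)]
arrivals = estimate_arrivals(480, 492, legs)
```

Because the result is scaled to the official timetable, time lost to traffic and boarding is absorbed proportionally rather than ignored. In this example the two untimed stops land at about 8:03 and 8:09.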

Though much of it was painfully manual, most of the required data preparation is done at this point and I can move on to prototyping the application. Interested coders can follow my progress at the hacktyler-transit repository on GitHub. Everyone else: speak your mind.