2010 Census: Racial diversity in Smith County (map!)

Note: It has been nearly six weeks since I last wrote on Hack Tyler. I expected some of this for the reasons outlined in my last post; however, the delay was extended by a two-week period during which I thought I might not be moving to Tyler after all. That turned out not to be the case. The exact dates are still undetermined, but I will be moving in the fall.

Since starting Hack Tyler I’ve wanted to collaborate with locals who know the place better than I. For this post I invited Mike Rogers, a native of Tyler and recent graduate of the University of Richmond, to publish in tandem with me. Read his thoughtful reflections on race in Tyler at his blog, Highways and Hallowed Halls.

In the last decade the population of Tyler grew 15.8% to 96,900, not quite keeping pace with the growth of Texas or Smith County, both of which topped 20%. Over the same time period Tyler’s Hispanic population grew 55% to 20,511—the city’s most significant demographic shift of the decade. These and numerous other insights can be gleaned from the Summary File 1 (SF1) census release, which was made available for Texas on Thursday.

The SF1 is what is most commonly thought of as the “big” census release. It contains very granular population counts summarized by race, family status, age, sex, housing status and a variety of other subjects. This is the data that is commonly used by newspapers, city planners, and demographers to make informative maps, plan services, and project population trends, respectively. I’ve spent much of the last six months analyzing census data for my work at the Chicago Tribune, which last week culminated in the release of detailed maps of same-sex relationships and children less than five years old for the Chicagoland area.

Over the last few evenings I’ve taken advantage of my access to the embargoed census data to use these same techniques to prepare a race map for Tyler and wider Smith County. Many thanks to my fellow news applications hackers for allowing me to recycle our source code for generating map tiles and presenting them online. Click the screenshot to view the map. (Then come back and keep reading!)

2010 Census: Racial diversity in Smith County


Tyler, like most American cities, is visibly segregated along racial lines. Blacks and Hispanics occupy the areas north and west of downtown, though those two groups are themselves more integrated than I would have expected. (Chicago’s extreme segregation has hyper-sensitized me to trends such as these.) The eastern and southern parts of Tyler are predominantly white, though some areas are more racially integrated.

A few things to look for on the map:

  • Dense clusters of mixed race frequently indicate group quarters, such as the Smith County jail, or student housing on the UT Tyler campus. Others mark apartment buildings and developments of townhouses.
  • The racial trends continue beyond the city, with most Hispanics living to the north of the city and most whites living to the south.
  • Whites are both more spread out and more populous in the rural areas of Smith County, accounting for over 62% of the county’s total population (Tyler included).
  • White residents particularly cluster in the lakefront communities around Lake Tyler and Saline Bay.

Though it’s less visible in the map, it’s also worth noting that the Asian population in Tyler spiked by over 125% to 1,807. Though this represents only 1.9% of Tyler’s 2010 population, it outpaces an already dramatic 71% surge in the total Asian population of Texas.

Want more census data? Be sure to check out the Texas Tribune’s excellent statewide coverage. If you want even more detail, a good place to start is census.ire.org, a public project of Investigative Reporters and Editors created to make working with census data easier. Follow this link to jump straight to Tyler and a subset of tables that informed the map and this post.

If there is a particular aspect of Tyler’s demography you’re interested in, please leave a comment. There are many more maps to be made.

Research and [Barriers to] Development

I’m somewhat reluctant to admit that the pace of Hack Tyler development has slowed and will likely remain that way for a month or two. I spent the last week packing and cleaning. My wife and son have moved and transported the majority of my belongings with them. My things are now waiting for me in a storage unit in Tyler. As a consequence, I have only my netbook to hack on and no desk space to do even that.

I’m relatively used to limited accommodations, so I’m not particularly uncomfortable. However, it does take the edge off my capacity and encourages me to reach for other activities I haven’t found enough time for over the last year. I’ve also contracted some additional work to keep myself busy in the interim. In order not to completely lose momentum on this project, I’ve shifted my focus to research and communication tasks.

I’ve been in touch with Tyler Transit regarding Tyler on Time and learned a great deal of interesting things about their systems. Most notably, the current transit system is in the process of being completely overhauled and the existing bus routes will cease to exist sometime in August. The Transportation Operations Coordinator for the department has offered to provide me with updated shapefiles and timetable data in advance of the switchover, which will allow me to preemptively refactor Tyler On Time for the new routes. This opens up the possibility of Tyler on Time “launching” with the new routes, which seems eminently useful.

Unfortunately, this new data will not include timetables for all stops, but will continue to be “waypointed” as the current data is. This makes it very difficult to offer accurate intermediate stop times. I’ve yet to decide how to handle this, but I’m leaning toward a presentation solution rather than an algorithmic solution. Something like:

The nearest stop with scheduled departure times is 4 stops away, the next bus is scheduled to arrive at that stop in 5 minutes. The previous bus departed that stop 14 minutes ago.

Predicting stop times is likely not possible as Tyler is reputed to have significant traffic congestion problems, which would render estimates based on speed and distance inaccurate. I’m open to suggestions about how else I might handle this.
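A minimal sketch of that presentation approach, in Python rather than the app's client-side JavaScript. The `Stop` class, the function names, and the convention of storing departures as minutes after midnight are all assumptions made for illustration; none of them come from the Tyler On Time code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stop:
    """A stop on a route (hypothetical structure, for illustration only)."""
    name: str
    # Scheduled departures as minutes after midnight. Empty for stops
    # the "waypointed" timetable does not cover.
    departures: List[int] = field(default_factory=list)


def plural(count: int, word: str) -> str:
    """Format a count with a naively pluralized noun."""
    return f"{count} {word}" if count == 1 else f"{count} {word}s"


def nearest_timed_stop(route: List[Stop], index: int) -> Optional[int]:
    """Return the index of the closest stop that has scheduled times."""
    timed = [i for i, stop in enumerate(route) if stop.departures]
    if not timed:
        return None
    # Closest by stop count; ties go to the stop earlier on the route.
    return min(timed, key=lambda i: (abs(i - index), i))


def describe_stop(route: List[Stop], index: int, now: int) -> str:
    """Build the message shown for the stop at `index` at time `now`."""
    nearest = nearest_timed_stop(route, index)
    if nearest is None:
        return "No scheduled departure times are available on this route."
    stop = route[nearest]
    away = abs(nearest - index)
    if away == 0:
        parts = ["This stop has scheduled departure times."]
    else:
        parts = [
            "The nearest stop with scheduled departure times is "
            f"{plural(away, 'stop')} away."
        ]
    upcoming = [t for t in stop.departures if t >= now]
    earlier = [t for t in stop.departures if t < now]
    if upcoming:
        wait = min(upcoming) - now
        parts.append(
            "The next bus is scheduled to arrive at that stop in "
            f"{plural(wait, 'minute')}."
        )
    if earlier:
        ago = now - max(earlier)
        parts.append(
            f"The previous bus departed that stop {plural(ago, 'minute')} ago."
        )
    return " ".join(parts)
```

Nothing here guesses at intermediate times. The rider gets only what the schedule actually states, plus how far away that information applies.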

Learning about the details of Tyler’s changing transit system has also led me to a number of interesting documents related to Tyler’s municipal planning:

These documents present more information than I can possibly digest during the time I have left in Chicago, but I expect studying them to provide me with essential context for my own ideas. Additionally, the “Summary File 1” batch of census data for Texas will be released sometime in the next two months, providing further insight into the place and its people. I’ll be especially excited to write about this data, given how much time I’ve spent working with census data lately.

All in all, I expect I will write much less code this month than I did in June, but I will continue to inform myself and prepare for the things that come next. Best of all, I’ve now got my son’s comprehensive nightly reports:

It’s hot. We’re going to the pool again.

Delivering the beta

Under ordinary circumstances I would have released a first beta of this app weeks ago. I was dissuaded both by the shifting landscape of data as well as by my concern that someone in Tyler might actually try to use it to catch the bus and fail due to its incompleteness. I’m confident now that it sufficiently advertises its failures (lack of Saturday schedules, for example) to prevent this. Thus I present for commentary the first original Hack Tyler app:

Tyler On Time »

The application delivers the following features that I determined to be absolutely necessary in a transit app:

  • Tell me when the bus is coming.
  • Show me where the bus is going to be (maps).
  • Allow me to save my favorite stops.
  • Function acceptably on desktop, tablet, and mobile devices.
  • Be usable (via PhoneGap Build) as a native Android/iPhone app*.
  • Do not require an internet connection.

Of the items on this list, I’m perhaps most excited about having static maps for every stop. I owe their existence almost entirely to the fine folks at Development Seed who created TileMill.

Here is the map for the 1900 N Broadway Ave stop on the Red Line North:

1900 N Broadway

With these maps, I can provide a visual aid to navigation without compromising the app’s ability to run offline. The code for generating the maps can be found in the maps directory of the repository.

There are a number of worthwhile features that have not yet been developed, including a “Stops Near Me” geolocation feature, a crowd-sourcing mechanism for stop landmarks and a dynamic route/stop map for desktop and mobile users with internet access. You can see the complete list of issues and ideas on the project’s GitHub Issues page.

The most significant problem with the application is the relatively poor accuracy of the departure times. The coarse schedule information available from official sources requires that I estimate times for the vast majority of the stops. Although the estimations are likely good enough to be useful, the algorithm is crude. Consequently, my next step will be to ask Tyler Transit for more detailed timetable data. As I mentioned in my last blog post, it’s my belief that governments are much more likely to produce information if the utility of it is self-evident. Hopefully the existence of Tyler On Time justifies whatever investment would be required for them to release this data.

Though the basic functionality validates my time investment so far, this project also has a couple of significant stretch goals. First, I would like to build an SMS version of the app for users without smartphones. My friends at the awesome cloud-telephony service Tropo have expressed an interest in partnering on this project, which shouldn’t be particularly challenging to implement once better timetables are nailed down.

Second, I would like to convert the bus data into GTFS format and have Google Maps pick up the results. I suspect this would require an official endorsement from Tyler Transit; however, the value of doing so would be very high. It would allow Tylerites and visitors to get directions that include public transit as a navigation option. It would also allow Tyler On Time to provide “walk, ride, walk” directions to users of the application, like this.
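As a rough illustration of what the GTFS conversion involves, here is a minimal Python sketch that writes stop records in the shape of a GTFS `stops.txt` file. The four column names come from the GTFS specification. The input dictionary keys, the stop id, and the coordinates are placeholders invented for this example, not real Tyler Transit data. A complete feed would also need the other GTFS files (agency, routes, trips, stop times and a calendar), which this sketch does not attempt.

```python
import csv
import io

# Column names for the core fields of a GTFS stops.txt file.
GTFS_STOP_FIELDS = ["stop_id", "stop_name", "stop_lat", "stop_lon"]


def stops_to_gtfs(stops):
    """Serialize stop dictionaries as the text of a GTFS stops.txt file.

    Each input dict is assumed (hypothetically) to have the keys
    "id", "name", "lat" and "lon".
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=GTFS_STOP_FIELDS, lineterminator="\n"
    )
    writer.writeheader()
    for stop in stops:
        writer.writerow({
            "stop_id": stop["id"],
            "stop_name": stop["name"],
            # Six decimal places is roughly 10 cm of precision.
            "stop_lat": f"{stop['lat']:.6f}",
            "stop_lon": f"{stop['lon']:.6f}",
        })
    return buffer.getvalue()


# Placeholder record: the id and coordinates are made up.
example_stops = [
    {"id": "red-north-1", "name": "1900 N Broadway Ave",
     "lat": 32.37, "lon": -95.30},
]
```

Calling `stops_to_gtfs(example_stops)` returns the file contents as a string, which can then be written to `stops.txt` alongside the rest of the feed.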

Finally, some notes about the technology being used in the app. The stack was heavily inspired by a very successful sprint the Tribapps team executed for the Chicago Breaking News Live application. Similar to that app, Tyler On Time’s logic is entirely client-side, backed by a small amount of Backbone.js (for url routing) and a tremendous amount of Underscore.js (for everything else). The static files themselves are hosted on Amazon S3. Basic styles and responsive switchy design come from the Skeleton framework. It has HTML5 semantic markup. The data processing was scripted primarily with Python, GDAL and csvkit. Stop maps were produced using TileMill with a modified version of Development Seed's custom base layer for Washington D.C. and data from the Smith County Map Site and Open Street Map. The whole thing was developed on Ubuntu Linux. Everything is open source.

I expect to keep iterating this application for at least a month, so please leave your suggestions (especially those of you from Tyler). Hopefully by my next post I will have detailed timetable data and be ready to move forward with additional methods of delivering that information to users.

*The application has not yet been deployed to either the Android Market or the App Store, but those comfortable installing unsigned Android packages can download a beta here.

Data, suddenly available

Hack Tyler is an idea born out of pragmatism and self-exorcism, but underlying that are my beliefs about open governments, open data and the power of public service. One of the more persuasive statements of this ethos I’ve heard is “Public Equals Online”, the name of the Sunlight Foundation’s 2010 campaign for government transparency. It’s not enough that governments produce and warehouse data that is legally accessible to the public; this is the equivalent of building a park in the mountains and not telling anyone it exists. In order for data to be truly public it must be like the town square: open, accessible and obvious. The corollary benefit is, of course, that someone can come along and build useful things with it.

So it is with great pleasure that I note that the Smith County Mapsite (which also warehouses GIS data for the City of Tyler) now holds official shapefiles for bus routes and bus stops. This is proper survey data and supersedes the information I described aggregating in my last blog post. This raises a few important points:

  • I was wrong. I should have asked for the data first. My desire to get things done probably cost me more time than it would have taken to ask for the data. In addition, I made an ill-founded assumption about what data existed. (The Tyler GIS department clearly has good maps.)
  • Public equals online. This data is now public; it wasn’t before. This is a success. Now it’s time to learn from this and ask for better timetable data.
  • I wasn’t wasting my time. It has always been my belief that you don’t influence governments by explaining how awesome things could be. You influence them by proving something is useful and then explaining how much more awesome it could be. It’s clear that in some (perhaps indirect) sense Hack Tyler caused these files to become public. I’m putting that in the “win” category.
  • As far as hand-crafted shapefiles go, I didn’t do too bad:

Hack Tyler data:

Official data:

Using the official data, I can also revise another calculation: in fact, 72.7% of all streets in Tyler are within a half-mile of a bus stop. That’s not very far given that, according to a Tyler city planner, all buses in Tyler are equipped with bike racks.

Hacking Tyler Transit

Why bus schedules? In my first post I named them at the top of my list of datasets I would like to build on. I also mentioned that I intended to avoid buying a car once I moved, a statement that provoked significant eye-rolling. I’ve been told that no one rides the bus in Tyler or that only poor people do. A fellow hacker who grew up in Tyler told me he didn’t even know they had a bus system. This isn’t really a surprise: Tyler has low population density (1,982 people per square mile, according to Wolfram Alpha) and a food desert in its urban core. I was stunned to discover that a transit system even existed. So why do I think it’s a good idea to digitize the bus schedule? Five reasons:

  1. I need it. It’s not just that I don’t want to drive. It’s that I suck at driving. Having access to public transit is an immediately useful thing for me.
  2. Tyler has several colleges, but none of them even mention the bus system on their websites. If building this app means one student takes the bus instead of driving then it will be a success.
  3. It’s easy. (Mostly, more on this below.)
  4. It’s an excellent pilot project. The data is available (albeit in a terrible format) and the shape of the application I will build is relatively straightforward.
  5. Financial freedom, green living, world peace, etc.

The first thing I needed in order to build this app was to get data for routes, schedules and stop locations. The Tyler Transit agency publishes a route map as PDF, though it only includes a very small number of stops. They publish schedule data for weekdays and Saturdays as PDFs. These PDFs only include estimated arrival times for five stops per route, less than ten percent of the total number of stops. Stop location data isn’t available anywhere online, so I emailed Tyler Transit and asked for a complete list. I requested an Excel document; they sent me a PDF of a scan of a printout of a web application.

I don’t raise these data quality issues as an affront to Tyler Transit. Through my own experiences and those of my many friends in the open government community I’ve learned that this is the state of public data in much of the US. I want to help change that, but right now I’m not trying to open governments, I’m just trying to build a transit app, so I did what a pragmatic geek has to do sometimes:

I keyed them.

Lacking an obvious way to extract the data I needed from a scanned PDF I took two hours and re-keyed the spreadsheet. This also gave me the opportunity to correct numerous typos in street names that would have foiled any geocoder. The results:

Short version: Tyler has 236 bus stops serving virtually all significant public and private institutions, including both large colleges: UT Tyler and Tyler Junior College.

Using the route map, the street centerline GIS data available from Smith County, QGIS and a lot of patience I was able to construct what is possibly the only digital map of Tyler’s bus routes. I then geocoded the above bus stops list and put those over the top, yielding:

The black outline is the Tyler city limits, the thin gray lines are streets, and the thick colored lines are the bus routes. The bus icons are individual stops.

Fun fact: A simple buffer computation on the stops will tell you that over 70% of all streets in the city of Tyler are within a half mile of a bus stop. (That’s less than the distance I walk to and from the L every day.)
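For anyone curious what that computation looks like, here is a simplified Python sketch. It is not the GIS buffer operation itself: instead of buffering the stops and intersecting the result with street geometry, it tests one representative point per street (for example a midpoint) against every stop using great-circle distance. The function names and the choice of representative points are assumptions made for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def share_of_streets_near_stops(street_points, stops, radius_miles=0.5):
    """Fraction of street points within `radius_miles` of at least one stop.

    `street_points` holds one representative (lat, lon) per street and
    `stops` holds one (lat, lon) per bus stop.
    """
    if not street_points:
        return 0.0
    near = sum(
        1 for point in street_points
        if any(haversine_miles(point, stop) <= radius_miles for stop in stops)
    )
    return near / len(street_points)
```

This brute-force version compares every street with every stop, which is fine for a few hundred stops and a city-sized street file. A true buffer in QGIS or GDAL handles streets that are only partly inside the half-mile radius more faithfully.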

This is good progress; however, it’s far from perfect. The geocodes for the bus stops are not their actual location, but rather that of the next intersection following the stop. Worse, many of them didn’t geocode at all, forcing me into an arduous process of trying to manually locate them using Google Maps and Google Street View. Even then I wasn’t able to determine even an approximate location for some of the stops.

I have long-term plans for dealing with this and the other data quality issues. Better stop locations can be crowd-sourced by users. The arrival times present a more audacious challenge as I have to compute estimated times for all the stops which don’t have times in the official timetable. Fortunately, the street centerline data provides me with both distance and speed limit, so I should be able to make sound estimates and fine-tune those with user feedback.
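One way such an estimate could work, sketched in Python under stated assumptions: take the street segments between two stops that do have official times, compute a free-flow travel time for each from its length and speed limit, then stretch or shrink those times so they add up to the scheduled gap. The function name, the input format, and the rescaling step are illustrative choices, not a description of the algorithm the app actually uses.

```python
def estimate_arrivals(start_minute, end_minute, segments):
    """Estimate arrival times at the stops between two timed stops.

    `start_minute` and `end_minute` are the scheduled times (minutes after
    midnight) at the timed stops on either end. `segments` is a list of
    (length_miles, speed_limit_mph) pairs, one per hop between consecutive
    stops. Returns one estimated arrival per hop; the last equals
    `end_minute`.
    """
    if any(speed <= 0 for _, speed in segments):
        raise ValueError("speed limits must be positive")
    # Minutes needed for each hop when driving exactly at the speed limit.
    free_flow = [length / speed * 60 for length, speed in segments]
    total = sum(free_flow)
    if total <= 0:
        raise ValueError("segments must cover a positive distance")
    # Rescale so the estimates agree with the published schedule.
    scale = (end_minute - start_minute) / total
    arrivals = []
    elapsed = 0.0
    for minutes in free_flow:
        elapsed += minutes
        arrivals.append(start_minute + elapsed * scale)
    return arrivals
```

Rescaling keeps the estimates anchored to the official times at both ends, so errors from traffic or dwell time are spread across the intermediate stops rather than piling up at the last one.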

Though much of it was painfully manual, most of the required data preparation is done at this point and I can move on to prototyping the application. Interested coders can follow my progress at the hacktyler-transit repository on GitHub. Everyone else: speak your mind.

Everything begins with data

A week’s gone by since I sparked an unexpected ruckus with my inaugural Hack Tyler post. I had no idea it was going to find resonance with so many people. I’ve received comments from coders, journalists, and government wonks of all stripes. Even more exciting, I’ve heard from a diverse cast of current and former citizens of Tyler—some wild about my ideas and some… less so. I even heard from a local high school student who wants to become a coder, but isn’t sure where to start.

I’ve tried to respond to everyone as best I can; however, I’ve made a conscious decision not to try to correct every misconception about what I’ve written. If folks are concerned I might be about to embark on some carpetbagging idealistic crusade against the local government, I’m happy to try to sort those concerns out individually. I’m not about to turn this blog into a policy. I want to spend my time actually doing things.

To that end, I’ve spent the last week focusing on the data made available by the City of Tyler and its parent, Smith County. I’ve created a list of all data sources I’ve been able to identify. It’s been heartening to see how much data actually is available (albeit often in less than ideal formats). A few items are of particular note:

  • Tyler and Smith County have a joint GIS repository that is quite extensive and (ostensibly) updated at regular intervals.
  • Tyler has a real-time list of where its police officers are responding to calls. I’ve never seen this in any other city. (It seems to have been a student capstone project.)
  • Smith County has put most, if not all, of its financial documentation online.

Since starting this project I’ve learned a little bit about Texas’ Public Information Law, which seems robust. Either because of the law or because of Texas culture there is a greater amount of “public by default” data than I’m used to. According to Texas Tribune reporter Matt Stiles in an On The Media interview this transparency is an effect of the state’s conservative culture. I’m hoping I can take proper advantage of this openness to get even more data, such as some of the datasets off Max Ogden’s authoritative, crowd-sourced civic datasets list.

In addition to creating my list of links I’ve also started building out infrastructure. So far I’ve:

There isn’t much data in the Boundary Service yet, but once populated it will allow me (and any other developers) to build apps using regional GIS data, without having to muck about with shapefiles and databases. I had a great deal of fun building the Boundary Service for Chicago so I’m excited to be able to repurpose it for Tyler. I believe strongly in building APIs like this one so that others can build on the things I make and I hope to create more of them as I go along.

Preparing the data is a crucial step toward building any applications, but I hope to get started building more generally useful products soon. I’ve got a number of ideas queued up and I’m saving the first of them for my next blog post, by which time I hope to have gotten a response from the City of Tyler Transit Department on my first data request.

To find myself in the other place

Tyler, TX

Somewhere along the way some important things got right and truly fucked up. My wife and I are getting divorced. At the end of June she and my son will move from Chicago to Tyler, Texas.

I’ve contemplated a lot of reactions to this change and I’ve decided to:

  1. Let her go.
  2. Follow him.

If the personal and moral intuitions aren’t obvious to you, I don’t think I can explain them.

I’ve become much more comfortable in Chicago than anywhere else I’ve ever lived. It is as close to a “home” as I’ve had in my adult life. By all accounts, Tyler will be different in nearly every way—small, provincial, culturally isolated—not unlike my hometown in Nebraska. I expect to spend a great deal of time actively disagreeing with people.

So far, I’ve spent probably 80% of 2011 being depressed about either the divorce or the move, but lately I’ve come into a new frame of mind. I wouldn’t call it a reconciliation, but basically I’ve decided:

I’m going to make this good.

You might call it a trite coming-to-terms, but I’ve decided to make this change positive using whatever means are available to me. It helps that my awesome job has provisionally agreed to allow me to telecommute (more on this later). Some other things are easy too:

  • No more wasting 20 hours a week commuting.
  • Cost of living ~30% cheaper than Chicago.
  • Excellent schools (…), low crime, yada, yada.
  • Quiet.

However, what I’m most interested in focusing on is how I can improve the things I don’t like, either through application of will or technology or both:

  • Tyler has a reasonably extensive bus system, but its online schedule is only available as a PDF. (You can probably already see where this is going, right?)
  • The local newspaper, The Tyler Morning Telegraph, circulates to nearly a third of the city’s population according to the Audit Bureau of Circulations, but I could find virtually no information about the politics of local government online.
  • Smith County’s only method of finding a polling place online is with a clumsy map viewing application.
  • The city and county both make significant amounts of data available online (surprising given their small size), but no one seems to have done any analysis of it (disappointing since the town has four colleges).

The list goes on. Tyler has information that could be freed. Tyler has government that could be opened. Tyler has news that could be hacked. Moreover, Tyler has an almost completely unexploited market. There are no hackers there. The small number of high-tech businesses that exist in the region are either web development shops serving local businesses or robotics companies.

I’ve come to realize that although this move is not something I want to do, it doesn’t need to be the low point of my life. How can I cope? I can take my ethic with me. I can use the skills I’ve developed in Chicago to make another place better. And I can use my time and freedom to improve myself. Plans for Tyler and for myself:

  • Apps for Tyler: a one-man, off-cycle project to bring my new hometown into modernity.
  • Learn about the place: analysis and presentation of any interesting information I can dig up about the place. Who knows, maybe I’ll find a lede or two?
  • Travel: I’m going to use my new freedom to get out of Dodge as frequently as possible. I hear Austin is nice.
  • Public transportation. I’ll have access to a car, but I’m going to do my best to live without being a consumer of gasoline, just as I did in Chicago.
  • A better life: applying my extra time to improving my health, fitness, and living a more balanced life.

Is this all going to go off without a hitch? Not a chance. I expect to spend many nights being painfully underwhelmed with the place and with myself, but this is the best way I know how to deal with it. This new blog will serve as documentation of my progress on all fronts.

I don’t know exactly when I will be moving—it will likely be sometime in July or August—but I’ve already set myself on this new path. I’m not going to waste time wishing things weren’t about to change. Instead I’m going to hack Tyler to be what I need it to be.