Improving Language Detection

At Foursquare, we attempt to personalize as much of the product as we can. In order to understand the more than 70 million tips and 1.3 billion shouts our users have left at venues, each of those pieces of text must be run through our natural language processing pipeline. The very foundation of this pipeline is our ability to identify the language of a given piece of text.

Traditional language detectors are typically implemented using a character trigram model or a dictionary-based n-gram model. The accuracy of these approaches improves with the length of the text being classified. For short pieces of text like tips in Foursquare or shouts in Swarm (see examples below), however, the efficacy of these solutions begins to break down. For example, if a user writes only a single word like “taco!” or an equally ambiguous statement like “mmmm strudel,” a generic character- or word-based solution would not be able to make a strong language classification on those short strings. Unfortunately, given the nature of the Foursquare products, these sorts of short strings are very common, and we needed a better way to accurately classify the languages in which they are written.

To this end, we decided to rethink generic language identification algorithms and build our own identification system, making use of some of the more unique aspects of Foursquare data: the location where the text was created and the ability to aggregate all texts by their writer. While there are many multilingual users on our platform, the average Foursquare user only ever writes tips or shouts in a single language. Given that fact, it seemed wasteful to apply a generic language classification model to each text a single user creates in isolation. If we have 49 data points that strongly point to a user writing in English, and that user’s 50th data point is an ambiguous text that a generic language model thinks could be German or English (with 40% and 38% confidence, respectively), chances are that the string should correctly be tagged as English and not German, even if the text contains German loanwords. Our solution was to build a custom language model for every one of our users who leaves tips or shouts, and then to let those user language models influence the result of the generic language detection algorithm.

The first step in this process is to run generic language detection on every tip and shout in the database. Each tip and shout is associated with a venue, which has an explicit lat/long. We then reverse geocode that lat/long to the country in which the venue is located, which tells us the country the user was in when they wrote the text. Next, we couple the generic language detection results with this country data to create a language model for every country. While this per-country language distribution may not exactly match the real-life language distribution of a given country, it does model the language behavior of the users who share text via Foursquare and Swarm in those countries.
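As a rough illustration of this aggregation step, here is a minimal Java sketch; the class and method names are hypothetical, and the generic detector is assumed to return a language-to-score map for each string:

```java
import java.util.HashMap;
import java.util.Map;

class CountryLanguageModel {
    // country code -> (language code -> accumulated weight)
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    /** Accumulate one generic-detector result for a tip/shout written in `country`. */
    void addObservation(String country, Map<String, Double> detectedLanguageScores) {
        Map<String, Double> langWeights = weights.computeIfAbsent(country, c -> new HashMap<>());
        for (Map.Entry<String, Double> e : detectedLanguageScores.entrySet()) {
            langWeights.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    /** Normalize the accumulated weights into a per-country language distribution. */
    Map<String, Double> distributionFor(String country) {
        Map<String, Double> langWeights = weights.getOrDefault(country, new HashMap<>());
        double total = langWeights.values().stream().mapToDouble(Double::doubleValue).sum();
        Map<String, Double> distribution = new HashMap<>();
        if (total > 0) {
            langWeights.forEach((lang, w) -> distribution.put(lang, w / total));
        }
        return distribution;
    }
}
```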

Example of top 5 languages and weights calculated in the country language models:

With country models in hand, we then do a separate grouping of strings by user, which lets us calculate a language distribution on a per-user basis. One problem with this approach, however, is that not every user has enough data to create a reliable user model. A new user who is multilingual will cause classification problems early on due to the lack of data. To solve this, we use the language model of that user’s dominant country as a baseline. When a user has little to no data for their user language model, we merge the country model into the low-information user model. As more data becomes available for a given user, we gradually weight the user model higher than the dominant country model, until there is enough data that the user model becomes the dominant of the two.
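The post doesn’t spell out the exact weighting scheme, but one common way to implement this kind of back-off is linear interpolation with a pseudo-count; the sketch below assumes that approach, with an arbitrary smoothing constant:

```java
import java.util.HashMap;
import java.util.Map;

class UserLanguageModel {
    // Pseudo-count controlling when the user model overtakes the country model (assumed value).
    private static final double SMOOTHING = 10.0;

    static Map<String, Double> merge(Map<String, Double> userDist,
                                     long userTextCount,
                                     Map<String, Double> countryDist) {
        double userWeight = userTextCount / (userTextCount + SMOOTHING);
        double countryWeight = 1.0 - userWeight;

        Map<String, Double> merged = new HashMap<>();
        userDist.forEach((lang, p) -> merged.merge(lang, userWeight * p, Double::sum));
        countryDist.forEach((lang, p) -> merged.merge(lang, countryWeight * p, Double::sum));
        return merged; // sums to ~1 if both inputs are normalized distributions
    }
}
```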

Finally, we create per-country orthographic feature models using the strings grouped by country. For this model, we have a set of 13 orthographic features; when a string triggers one of them, its generic language identification results are aggregated with the results of all other strings in that country that triggered the same feature. This allows a feature like “containsHanScript” to have a completely different language distribution in China than the one calculated for Japan, since both Chinese and Japanese contain characters from the Han script. Other examples are Arabic vs. Farsi with the “containsArabicScript” feature, Russian vs. Ukrainian vs. Bulgarian with the “containsCyrillicScript” feature, and all Romance languages with the “containsLatinScript” feature.
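A trigger like “containsHanScript” can be detected directly from Unicode script data; the sketch below uses the JDK’s Character.UnicodeScript and covers only four of the 13 features, with hypothetical naming:

```java
import java.util.EnumSet;

class OrthographicFeatures {
    enum Feature {
        CONTAINS_HAN_SCRIPT, CONTAINS_ARABIC_SCRIPT,
        CONTAINS_CYRILLIC_SCRIPT, CONTAINS_LATIN_SCRIPT
    }

    /** Returns the set of script features triggered by any character in the string. */
    static EnumSet<Feature> triggeredBy(String text) {
        EnumSet<Feature> features = EnumSet.noneOf(Feature.class);
        text.codePoints().forEach(cp -> {
            switch (Character.UnicodeScript.of(cp)) {
                case HAN:      features.add(Feature.CONTAINS_HAN_SCRIPT); break;
                case ARABIC:   features.add(Feature.CONTAINS_ARABIC_SCRIPT); break;
                case CYRILLIC: features.add(Feature.CONTAINS_CYRILLIC_SCRIPT); break;
                case LATIN:    features.add(Feature.CONTAINS_LATIN_SCRIPT); break;
                default:       break; // the remaining features are omitted in this sketch
            }
        });
        return features;
    }
}
```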

With the user models and the orthographic feature models in place, we then rerun language identification on all of our tips and shouts, using the appropriate user’s language model and any orthographic feature model the string triggers. We merge those two results with the generic language detector’s results for the string, and we’re left with a higher-quality language classification. In a preliminary analysis, we were able to correctly tag an additional ~3M tips and ~250M shouts using this method.

Examples of corrected language identification:

If these kinds of language problems interest you, why not check out our current openings!

— Maryam Aly (@maryamaaly), Kris Concepcion (@kjc9), Max Sklar (@maxsklar)

Personal recommendations for the Foursquare homescreen

Top Picks Screenshot

Earlier this summer, we shipped an update to Foursquare on Android and iOS focused on giving each user a selection of “top picks” as soon as they open the app. Our goals with this new recommendation system were to improve the level of personalization and deliver fresh suggestions every day. Under the hood, this meant a rethinking of our previous recommendations flow.

Previous work


Previous iterations of our recommendation flow relied on a fairly traditional search pipeline that ran exclusively online in our search and recommendations service.

  • On the order of hundreds of candidate venues are retrieved from a non-personalized store, such as our venues Elasticsearch index

  • Personalized data such as prior visit history, similar venue visits, and friend/follower history is retrieved and used to rank these candidate venues.

  • For the top-ranked venues we choose to show the user, short justification snippets are then generated to explain why each venue matches the user’s search

This works well for intentful searches such as “pizza,” where a user is looking for something specific, but it is limiting for broader, query-less recommendations. For broad recommendations, the depth of personalization is bounded by the size of the initial set of non-personalized candidate venues in the retrieval phase. Simply increasing the size of this candidate set online would be computationally expensive and would push request latencies past acceptable limits, so we looked toward better utilizing offline computation.

Personalized retrieval

To establish a larger pool of personalized candidate venues, we created an offline recommendation pipeline. For each user, we have a pretty detailed understanding of the neighborhoods and locations they frequent, thanks to the technology we call Pilgrim. Given these locations for a user, we generate a ranked, personalized list of recommendations via a set of Scalding jobs on our Hadoop cluster. These jobs are run at a regular interval by the Luigi workflow manager and then served online by HFileService, our immutable key-value data service that uses the HFile file format from HBase.

Offline recommendations flow

The personalized sources of candidate venues come from a set of offline “fetchers” also implemented in Scalding:

  • Places friends / people you follow have been to, left tips at, liked, or saved

  • Venues similar to those you’ve liked in the past, both near and far

  • Places that match your explicit tastes

For our more active users, there can be thousands of candidate venues produced by these fetchers, an order of magnitude more than our online approach. We can afford to consider such a large set since we’re processing them offline, out of the critical request path back to the user.

Non-personalized retrieval

Several non-personalized sources of candidate venues are also used.

  • The highest rated venues of various common intents (dinner, nightlife, etc).
  • Venues that are newly opened and trending in recent weeks

  • Venues that are popular with out-of-town visitors (if the user is traveling)

  • Venues that are vetted by expert sources like Eater, the Michelin Guide, etc.

The non-personalized sources not only provide a robust set of candidates for new users whom we don’t know much about yet, but also provide novel and high quality venues for existing users. While personalization should skew a user’s recommendations towards those they’ll find relevant and intriguing, we want to avoid creating a “personalization bubble” that misses great places just because the user doesn’t have any personal relation to them.


For each homepage request, the recommendation server logs which venues have been shown, writing to HDFS via Kafka. These server-side logs are combined with client-side reporting of scroll depth, giving us a combined impression log of which venues we have previously shown each user, so we can avoid repeating recommendations. This impression information is used for both ranking and diversification.


Each candidate venue is individually scored with a combination of signals, seeking to balance factors such as venue novelty, distance, personalized taste match, similarity to favorites, and friend/follower activity. The top ~300 candidates are then written to an HFile and served, with the data refreshed nightly.
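As a purely illustrative example of combining such signals, a linear scorer over a few of the factors named above might look like this; the weights and field names are invented, not the production values:

```java
class CandidateVenue {
    double noveltyScore;        // 0..1, higher if the user hasn't seen or visited the venue
    double distanceKm;          // distance from the locations the user frequents
    double tasteMatchScore;     // 0..1, overlap with the user's explicit tastes
    double similarityScore;     // 0..1, similarity to venues the user has liked
    double friendActivityScore; // 0..1, friend/follower visits, tips, likes, saves

    double score() {
        double distanceFactor = 1.0 / (1.0 + distanceKm); // closer venues score higher
        return 0.25 * noveltyScore
             + 0.20 * distanceFactor
             + 0.25 * tasteMatchScore
             + 0.15 * similarityScore
             + 0.15 * friendActivityScore;
    }
}
```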


Some product requirements are difficult to fulfill solely by scoring each venue independently. For instance, it is undesirable to show too many venues of the same category or to show only newly opened restaurants. To introduce diversity, before selecting the final set of venues to show, we try to enforce a set of constraints while still maintaining the ranked order of the candidate venues.
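One way to enforce constraints like these while preserving rank order is a greedy pass over the scored list; the constraints below (a per-category cap and a cap on newly opened venues) are examples, not the exact production rules:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Diversifier {
    static class RankedVenue {
        String category;
        boolean isNewlyOpened;
    }

    /** `ranked` is assumed to be sorted best-first; the output preserves that order. */
    static List<RankedVenue> select(List<RankedVenue> ranked, int resultSize,
                                    int maxPerCategory, int maxNewlyOpened) {
        List<RankedVenue> selected = new ArrayList<>();
        Map<String, Integer> perCategory = new HashMap<>();
        int newlyOpened = 0;

        for (RankedVenue venue : ranked) {
            if (selected.size() == resultSize) break;
            int categoryCount = perCategory.getOrDefault(venue.category, 0);
            if (categoryCount >= maxPerCategory) continue;              // too many of this category
            if (venue.isNewlyOpened && newlyOpened >= maxNewlyOpened) continue;

            selected.add(venue);
            perCategory.put(venue.category, categoryCount + 1);
            if (venue.isNewlyOpened) newlyOpened++;
        }
        return selected;
    }
}
```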


Selecting a list of venues to show isn’t the end of the process. Every venue that makes it onto a user’s home screen comes with a brief explanation of what’s interesting (we believe) to them about this venue. These “justifications” are the connective tissue between the sophisticated data processing pipeline and the user’s experience. Each explanation provides not only a touch of personality but also a glimpse into the wealth of data that powers these recommendations.

To accomplish this, the “justifications service” (as we call it internally) is responsible for assembling all the information we know about a venue and a user, combining it, ranking it, and generating a human-readable explanation of the single most meaningful and personalized reason the user may be interested in this place.

Broadly speaking, the process can be divided into four stages: Data fetching -> Module execution -> Mixing/Ranking -> Diversification/Selection. Each type of justification that the system can produce is represented by an independent “module”. The module interface is a simple IO contract: it takes a set of input data and returns one or more justifications, each with a generated string and a score. Each module is designed to run independently, so after all the data is fetched, the set of eligible modules runs in parallel. Once each module has had an opportunity to produce a justification, the candidates are merged and sorted. A final pass selects a single justification per venue, ensuring not only that the most relevant justifications are chosen but also that there is a certain amount of diversity in the insights provided. All of this happens at runtime on each request.
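A hypothetical sketch of that module contract, for illustration only (the justifications service is internal and its actual types are not shown in the post):

```java
import java.util.List;

// Each module inspects pre-fetched data and emits zero or more scored justifications.
interface JustificationModule {
    List<Justification> produce(JustificationInput input);
}

class Justification {
    final String text;  // the human-readable explanation shown in the app
    final double score; // used when mixing and ranking candidates across modules

    Justification(String text, double score) {
        this.text = text;
        this.score = score;
    }
}

class JustificationInput {
    // Pre-fetched data about the user and the venue (tastes, friend activity,
    // prior visits, expert ratings, ...), populated once per request.
}
```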

Here are just a few examples of how the finished product appears in the app:

Local institution justification


Uncle boons



Upcoming work

With the product in the hands of users, we’re working on learning from user clicks to improve the quality of our recommendations. We’re also running live traffic experiments to test different improvements to our scorers, diversifiers, and justifications. Finally, we’re improving the online layer so recommendations can quickly update in response to activity in the app, such as liking or saving a venue to a to-do list. If you’re interested in working on search, recommendations, or personalization in San Francisco or New York, check out our openings!

Ben Lee and Matt Kamen

How the World Looks to Your Phone

[Cross-posted from a Quora answer I wrote here.]

One of Foursquare’s most useful features is its ability to send you just the right notification at just the right time, whether it’s visiting a new restaurant, arriving in a city, or hanging out with old friends at the neighborhood bar:

Example notification screenshot

We take a lot of pride in our location technology (also known as Pilgrim) being the best in the industry, enabling us to send these relevant, high-precision, contextual notifications.

Pilgrim is actually not just a single technology, but a set of them, including:

  • The Foursquare database (7 billion check-ins at 80 million places around the world)
  • Stop detection (has the person stopped at a place, or is the person just stopped at a traffic light?)
  • “Snap-to-place” (given a lat/long, wifi, and other sensor readings, at which place is the person located?)
  • Client-side power management (do this all without draining your battery!)
  • Content selection (given that someone has stopped at an interesting place, what should we send you?)
  • Familiarity (has the person been here before? have they been in the neighborhood? or is it their first time?)
  • (and much more…)

We could write a whole post about each of these, but perhaps the most interesting technology is “snap-to-place.” It’s a great example of how our unique data set and passionate users allow us to do things no one else can do.

We have these amazing little computers that we carry around in our pockets, but they don’t see the world the same way you and I do. Instead of eyes and ears, they have GPS, a clock, wifi, bluetooth, and other sensors. Our problem, then, is to take readings from those sensors and figure out at which of those 80 million places in the world the phone is located.

Most companies start with a database of places that looks like this:

Map of venue pins around Washington Square Park

(That’s Washington Square Park in the middle, with several NYU buildings and great coffee shops nearby.)

For every place, they have a latitude and longitude. This is great if your business is giving driving directions or making maps. But what if you want to use these pins to figure out where a phone is?

The naive thing to do is to just drop those pins on a map, draw circles around them, and say the person is “at” a place if they are standing inside the circle. Some implementations also resize the circles based on how big the place is:

Venue pins with circles drawn around them
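For concreteness, the circle model amounts to a point-in-radius test; here is a generic sketch using the haversine distance (illustrative geometry, not Foursquare’s snap-to-place code):

```java
class CircleModel {
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Great-circle (haversine) distance between two lat/long points, in meters. */
    static double distanceMeters(double lat1, double lng1, double lat2, double lng2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLng = Math.toRadians(lng2 - lng1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                   * Math.sin(dLng / 2) * Math.sin(dLng / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** The radius might be made larger for a park than for a coffee shop. */
    static boolean isAt(double phoneLat, double phoneLng,
                        double venueLat, double venueLng, double radiusMeters) {
        return distanceMeters(phoneLat, phoneLng, venueLat, venueLng) <= radiusMeters;
    }
}
```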

This works fine for big places like parks or Walmarts. But in dense areas like cities, airports, and malls (not to mention multi-story buildings, where places are stacked on top of each other), it breaks down. All these circles overlap and there’s no good way to tell places apart.

So if that’s not working, you might spend a bunch of time and money looking at satellite photos and drawing the outline of all the places on the map:

Venue outlines drawn from satellite imagery

This is incredibly time consuming, but it’s possible. Unfortunately, our phones don’t see the world the way a satellite does. GPS bounces off of buildings, gives funny readings and bad accuracies. Different mobile operating systems have different wifi and cell tower maps and translate those in different ways into latitude and longitude. And in multi-story buildings, these polygons sometimes encapsulate dozens of places stacked vertically. The world simply doesn’t look like nice neat rectangles to a phone.

So what does Foursquare do? Well, our users have crawled the world for us and have told us more than 7 billion times where they’re standing and what that place is called. Each time they do, we attach a little bit more data to our models about how those places look to our phones out in the real world. To our phones, the world looks like this:

The same area as modeled from check-in data

This is just a projection of a model with hundreds of dimensions onto a flat image, but it gives an idea of what the world actually looks like to our phones. We use this and many other signals (like nearby wifi, personalization, social factors, and real-time check-ins) to help power the push notifications you see when you’re exploring the city. Glad you’re enjoying them!

Interested in the machine learning, search, and infrastructure problems that come with working with such massive datasets on a daily basis? Come join us!

Andrew Hogue, Blake Shaw, Berk Kapicioglu, and Stephanie Yang

Managing Table and Collection Views on iOS

As most iOS developers can tell you, dealing with UITableView or UICollectionView can be a pain. These UIKit views form the basis of most iOS apps, but the numerous callback methods that you have to coordinate can be a hassle to manage. There is a lot of boilerplate necessary to get even the simplest of views up and running, and it is easy to create a mismatch between delegate methods that crashes the app.

At Foursquare we’ve solved this issue the way we usually do: we built an intermediate layer with a nicer interface that talks to Apple’s interface for you. We call it FSQCellManifest, and today we are open sourcing it for use by the wider iOS developer community.

In short, FSQCellManifest acts as the datasource and delegate for your table or collection view, handling all the necessary method calls and delegate callbacks for you, while providing a much easier-to-use interface with less boilerplate. It has a ton of built-in features and is flexible enough that we use it on every screen in both our apps. It saves us a ton of engineering time and effort, and hopefully it will do the same for you.

You can find more documentation and an example app on the project’s GitHub page. And check out Foursquare’s open source portal to find all the other projects we’ve released.

Brian Dorfman

Gson Gotchas on Android

This is Part 2 in our two-part series on latency. In Part 1, we discussed some of the techniques we use for measuring latency at Foursquare. Here we’ll discuss some specific Gson-related changes we made to improve performance in our Android apps.

Shortly after the launch of Foursquare for Android 8.0, we found that our app was not as performant as we wanted it to be. Scrolling through our homepage was not buttery smooth, but quite janky with frequent GC_FOR_ALLOC calls in logcat. Switching between activities and fragments was not as quick as it should be. We investigated and profiled, and one of the largest items that jumped out was the amount of time spent parsing JSON. In many situations, this turned out to be multiple seconds even on relatively modern hardware such as the Nexus 4, which is crazy. We decided to dig in and do an audit of our JSON parsing code to find out why.

In Foursquare and Swarm for Android, practically all interaction with the server is done through a JSON API. We use Google’s Gson library extensively to deserialize JSON strings into Java objects that we as Android developers like working with. Here’s a simple example that converts a string representation of a venue into a Java object:
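The original inline snippet isn’t reproduced here, but the whole-string version looks roughly like this (the Venue fields are simplified for illustration):

```java
import com.google.gson.Gson;

// A pared-down model class; the real Venue has many more fields.
class Venue {
    String id;
    String name;
    double lat;
    double lng;
}

class WholeStringParsing {
    static Venue parse(String json) {
        Gson gson = new Gson();
        // Requires the entire JSON string to be in memory before parsing starts.
        return gson.fromJson(json, Venue.class);
    }
}
```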

This works, but we don’t actually need the whole JSON string to begin parsing. Fortunately, Gson has a streaming API that lets us parse a JSON stream one token at a time. Here’s what a simple example of that would look like:
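Again, the original snippet isn’t shown here, but a streaming version hands Gson a JsonReader wrapped around the response stream, so parsing can begin before the full payload has arrived (reusing the simplified Venue class from the previous sketch):

```java
import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class StreamingParsing {
    static Venue parse(InputStream responseBody) {
        Gson gson = new Gson();
        JsonReader reader = new JsonReader(
                new InputStreamReader(responseBody, StandardCharsets.UTF_8));
        // Tokens are consumed as they are read off the stream.
        return gson.fromJson(reader, Venue.class);
    }
}
```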

So we did this, but we still didn’t see any significant speedup or smoother app performance. What was going on? It turns out we were shooting ourselves in the foot with our usage of custom Gson deserializers. We use custom deserializers because there are times when we don’t want a strict 1:1 mapping between the JSON and the Java objects it deserializes to. Gson allows for this, providing the JsonDeserializer interface to facilitate it:
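The contract, roughly as it appears in the library, is:

```java
// Simplified from com.google.gson
public interface JsonDeserializer<T> {
  T deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context)
      throws JsonParseException;
}
```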

To use it, you implement the interface for the type you want handled and register that deserializer with the Gson instance you use to deserialize. From then on, whenever you try to deserialize some JSON to a certain type (the Type typeOfT above), Gson will check whether a custom deserializer is registered for that type and, if so, will call that deserializer’s deserialize method. We use this for a few types, one of which happens to be our outermost Response type that encapsulates all Foursquare API responses:
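Here is a sketch of what registering such a deserializer looks like; the Response shape is a stand-in, since the real envelope type isn’t shown in the post:

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonDeserializationContext;
import com.google.gson.JsonDeserializer;
import com.google.gson.JsonElement;
import com.google.gson.JsonParseException;
import java.lang.reflect.Type;

// Hypothetical envelope: a status code plus the actual payload.
class Response {
    int code;
    JsonElement body;
}

class ResponseDeserializer implements JsonDeserializer<Response> {
    @Override
    public Response deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context)
            throws JsonParseException {
        Response response = new Response();
        response.code = json.getAsJsonObject().get("code").getAsInt();
        response.body = json.getAsJsonObject().get("response");
        return response;
    }
}

class GsonSetup {
    // Every deserialization through this instance now routes Response through the custom code.
    static final Gson GSON = new GsonBuilder()
            .registerTypeAdapter(Response.class, new ResponseDeserializer())
            .create();
}
```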

The problem here is that, despite us thinking we were using Gson’s streaming API, our usage of custom deserializers caused whatever JSON stream we were trying to deserialize to be read completely into a JsonElement object tree by Gson before being passed to that deserialize method (the very thing we were trying to avoid!). To make matters worse, doing this on our outermost response type, which wraps every single response we receive from the server, prevents any kind of streaming deserialization from ever happening. It turns out that TypeAdapters and TypeAdapterFactorys are now preferred and recommended over JsonDeserializers. Their class definitions look roughly like this:
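(These are trimmed to the abstract methods; the real TypeAdapter class also defines concrete helpers such as nullSafe() and toJson().)

```java
// Simplified from com.google.gson
public abstract class TypeAdapter<T> {
  public abstract void write(JsonWriter out, T value) throws IOException;
  public abstract T read(JsonReader in) throws IOException;
}

public interface TypeAdapterFactory {
  <T> TypeAdapter<T> create(Gson gson, TypeToken<T> type);
}
```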

Note the JsonReader stream being passed to the read() method as opposed to the JsonElement tree. After being enlightened with this information, we updated our custom deserializers to extend TypeAdapters and TypeAdapterFactorys and noticed significant parse time decreases of up to 50% for large responses. More importantly, the app felt significantly faster. Scroll performance that was previously janky from constant GCs due to memory pressure was noticeably smoother.


  • Use Gson’s streaming APIs, especially in memory-constrained environments like Android. The memory savings for non-trivial JSON strings are significant.
  • Deserializers written using TypeAdapters are generally uglier than those written with JsonDeserializers due to the lower-level nature of working with stream tokens.
  • Deserializers written using TypeAdapters may be less flexible than those written with JsonDeserializers. Imagine you want a type field to determine what an object field deserializes to. With the streaming API, you need to guarantee that type comes down in the response before object.
  • Despite its drawbacks, use TypeAdapters over JsonDeserializers as the Gson docs instruct. The memory savings are usually worth it.
  • But in general, avoid custom deserialization if at all possible, as it adds complexity.

Interested in these kinds of problems? Come join us!

Matthew Michihara

Measuring user perceived latency

At Foursquare, tracking and improving server-side response times is a problem many engineers are familiar with. We collect a myriad of server-side timing metrics in Graphite and have automated alerts if server endpoints respond too slowly. However, one critical metric that can be harder to measure for any mobile application is user perceived latency: how long did the user feel like they waited for the application to start up, or for the next screen to load after they tapped a button? Steve Souders covers the perception of latency well in this short talk.

For a mobile application like Foursquare, user perceived latency is composed of several factors. In a typical flow, the client makes an HTTPS request to a server, the server generates the response, the client receives the response, parses it, and then renders it.

Client Timing Diagram

We instrumented Foursquare clients to report basic timing metrics in an effort to understand user perceived latency for the home screen. Periodically, the client batches and reports these measured intervals to a server endpoint, which then logs the data into Kafka. For example, one metric the client reports is the delta between when the client initiated a request and when the first byte of the response was received. Another is simply how long the JSON parsing of the response took. On the server side, we also have Kafka logs of how long the server spent generating a response. By combining client-side timings with server-side timings using Hive, we are able to sketch a rough timeline of user perceived latency with three key components: network transit, server-side time, and client-side parsing and rendering. Note that there are many additional complexities within these components; however, this simple breakdown can be a useful starting point for further investigation.
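As a tiny illustration of the client-side measurement, time-to-first-byte for a single request can be captured on the device like this; the endpoint and the reporting plumbing are placeholders, and the real clients batch these intervals and report them periodically:

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class RequestTimer {
    /** Milliseconds from initiating the request until the first response byte arrives. */
    static long timeToFirstByteMillis(String endpoint) throws Exception {
        long start = System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        try (InputStream in = conn.getInputStream()) {
            in.read(); // blocks until the first byte of the response is available
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            conn.disconnect();
        }
    }
}
```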

The above bar chart shows a composite request timeline built using the median timing of each component from a sample of 14k Foursquare iPhone home screen requests. In this example, the user might wait nearly two seconds before the screen is rendered, and most of that was actually due to network and client time rather than server response time. Let’s dive deeper into network and client time.

Network time

The next chart below splits out requests in Brazil versus requests in the US.

The state of wireless networks and the latency to our datacenters are major factors in network transit time. In the above comparison, the median Brazil request takes twice as long as one in the US. At Foursquare, all API traffic goes through SSL to protect user data. SSL is generally fine for a connection that has already been opened, but the initial handshake can be quite costly, as it typically requires two round-trips beyond those of a plain HTTP connection. It’s absolutely critical for a client to reuse the same SSL connection between requests, or this penalty will be paid every time. Working with a CDN to provide early SSL termination can also be incredibly beneficial in reducing the cost of your first request (often the most important one, since the user is definitely waiting for it to finish). For most connections, the transmission time is going to dominate, especially on non-LTE networks. To reduce the number of packets sent over the wire, we eliminated unnecessary or duplicated information in the markup and were able to cut our payload by more than 30%. It turns out, however, that reducing the amount of JSON markup also had a big impact on the time spent in the client.

Client time

The amount of time spent processing the request on the client is non-trivial and can vary wildly depending on the hardware. The difference in client time in the US vs. Brazil chart is likely due to the different mix of hardware devices in wide use in each market. For example, if we were to plot the median JSON parsing times across different iPhone hardware, we would see a massive difference from older iPhone 4’s to the latest iPhone 6’s. Although not as many users are on the older hardware, it’s important to understand just how much impact needless JSON markup can have.

In addition to JSON processing, another important topic for iOS devices is Core Data serialization. In our internal metrics, we found that serializing data into Core Data can be quite time consuming and is similarly more expensive for older hardware models. In the future, we are looking at ways to avoid unnecessary Core Data access.

A similar variation can be found across Android hardware as well. The chart below shows the median JSON parsing times of various Samsung devices (note that the Android timing is not directly comparable to the iPhone timing, as the Android metric measures parsing of the JSON markup into custom data structures, while the iPhone measurement parses straight into simple dictionaries).

Android JSON parse times by device

In our next engineering blog post, we will discuss some critical fixes that were made in Android JSON parsing.


Measurement is an important first step toward improving user perceived latency. As Amdahl’s law suggests, making improvements to the largest components of user perceived latency will of course have the largest user impact. In our case, the measurements pointed us toward taking a closer look at networking considerations and client processing time.

— Ben Lee (@benlee) & Daniel Salinas (@zzorba42)

Geographic Taste Uniqueness

Last August we launched Tastes to help our users customize their local search experience. Taste tags like “trendy place”, “pork buns”, or “romantic restaurant” not only help users find the kinds of places they like when out and about, but also allow us to answer, for the first time, the question of “What is this area known for?”.

Taste data is a two-way street. Not only are our users making use of tastes to personalize their experiences within the app, but every venue for which we have external and user-generated content has its own unique taste profile as well. Drawing on many input sources, we are able to reliably attach tastes to venues in the Foursquare venue database and calculate how strongly each applied taste is affiliated with a given venue, expressed as a single affinity score. Applying our NLP stack to analyze user tips at a venue, we distill that data into several metrics and scores (e.g., sentiment score, quality score, spam-like measure) that feed directly into the affinity score. Additionally, explicit data from users in the form of ‘Rate Places’ votes, which signal which tastes our users liked at a venue, is also incorporated into that final score.

Once tastes and their affinity scores are applied to our venues we can dig into our data science tool chest and use Old Faithful, TF-IDF, to find the tastes that are most unique in a given geographic region. TF-IDF is typically used to measure the importance of a term within a particular document that belongs to a larger corpus of documents. However, for our geographic taste measurement scores we have to modify the traditional understanding of what terms, documents, and corpora mean. Given the task of trying to identify the most important tastes of a sub-region in comparison to the region as a whole, we treat each venue as a single document, the tastes that are attached to the venues as the terms, and the affinity for a specified taste as the term frequency. Finally, we aggregate the venues, v, by the sub-region R we wish to measure and apply the following customized formula to find the taste uniqueness of taste t in R:

GeoTaste TFIDF
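The formula image above isn’t reproduced here, but under the mapping just described (venue = document, taste = term, affinity = term frequency), a TF-IDF-style uniqueness score takes roughly this form; the exact weighting Foursquare used may differ:

\[
\mathrm{uniqueness}(t, R) \;=\; \Big(\sum_{v \in R} \mathrm{affinity}(v, t)\Big) \times \log\frac{|V|}{\big|\{\, v \in V : \mathrm{affinity}(v, t) > 0 \,\}\big|}
\]

Here V is the set of all venues in the parent region: the first factor rewards tastes strongly affiliated with venues in the sub-region R, while the log factor discounts tastes that are common across the whole region.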

This formula is applied to every taste for a specified sub-region, producing a ranking of how unique each taste is to that sub-region. We then take the top 50 tastes ranked by uniqueness and re-sort them by their affinity to the sub-region, in order to find the most frequently seen tastes among the most unique. Every week, as part of our Hadoop data processing pipeline, we calculate these scores for various pairings of region and sub-region (US vs. US state, US vs. US city, city vs. neighborhood) and use the final rankings to produce the “top” tastes in each sub-region.

The tables below represent a sampling of the results from this work.

Sample top tastes by sub-region

When we first generated this data, we immediately knew it would make a great feature in the Foursquare app. With a few changes to our search pipeline, we were able to surface them as quick links for users visiting these neighborhoods:  


Chinatown, NY, NY


Mission District, San Francisco, CA


We’ve just scratched the surface of digging into this data. If tackling these kinds of data analysis problems and working with an amazing dataset (and incredible co-workers) interests you, come join us!

— Will Bertelsen (@wbertelsen) & Kris Concepcion (@kjc9)

Announcing the first Foursquare API Demo Day!

Every couple of weeks we have an internal demo day – an hour where people demo things they’ve been working on to the rest of the company. Demos can be anything from a prototype app feature to a cool data visualization to a command-line tool. It’s a fun way to get inspired and to find out what co-workers are up to.

Now we’re inviting you to share your creativity in the same way, by showcasing a mobile, web or wearables app at the first ever public Foursquare API Demo Day, on November 12th at our New York headquarters.

The idea is simple: You build a prototype that uses the Foursquare API, and then demo it to Foursquare CEO and co-founder Dennis Crowley, Foursquare executives, product managers and engineers, and of course other participants. Your demo doesn’t need to be polished, it just needs to sort-of work. Some of our hackiest demos have turned into major Foursquare and Swarm features.

You can see more details, and sign up, here:

This isn’t a hackathon: there’s no competition, no artificial time constraint, and no caffeine-fueled all-nighters. You can build your demo on your own time, using any existing code. You can work in a team or solo. And if you have questions about our API as you go, we’ll be happy to answer them.

Coding isn’t a competitive sport: At this event all participants get an equal opportunity to wow and be wowed, and to make connections with Foursquare engineers and PMs and with other participants. And, who knows? Maybe you’ll have the opportunity to take your idea to the next level and put it in the hands of millions of Foursquare users.

Got a great idea on how to use Foursquare data? We want to see it! Sign up now and we’ll see you, and your demo, on November 12th!

Exploring the Foursquare ‘Taste Map’

In order to deliver great personalized local recommendations, Foursquare needs to understand not only which places are the best, but also what makes places all over the world different from each other. Whether it’s a dive bar with great late night dancing or a molecular gastronomy restaurant with an amazing tasting menu, we want to categorize these places and understand the relationship that Foursquare users have with them. That’s why this summer we launched “Tastes,” short descriptive tags for venues to help users personalize their experience and find places that suit them. Tastes can be as simple as a favorite dish like “soup dumplings” or a vibe like “good for dates”.

To better understand what our taste data looks like, I created the “Foursquare Taste Map.” Here we see a visualization of the most popular three thousand English tastes. Each taste is connected with a line to others like it, and they are arranged so that similar tastes are closer together. For more technical folks, this is a spring embedding of the k-nearest neighbor graph of tastes using the cosine similarity metric (plotted in Gephi), where each taste is represented as a high-dimensional vector of venue affinities.


Taste Map (thumbnail)

Obviously it’s difficult to capture all of the relationships between these tastes on a single page, but you can still see amazing structure emerge, like “wine island” on the far right, various niches of Asian cuisine in the lower left-hand corner, or a variety of hubs around common dishes like “seafood,” “chicken,” and “pizza.” We are so excited to have the opportunity to work with this unique data set to better understand all of the places in the world, and we thought you’d enjoy this visualization.
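For the technically curious, the graph-building step mentioned above boils down to computing cosine similarities between taste vectors and keeping each taste’s k nearest neighbors; here is a small illustrative sketch (the spring layout itself was done in Gephi and isn’t shown):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class TasteGraph {
    /** Cosine similarity between two venue-affinity vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB) + 1e-12);
    }

    /** Indices of the k tastes most similar to the taste at `index`. */
    static List<Integer> nearestNeighbors(double[][] tasteVectors, int index, int k) {
        double[] query = tasteVectors[index];
        return IntStream.range(0, tasteVectors.length)
                .filter(i -> i != index)
                .boxed()
                .sorted(Comparator.comparingDouble((Integer i) -> -cosine(query, tasteVectors[i])))
                .limit(k)
                .collect(Collectors.toList());
    }
}
```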

Asian Cuisine Close-Up

The Foursquare Taste Map
(don’t forget to zoom in and scroll around, and add your favorite tastes to your Foursquare account)

Happy exploring!

Blake (@metablake)

Introducing Pants: a build system for large-scale codebases like Foursquare’s.

Foursquare and Swarm are written predominantly in Scala on the server side. But as we’ve grown, so have the size, complexity and diversity of our codebase:

  • We currently have around 700,000 lines of handwritten code in 6500 .scala files.
  • We generate about 1.9 million lines of Scala code from 1400 .thrift files using Spindle, our homegrown data model code generator.
  • We generate UI code from Closure Templates.
  • We compile CSS using Less.
  • We have a significant amount of Python code for running our deploy toolchain, data processing pipelines and other offline systems.
  • Like all large codebases, we also have little bits of other things here and there: some C, some Java, some Ruby, some Bash scripts, and so on.

Naturally there is a complex web of dependencies between different parts of the codebase. In fact our code dependency graph has about 2500 vertices and tens of thousands of edges.

We needed a build toolchain that would work well with this complexity, and the result is Pants, an open source build system we developed together with engineers from Twitter, Square and elsewhere.

Pants was designed to support large-scale, heterogeneous codebases. It uses fine-grained dependency management to build only what you need, keeping build times from getting unnecessarily long (a must when using Scala, with its slow compilation speeds). Pants also makes it straightforward to plug in custom build tools, such as code generators, and it supports every part of the build lifecycle: codegen, external dependency resolution, compilation, linting, test running, bundling deployable artifacts and more.

You can read more about Pants here, including the etymology of the name. If your codebase is growing beyond your toolchain’s ability to scale, you might want to give Pants a try. And of course we’re always looking for contributors to the project!