Geospatial searches with ThinkingSphinx
- Author: Jim Remsik
- Date/Time: September 12, 2008
- Category: Hashrocket News
- Comment(s): 2
One of our recent projects required geo-aware searches of business addresses. It is not news that Hashrocket prefers ThinkingSphinx for full-text searches. Sphinx, and as a result ThinkingSphinx, also supports searching around a geographical point.
Understanding How It Works
This was my first foray into geospatial searches. ThinkingSphinx makes the task fairly straightforward and only requires a data set of geocoded records. Specifically, it assumes that a latitude and longitude are associated with the data you are searching. ThinkingSphinx accepts the column names lat or latitude for latitude and lon, long, or longitude for longitude. One obvious missing abbreviation is lng for longitude. I have submitted a patch; in the meantime, the set_property method can be used if your data do not match these expectations. Sphinx stores these geographical columns in radians to avoid the overhead of converting them on the fly. Indexes and subsequent searches are therefore expected to provide coordinates, including the anchor point, in radians and not degrees.
define_index do
  indexes :name
  # Explicitly convert each column to radians at the time of indexing.
  # Note: RADIANS() is run in the context of MySQL/PostgreSQL
  has 'RADIANS(lat)', :as => :lat, :type => :float
  has 'RADIANS(lng)', :as => :lng, :type => :float
  # If your lat/lng data are in an associated table, provide the full path.
  # has 'RADIANS(addresses.lat)', :as => :lat, :type => :float
  # has 'RADIANS(addresses.lng)', :as => :lng, :type => :float
  # ThinkingSphinx expects your latitude and longitude attributes to be named
  # any of lat, latitude, lon, long or longitude. If that's not the case,
  # you will need to explicitly describe them in your index.
  set_property :longitude_attr => "lng"
  set_property :delta => true
end
Putting It To Use
Our latitude and longitude data are stored as BigDecimal values in degrees. An extension to the class allows quick conversion from degrees to radians.
class BigDecimal
  # Convert a value in degrees to radians.
  # Named to_radian to match the search calls that follow.
  def to_radian
    (self / 360.0) * Math::PI * 2
  end
end
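If some coordinates arrive as Float or Integer values rather than BigDecimal, the same conversion can live in a small helper instead of a core-class extension. The following is only a minimal sketch: the module name GeoMath, the method degrees_to_radians, and the sample coordinates are hypothetical and not part of ThinkingSphinx.

```ruby
require 'bigdecimal'

# Hypothetical helper module; not part of ThinkingSphinx.
module GeoMath
  # Convert degrees (Integer, Float, or BigDecimal) to radians as a Float.
  def self.degrees_to_radians(degrees)
    degrees.to_f * Math::PI / 180.0
  end
end

# Sample coordinates in degrees (made up for illustration).
lat = GeoMath.degrees_to_radians(BigDecimal("30.3322"))
lng = GeoMath.degrees_to_radians(-81.6557)
```

Returning a Float keeps the anchor point in the same numeric type regardless of how the column is stored.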
After defining the index, stop the Sphinx daemon, (re-)index, and (re-)start it. You can then run a search.
Model.search('term',
  :geo => [@address.lat.to_radian, @address.lng.to_radian])
At this point the results will look much like those of an ordinary search, except that the Sphinx result set now carries a new attribute called @geodist.
Now we have a new arsenal: we can sort by distance, filter by a distance range, and even cluster results by geographical region.
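To get a feel for what @geodist represents, here is a rough sketch of a great-circle distance between two points given in radians. It assumes the haversine formula and a mean Earth radius of 6,371 km, and returns meters, the unit in which Sphinx reports geodistance. The function name geodist and the constant are hypothetical, and Sphinx's internal calculation may differ slightly.

```ruby
# Assumed mean Earth radius; Sphinx may use a slightly different value.
EARTH_RADIUS_METERS = 6_371_000.0

# Great-circle distance in meters between two points given in radians,
# computed with the haversine formula.
def geodist(lat1, lng1, lat2, lng2)
  dlat = lat2 - lat1
  dlng = lng2 - lng1
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin(dlng / 2)**2
  # Clamp to 1.0 to guard against floating point drift before asin.
  2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt([a, 1.0].min))
end

# One degree of latitude, expressed in radians.
one_degree = Math::PI / 180
distance = geodist(0.0, 0.0, one_degree, 0.0)
```

With these assumptions, one degree of latitude works out to roughly 111 km, which is a handy sanity check on values coming back from a search.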
Sorting
@geodist exists only in the context of the Sphinx result, so we need to refer to it as a string. As with all string :order clauses in ThinkingSphinx, you must provide the direction (ASC or DESC) or no results will be returned.
Model.search('term',
  :geo => [@address.lat.to_radian, @address.lng.to_radian],
  :order => '@geodist ASC')
Filtering
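To filter, you need an anchor point and a distance range. Again, we provide the @geodist attribute name as a string. Sphinx reports @geodist in meters, so the range below matches only results within 10 meters of the anchor point.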
Model.search('term',
  :geo => [@address.lat.to_radian, @address.lng.to_radian],
  :with => { '@geodist' => Range.new(0.0, 10.0) })
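Because the distance is expressed in meters, a radius given in miles has to be converted before it goes into the filter. A minimal sketch, assuming a hypothetical helper named within_miles; only the conversion factor (1 international mile = 1609.344 meters) is a fixed fact.

```ruby
METERS_PER_MILE = 1609.344

# Hypothetical helper: build a Range of meters from 0 up to the given miles.
def within_miles(miles)
  0.0..(miles * METERS_PER_MILE)
end

radius = within_miles(10)

# Possible usage, reusing the search call from this post:
# Model.search('term',
#   :geo  => [@address.lat.to_radian, @address.lng.to_radian],
#   :with => { '@geodist' => radius })
```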



Comments
Anderson 09.16.08 @ 1:32pm
Great post!
Congratulation
li0n 09.28.08 @ 8:12am
Very nice, BT!