Where is a good Address Parser (street-address)

Accepted answer
Score: 25

The Google Maps API works pretty well for this. E.g., suppose you are given the string "120 w 45 st nyc". Pass it into the Google Maps API like so: http://maps.google.com/maps/geo?q=120+w+45+st+nyc and you get this response:

{
  "name": "120 w 45 st nyc",
  "Status": {
    "code": 200,
    "request": "geocode"
  },
  "Placemark": [ {
    "id": "p1",
    "address": "120 W 45th St, New York, NY 10036, USA",
    "AddressDetails": {"Country": {"CountryNameCode": "US","CountryName": "USA","AdministrativeArea": {"AdministrativeAreaName": "NY","Locality": {"LocalityName": "New York","Thoroughfare":{"ThoroughfareName": "120 W 45th St"},"PostalCode": {"PostalCodeNumber": "10036"}}}},"Accuracy": 8},
    "ExtendedData": {
      "LatLonBox": {
        "north": 40.7603883,
        "south": 40.7540931,
        "east": -73.9807141,
        "west": -73.9870093
      }
    },
    "Point": {
      "coordinates": [ -73.9838617, 40.7572407, 0 ]
    }
  } ]
}
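To turn a response like the one above into flat address fields, something along these lines may work. It is only a sketch: it assumes the response keeps the exact nesting shown here (other addresses may nest `Locality` or `PostalCode` differently), it works on a trimmed copy of the response rather than calling the service, and `extract_address` is a made-up helper name.

```python
import json

# Trimmed copy of the response shown above (only the fields used here).
RESPONSE = """
{
  "name": "120 w 45 st nyc",
  "Status": {"code": 200, "request": "geocode"},
  "Placemark": [ {
    "id": "p1",
    "address": "120 W 45th St, New York, NY 10036, USA",
    "AddressDetails": {"Country": {"CountryNameCode": "US", "CountryName": "USA",
      "AdministrativeArea": {"AdministrativeAreaName": "NY",
        "Locality": {"LocalityName": "New York",
          "Thoroughfare": {"ThoroughfareName": "120 W 45th St"},
          "PostalCode": {"PostalCodeNumber": "10036"}}}}, "Accuracy": 8},
    "Point": {"coordinates": [-73.9838617, 40.7572407, 0]}
  } ]
}
"""


def extract_address(response_text):
    """Return a flat dict of address parts from the first placemark, or None."""
    data = json.loads(response_text)
    if data.get("Status", {}).get("code") != 200 or not data.get("Placemark"):
        return None
    placemark = data["Placemark"][0]
    country = placemark["AddressDetails"]["Country"]
    # Assumption: the nesting matches the sample; missing levels yield None.
    area = country.get("AdministrativeArea", {})
    locality = area.get("Locality", {})
    # Coordinates in the sample are ordered longitude, latitude, altitude.
    lon, lat = placemark["Point"]["coordinates"][:2]
    return {
        "formatted": placemark["address"],
        "street": locality.get("Thoroughfare", {}).get("ThoroughfareName"),
        "city": locality.get("LocalityName"),
        "state": area.get("AdministrativeAreaName"),
        "postal_code": locality.get("PostalCode", {}).get("PostalCodeNumber"),
        "country": country.get("CountryNameCode"),
        "lat": lat,
        "lon": lon,
    }


parsed = extract_address(RESPONSE)
```

For the sample input, `parsed["street"]` is "120 W 45th St" and `parsed["postal_code"]` is "10036".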
Score: 7

If you are looking for an address parser with a simple solution, try this:

http://usaddress.codeplex.com/

Good:
1. No database required
2. No internet lookup required
3. Pretty accurate

Bad:
1. Cannot confirm whether it is a real address
2. Only works for US addresses
3. Written in C#; requires .NET 3.5 or above

Score: 4

You could try Experian Address Verification. It has its issues but pretty much works as advertised.

Score: 4

As has been mentioned, this is not a trivial problem. One of the biggest issues, apart from international addresses, is that there is no standard format for addresses, and an address can't tell you whether it's well-formed; i.e., it's not self-validating like a credit card number.
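To make the contrast concrete: a credit card number carries a check digit, so a typo can usually be caught with no outside data. Below is a minimal sketch of the Luhn checksum that card numbers commonly use (the function name `luhn_valid` is just for illustration). Nothing comparable exists for a street address, which is why an external source of truth is needed.

```python
def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum, ignoring spaces and dashes."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


ok = luhn_valid("4111 1111 1111 1111")   # well-known test number
bad = luhn_valid("4111 1111 1111 1112")  # last digit altered
```

Here `ok` is True and `bad` is False. Passing the checksum only means the number is well-formed, not that the card exists; for addresses, even that first check is unavailable.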

Because of this, you have to rely on an external source of truth to ensure the address is real. This is where an address verification service comes into the mix. Depending upon your business needs and application requirements, you may be looking at a one-time "batch" scrub of your address list, or perhaps a realtime/live address validation service. There are a number of good providers (which vary in cost) that can easily solve this problem.

I should mention that I'm the founder of SmartyStreets. We do CASS-certified address verification. We'll take your unformatted/raw addresses and turn them into addresses which have been cleaned, standardized, and verified/confirmed. Depending on the size of your list, the cost is usually only a few dollars and the turnaround time is nearly instant, usually a few minutes.

Score: 3

Since there is no trivial solution, as @duffymo said, the next best thing might be to reconsider the design. If it's a user form, make a compromise and let the user fill in the separate fields. If you are retroactively parsing data, then use a very strict regex to parse addresses based on some criterion (for example, country is US). Then make a second pass at the ones that are left over, and so on. I have taken this approach and it's the only reliable approach.
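A rough sketch of that first strict pass, in Python. The pattern `STRICT_US` and the helper `first_pass` are hypothetical and deliberately narrow: they only accept the shape "number street, city, ST 12345" (with an optional ZIP+4) and push everything else to a leftover list for the next pass.

```python
import re

# Pass 1: strict, illustrative pattern for "number street, city, ST 12345[-6789]".
STRICT_US = re.compile(
    r"^(?P<number>\d+)\s+(?P<street>[A-Za-z0-9 .]+?),\s*"
    r"(?P<city>[A-Za-z .]+?),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
)


def first_pass(lines):
    """Split lines into strictly parsed addresses and leftovers for a later pass."""
    parsed, leftover = [], []
    for line in lines:
        match = STRICT_US.match(line.strip())
        if match:
            parsed.append(match.groupdict())
        else:
            leftover.append(line)
    return parsed, leftover


addresses = [
    "120 W 45th St, New York, NY 10036",
    "120 w 45 st nyc",
    "1600 Pennsylvania Ave NW, Washington, DC 20500-0003",
]
parsed, leftover = first_pass(addresses)
```

With this input, two addresses are parsed and "120 w 45 st nyc" lands in `leftover`, where a second, differently targeted pattern (or a manual review) can pick it up. Keeping each pass strict trades coverage for fewer false positives.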

Another design problem with taking a generic regex approach is that it will generate false positives for bad addresses. If you are sending out snail mail to these people, it will end up bouncing, and you'll either have more work on your hands trying to sort out which ones came back or continue to send mail to erroneous addresses.

Score: 3

I tried RecogniContact recently. It is a Windows COM component that parses US and European addresses. You can test it from the website.

http://www.loquisoft.com/index.php?page=8

Score: 0

For Canadian addresses, I have used one called Street Perfect. We had to wrap the C++ code in some .NET to make it reusable for our purpose, but that was fairly easy.
