Write the Bullet

Human, Nerd, Oregonian

3 Ways Sites Know You’re Giving Them Fake Information


Forms on Web sites can be thorough and demanding. They require certain details to look accurate, and they won’t rest until they get them. But how does a Web site know that you’re providing a fake phone number, or confirm that your credit card number is valid, before you even submit the form?

Phone Numbers

Phone numbers are one of the many details you’re probably reluctant to give away to a stranger online. As with many pieces of information, someone with the right tools can use a phone number to look up related data, then search on that data in turn, and may eventually arrive at your identity, demographic information, home address, employer, and so on.

When someone doesn’t believe it’s necessary to give away their phone number, they may decide to use a fake phone number. They try “555-1234” and discover that the form knows!—well, sort of.

Phone numbers in 25 North American countries and territories are assigned under the North American Numbering Plan, or NANP. The plan divides the region into Numbering Plan Areas, each with a three-digit area code; each Central Office (exchange) is given a three-digit prefix; and finally, there is a four-digit subscriber number. The result is a ten-digit telephone number that looks something like:

(123) 555-4321

So how does a form know that’s an invalid number? NANP has rules about what phone numbers are valid. For example:

  • The first digit in an area code or prefix can only be 2–9.
  • The next two digits in an area code or prefix can be any digit from 0–9. (However, NANP isn’t assigning area codes with a 9 as the second digit.)
  • In geographic area codes (like 503 or 971, not like 800 or 888), a prefix cannot be N11 (where N is any digit from 2–9).

So all the validation program needs to do is check that every one of these rules holds; if any rule is broken, it just says “Sorry, that’s not valid.”
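Here is a minimal sketch of what such a check might look like in Python. The function name and the short list of toll-free codes are my own assumptions for illustration (the list is not exhaustive), and a real validator would likely apply more NANP rules than the three above.

```python
import re

# Assumption: a small, non-exhaustive set of non-geographic (toll-free) area codes.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}


def is_plausible_nanp(number: str) -> bool:
    """Return True if `number` passes the three NANP rules listed above."""
    digits = re.sub(r"\D", "", number)  # strip everything except digits

    # Accept an optional leading country code "1".
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return False

    area, prefix = digits[:3], digits[3:6]

    # Rule 1: the first digit of the area code and of the prefix must be 2-9.
    if area[0] in "01" or prefix[0] in "01":
        return False

    # Rule 2: area codes with 9 as the second digit are not being assigned.
    if area[1] == "9":
        return False

    # Rule 3: in geographic area codes, the prefix cannot be N11.
    if area not in TOLL_FREE_AREA_CODES and prefix[1:] == "11":
        return False

    return True
```

With this sketch, "(123) 555-4321" is rejected because its area code starts with 1, and "555-1234" is rejected because it has no area code at all.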

But there are limits to these programs. While it’s possible to make a system that will call or text you to verify the validity and ownership of the number (in fact, many services like Twitter or Facebook will confirm the number to make sure it’s really yours), it’s fairly uncommon and can be costly depending on the sheer volume of form submissions.

You can meet the requirements of NANP to produce a possible number, but there’s no guarantee that the number you invent doesn’t belong to a real person. When in doubt, use the company’s own telephone number. The form validator probably doesn’t check for that.

Credit & Debit Cards

Here’s another one that may surprise you. Credit and debit card numbers can be validated without contacting your card issuer’s payments system. Do note that a valid number and a number backed by an account are two separate things.

Let’s start with those forms that automatically detect which network your card is on before you’ve even finished typing your number. That’s actually very simple: the first six digits of your card number aren’t unique to you. They’re the Issuer Identification Number (IIN), and they show who issued your card. Each network is made up of many issuers, but the network itself can often be identified from just the first digit or two. For example, any card number starting with 51–55 is a MasterCard; with 4, a Visa; with 65, a Discover card. (There are other numbers for these and other networks.)
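As a rough sketch, prefix detection can be as small as the Python below. The function name is hypothetical, and it only knows the three example prefixes mentioned above; a real form would carry a much longer table.

```python
def detect_network(card_number: str) -> str:
    """Guess the card network from the leading digits typed so far."""
    digits = "".join(ch for ch in card_number if ch.isdigit())

    # Visa: starts with 4.
    if digits.startswith("4"):
        return "Visa"

    # MasterCard: first two digits are 51 through 55.
    if len(digits) >= 2 and 51 <= int(digits[:2]) <= 55:
        return "MasterCard"

    # Discover: starts with 65.
    if digits.startswith("65"):
        return "Discover"

    # Not enough digits yet, or a prefix this sketch doesn't know about.
    return "unknown"
```

Because it only looks at the leading digits, it can answer while you are still typing: `detect_network("42")` already returns "Visa".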

Now on to the next part: How can a Web form guess if you’re putting in a card number that could be right? After all, it’s a waste of time (and possibly money, through payment gateway fees) to check every card number with the issuers for validity; there should be some sort of sieve to cut most errors or egregious brute-force attempts out.

Your credit or debit card number has a bit of mathematical magic in it, intended to catch those errors. The same trick is used in all kinds of systems to verify that a piece of information hasn’t been corrupted; for example, when someone accidentally fat-fingers the wrong number during data entry.

In a credit or debit card number, the last digit is the check digit:

XXXX-XXXX-XXXX-XXXC (VISA, MasterCard)

The check digit is calculated from all the other digits of the card number using something called the Luhn algorithm, also called the mod 10 algorithm; a validator then runs the same arithmetic over the whole number, check digit included. It’s designed to protect against just those kinds of errors. To show how the validation works, I’ll check an account number.

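The card number I’m going to check is 4242-4242-4242-4242. At first glance it seems like this wouldn’t be valid, but remember: a computer doesn’t see the number the same way we do. The card number validator is looking for an answer of 0; if it’s 0, it’s a valid number. If it’s any other digit, the number is bad. To find that answer, it first works backwards from the check digit and doubles every second digit, starting with the digit just to the left of the check digit. (If doubling gives a two-digit result, the validator adds those two digits together, so 7 × 2 = 14 becomes 1 + 4 = 5. That doesn’t come up here, because every doubled digit is a 4.)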

4 2 4 2 4 2 4 2 4 2 4 2 4 2 4 2

8 2 8 2 8 2 8 2 8 2 8 2 8 2 8 2

Then, the validator will sum all the digits:

8 + 2 + 8 + 2 + 8 + 2 + 8 + 2 + 8 + 2 + 8 + 2 + 8 + 2 + 8 + 2 = 80

Finally, the validator takes the sum modulo 10. That’s a fancy way of saying divide 80 by 10 and look at the remainder. Because 80 divides evenly by 10, the remainder is 0. The card number 4242-4242-4242-4242 is valid. It’s not associated with a real account; in fact, it’s a common test number for services like Stripe.

If the digits had totaled 144, the remainder would have been 4 (10 goes into 144 only 14 whole times, leaving 4); because 4 isn’t 0, that card number would be invalid.
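The whole walkthrough fits in a few lines of Python. This is a sketch with a function name of my own choosing, not any particular site’s validator, but the arithmetic is the standard Luhn check.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Return True if the digits of `card_number` pass the Luhn (mod 10) check."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if not digits:
        return False

    total = 0
    # Walk from right to left; position 0 is the check digit itself.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            # Double every second digit, starting just left of the check digit.
            digit *= 2
            if digit > 9:
                # Same as adding the two digits of the product (14 -> 1 + 4 = 5).
                digit -= 9
        total += digit

    # A valid number leaves a remainder of 0.
    return total % 10 == 0
```

Running `luhn_is_valid("4242-4242-4242-4242")` sums to 80 and returns True; change the last digit to 3 and the sum becomes 81, so the check fails.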

Combined, these two checks (the Luhn algorithm and a list of accepted card issuers’ IINs) are a sufficient and computationally inexpensive way to screen card numbers before the form is even submitted and before the final check between the Web site’s payment gateway and the card issuer is conducted.

Addresses

This one’s actually a bit tricky. Usually it relies on having a collection of real names for streets, cities, and states/provinces, alongside possible addresses. Because of the scale of that information, this kind of verification (or at least the data) is typically outsourced to a third party. These services aren’t foolproof. There may be houses at 700 Main Street and 750 Main Street, but not at 725 Main Street. If 725 Main Street were entered into a form, the validator might think it’s a real address because 700 and 750 are real, or because a whole range of addresses (700–800) is treated as valid.
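To make the false positive concrete, here is a toy Python sketch of a range-based check. The lookup table and function name are entirely hypothetical; real address services use far larger data sets and their own matching rules.

```python
# Hypothetical reference data: street name -> list of (low, high) house-number ranges.
STREET_RANGES = {
    "MAIN STREET": [(700, 800)],
}


def looks_like_real_address(house_number: int, street: str) -> bool:
    """Return True if the house number falls inside a known range for the street."""
    ranges = STREET_RANGES.get(street.strip().upper(), [])
    return any(low <= house_number <= high for low, high in ranges)
```

Here `looks_like_real_address(725, "Main Street")` returns True even though no such house exists, simply because 725 sits between 700 and 800.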

In Short

These forms don’t actually know whether information is real. There are ways to check whether information is genuine by using it: calling you, checking with your card issuer, or sending a postcard to your address. Those methods are cost-prohibitive for most Web sites, so they rely on the methods you’ve seen here (and many others!) to essentially guess whether the information you supply is genuine.

Some terms of service will actually prohibit supplying false information under penalty of perjury, and until there is a change in how those implied contracts are handled, someone may actually end up in trouble for entering a fake phone number. More often than not, a Web site legitimately needs your information to process a purchase, to verify your identity, or to complete a task you’ve given it. Unfortunately, sometimes a site may just be collecting data about you to sell you things or sell others your things.

One last thing: validation is not verification. Just because a number can be issued or used doesn’t mean it’s in service.