Editorial - Bannerline Communications

 


How they figure out who is a "live" one

Part 1
Internet Marketing 101

The first thing to understand about the world in general and the Internet in particular is TANSTAAFL, an acronym made famous by Robert A. Heinlein in his book "The Moon is a Harsh Mistress." (page 129 in my pocketbook version.) In the words of his character, Mannie, "Oh, 'tanstaafl.' Means 'there ain't no such thing as a free lunch'. And isn't," ...pointing to a FREE LUNCH sign across the room, "or these drinks would cost half as much. Was reminding (her) that anything free costs twice as much in long run or turns out to be worthless."

In the pre-Internet/pre-computer world most of us figured out how the marketing droids manipulate us into telling them about ourselves so they can barrage us with advertising meant to part us from our hard-won dollars.

We know that if we fill in that "free draw" card at our local supermarket we're likely to get an offer from the local fitness center or vacuum salesman, probably by phone and probably at dinner time; but we fill it in anyway because there's a chance we might win and we can always tell the phone-droid to "f$%&-off." Of course they can also just send us "junk mail" to the address we've handily given them because they know we wouldn't lie or we couldn't collect the prize if we won.

Over the past ten or so years this "fill in the free offer card" style of gathering information about you has been extended to, and in many ways perfected on, computers and the Internet. Today most people don't even know that they are leaving valuable information behind in their journey through Cyberspace.

Purchasing Habits for Sale

The massive growth in the capabilities of computers and their storage systems has meant that records of every single purchase transaction you've made using something that identifies you can be (and has been) tracked. It started with the credit card companies and the big chain stores with their automated cash register systems. Prior to the automation, the transactions were only tracked for cash audit purposes using the "audit" tape (second roll in the cash register, kept for the tax and corporate auditors so they could track fraud and theft) so were not easily analyzed for anything but the totals and tax. Even at the beginning of the credit card revolution, the use of the "flimsy" card slip meant difficulty in after-the-fact purchase analysis other than dollar amounts vs. month, or at best week, since the retailer deposit and card-issuer data entry cycle was up to 10 days.


Today, however, you pass the clerk your card and in many cases either they scan it solely with the store's cash register, which is hooked directly to the credit card company, or they scan it twice - once for their own records and once to actually deal with the money transaction (watch out if they scan it three times - the extra may be for fraud). You don't really think they need your card number in their system for security purposes, do you? Of course not - the card company indemnifies them as soon as the card is validated online, and the way the systems are set up they don't even need your signature anymore (did you sign for the last gasoline purchase you made at a pay-at-the-pump outlet?). Matched with the record of SKUs (stock keeping units - the number on the item, the bar code number, etc.), the card numbers make an incredibly informative record of what you and the rest of their customers purchase.

Now admittedly, most of the credit card companies don't like them tracking your name from your credit card number, but just the fact that they know that 4503....... comes in each month and spends an average of $100 in the tool department is useful. The major department stores that run their own credit cards don't even have to worry about tying the number to a name (and address, phone number, etc.) since you gave them that and the right to use it when you signed up for the card. Gee, how did they know I'm a tool junkie - they're always sending me flyers for their next tool sale?
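
For the techies, here is a minimal sketch of how little it takes to turn raw register records into that "spends an average of $100 in the tool department" kind of statement. Everything in it is an assumption made for illustration: the card tokens, months, departments and dollar amounts are invented, and a real retailer's system would certainly look different.

```python
from collections import defaultdict

# Hypothetical point-of-sale records: (card token, month, department, dollars).
# The tokens and amounts are invented; a real system would key on the card number.
transactions = [
    ("card-A", "2003-10", "tools", 120.00),
    ("card-A", "2003-11", "tools", 95.00),
    ("card-A", "2003-12", "tools", 85.00),
    ("card-A", "2003-12", "grocery", 40.00),
    ("card-B", "2003-12", "grocery", 60.00),
]

# Total the spending per (card, department) for each month.
monthly = defaultdict(lambda: defaultdict(float))
for card, month, department, amount in transactions:
    monthly[(card, department)][month] += amount

# Report the average monthly spend for each card in each department.
for (card, department), by_month in sorted(monthly.items()):
    average = sum(by_month.values()) / len(by_month)
    print(f"{card} spends an average of ${average:.2f} a month in {department}")
```

No name is needed anywhere in that loop; the card token alone is enough to build the habit profile.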

For those stores that don't run their own credit cards (and even for those that do since many of them will accept other cards as well and they want to track Everything!) the "affinity card" was invented. It started out with those little "stamp" cards you'd get from a retailer every time you purchased a pound of coffee or some other commodity. You kept coming back so you'd eventually get your "free" pound.

It progressed to things like the "Air Miles" (www.airmiles.ca) card which the retailers who couldn't afford to run their own credit cards buy into in return for accurate purchasing statistics on the customers who use such a card. Oh Goody! We get "free" air miles we can use to go for a holiday - eventually.

With the increasing use of standardized computers and networking in stores, even small stores and chains could afford to add their own affinity program - starting with the food stores and working out to all the rest of the commodities. It is to the point where personally I get a backache from the size of my wallet due to the number of such cards I'm expected to carry. I'm pushing back, but that is for another section.

Browsing/Viewing Habits for Sale

The same things done in the retail trade apply in spades in the world of the Web. Not only do the e-commerce vendors know what you bought (or thought about buying), they know which pages you visited, how long you were there, and what advertising and other stuff you had in front of you prior to your choice. It's kind of like the local food store having a GPS system on your shopping cart hooked to a TV camera that watches you as you shop - and tracking your progress through the store. Note that you might in fact have been subject to such a survey unobtrusively as someone watched you either in person or over closed circuit TV. If matched to your credit card or affinity card information at the checkout, they would even know who you were. Most stores don't do this very often because it costs quite a bit - but web sites keep the information as a matter of course since it is generated as part of the process of handing you the pages you view!

host213-122-57-44.in-addr.btopenworld.com - - [28/Dec/2003:12:44:24 -0800] "GET /icons/camoglaze.jpg HTTP/1.1" 200 1443 "http://www.mystae.com/reflections/vietnam/proudmary2.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; {6A8348CA-A13E-4FFD-B10C-61D09B44A036})" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /restricted/streams/scripts/machine.html HTTP/1.1" 200 24433 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /icons/bluebg.jpg HTTP/1.1" 200 4088 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /icons/hr.jpg HTTP/1.1" 200 2542 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /icons/zulubg.jpg HTTP/1.1" 200 3393 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /icons/amazon7.gif HTTP/1.1" 200 2443 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] "GET /icons/yline.gif HTTP/1.1" 200 419 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" www.mystae.com

A section of the log from our web server, generated as I write this.
It shows the address of the requestor, what they asked for, which browser and even which operating system they're using, and the time and date.
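
To show how easily those fields come apart, here is a rough sketch in Python. It assumes the lines follow the common "combined" log layout with the site name tacked on the end, which is what the sample lines above look like; the pattern and the function name are my own illustration, not what any particular analysis package uses.

```python
import re

# Combined-style log line plus a trailing site (virtual host) field,
# matching the layout of the sample lines. The pattern is an assumption.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?: (?P<vhost>\S+))?'
)

def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

sample = (
    'hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] '
    '"GET /icons/hr.jpg HTTP/1.1" 200 2542 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" '
    'www.mystae.com'
)

fields = parse_log_line(sample)
for name in ("host", "time", "request", "status", "referrer", "agent"):
    print(f"{name:9}: {fields[name]}")
```

Once every line is a dictionary like this, counting, sorting and cross-referencing are a few more lines of code each.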

The logs can even track what site you visited before you come to the one you're viewing. This can include what search criteria you used at your favourite search engine. All of this can be analyzed and served up as statistics in aggregate or even individual by individual (although that's not typical on a busy site - just too much detail). We do this for David's own site www.centa.com so that we can judge what are the "hot" topics as time goes by. Of course we don't know who you are unless you've actually subscribed to the mail-list.

Search Query Report
This report lists which queries people used in search engines to find the site.

Listing queries, sorted by the number of requests.
reqs: search term
----: -----------
3: income tax immigration
2: canadian tax rates
2: immigration department of sydney to canada
2: revenue canada race horse
2: canadians working in usa social security taxes
2: canadian citizen living in us need to pay tax in canada??
2: americans living in canada
2: canadian tax us rental

This analysis was produced by analog 5.32.
Running time: 1 second.

A piece of a daily report - note that only the top 10 are shown.
There are actually several hundred such phrases in all on this report.

h24-80-116-254.sbm.shawcable.net - - [27/Dec/2003:09:13:43 -0800] "GET / HTTP/1.1" 200 41273 "http://www.google.ca/search?hl=en&ie=UTF-8&oe=UTF-8&q=income+tax+immigration&btnG=Google+Search&meta=" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20030925" www.centa.com

A log line showing the key words used in searching - in this case using Google.
The portion showing the site the request came from is the "referrer" section
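
Pulling the search phrase back out of that referrer takes only a few lines. The sketch below assumes the search engine carries the phrase in a parameter named "q", as Google does in the line above; other engines may use other names, and the function name is simply one I made up.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

def search_terms(referrer):
    """Return the search phrase carried in a referrer URL, or None."""
    # Assumption: the phrase travels in a "q" parameter, as in the Google example.
    values = parse_qs(urlsplit(referrer).query).get("q")
    return values[0] if values else None

# The referrer from the log line above, plus a "-" entry (no referrer sent).
referrers = [
    "http://www.google.ca/search?hl=en&ie=UTF-8&oe=UTF-8"
    "&q=income+tax+immigration&btnG=Google+Search&meta=",
    "-",
]

# Tally the phrases, most requested first, the way the report lists them.
counts = Counter(t for t in map(search_terms, referrers) if t)
for phrase, reqs in counts.most_common():
    print(f"{reqs}: {phrase}")
```

Run over a full day of log lines instead of two, the same tally would give a listing much like the Search Query Report.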

As you can see in the box above, lots of interesting things can be read from the logs. As you can also see, even the old AMD 850 that hosts this site (along with several hundred more, some of which are MUCH larger) took less than a second to produce the report; a report that runs to about 129k of text plus graphs for this one day - you're only seeing one piece of one section. The same report is done as a monthly and yearly aggregate too. We don't track individual users' paths through the site and we use "Open Source" log analysis software, so the report is pretty basic. You can bet that the major sites collect far more data and do a far better job of analyzing it.

Note that even after this analysis is done the original log lines are still available for further analysis if needed. The lines for this year for the CEN-TA site total to about 44 Megabytes of compressed files. Even our largest site which gets over a million file views a day runs to only about 12 Gigabytes for the year. With disk space at about $1/Gig these days, storing them online is trivial.
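
A quick back-of-the-envelope check on those numbers, as a sketch only. It treats "over a million file views a day" as exactly one million and "about 12 Gigabytes" as exactly 12, so the per-request figure is an upper bound and the rest is approximate.

```python
# Figures quoted in the text; "a million views a day" is treated as exactly 1,000,000.
views_per_day = 1_000_000
compressed_bytes_per_year = 12 * 10**9   # about 12 Gigabytes
dollars_per_gigabyte = 1.0               # "about $1/Gig"

views_per_year = views_per_day * 365
bytes_per_view = compressed_bytes_per_year / views_per_year
storage_cost = (compressed_bytes_per_year / 10**9) * dollars_per_gigabyte

print(f"about {bytes_per_view:.0f} compressed bytes per logged request")
print(f"about ${storage_cost:.2f} of disk for the year")
```

In other words, remembering each thing you looked at costs the site a few dozen bytes and a vanishingly small fraction of a cent.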

The point is that the technology to track literally everything you do when sitting in front of your computer and interacting with it and the Internet's Web is available, and not all that expensive. Even at the best, you leave tracks in various computers as you browse; mostly "anonymous" but valuable none the less.

Taking Away the Mask of Anonymity

What David first asked me about - whether or not I'd seen a picture from the web page he'd sent me - is all about unmasking your anonymity. Much of what I've detailed in the previous section can only tell what computer address you were at when you looked at the pages. For most people this changes each day or so, so there is no real correlation to a person. (I have a fixed IP address, which adds spice to the problem, as I'll tell you about below.)

In some cases this unmasking is subtle. In others it is blatant. In Canada after January 1, 2004 it had better be "by the book" or somebody could be in trouble; at least somebody other than you, the page viewer. Of course my opinion is that you're potentially in trouble no matter what you do.

I don't mean to sound completely paranoid; I'm not. On the other hand, maybe I (and you) should be. The number of incidents of identity theft and fraud is growing. So too is the number of online scams, spam e-mails, bogus web sites and what have you. They're not yet at the point where I'd call them a real epidemic - at least not for people who know there is no Easter Bunny, Santa Claus, 80% return on investment in a year or $200,000 bonus for getting "my" millions out of Uganda or wherever; in other words for people who have even a modicum of skepticism and common sense. All that is needed is a bit of education on what to watch out for - the subject of this article.

Web bugs

The original reason David asked me to write this article is an example of a "web bug" - a unique URL that is embedded in a message sent to you in some fashion that, when you view the message, confirms that you have done so.

The page David sent me (or caused the web site to send me as if it were from David) was done up in HTML and included a couple of unique image URLs, one of which ended with "__tn_pers2790347040.jpg?BCmegAABvemnfj9H"

If my browser had been set as most of yours are set, the first time this message appeared in my preview pane or was opened by me, the image would have been loaded from the sending website - leaving behind a log record including the full URL. Note that after the image's name (__tn_pers2790347040.jpg) there is a trailing "?" and something (BCmegAABvemnfj9H) that appears to be garbage characters. In fact, the garbage is a unique key to a record in a database that includes the fact that the page was mailed to both me and David, including the time it was sent, and probably linking to all the things that David had done in the session leading up to his sending it.

In this case the bug was attached to a "real" picture. In some cases it is as little as a single pixel (picture element - dot on the screen) so loads "instantly" and doesn't show you anything - but its log record exists in the server none the less.
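
Here is a sketch of what the sending site might do when that image request arrives. I can't see their system, so treat all of it as an assumption: the host name and path are placeholders, and the "database" is a made-up dictionary with invented field names. Only the file name and the trailing key come from the message I received.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the sender's database; field names and values are invented.
SENT_MESSAGES = {
    "BCmegAABvemnfj9H": {"recipient": "reader@example.com", "opened": False},
}

def record_image_request(url):
    """Mark the message tied to the URL's query string as opened; return its record."""
    key = urlsplit(url).query          # everything after the "?"
    record = SENT_MESSAGES.get(key)
    if record is not None:
        record["opened"] = True
    return record

# The host and path are placeholders; only the file name and key come from the message.
url = "http://tracker.example.com/img/__tn_pers2790347040.jpg?BCmegAABvemnfj9H"
print(record_image_request(url))
```

The image itself is beside the point; the lookup on the key is what tells the sender the message reached a live reader.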

Freaky, eh?

And you thought you'd turned off "acknowledge reply request"! (That's the setting that causes an automatic reply e-mail telling the original sender that you've read their message. Some mail agents don't support it well, and most people outside of specific companies refuse to turn it on for privacy reasons, if for no other reason than to deter the spammers.)

We know you've seen our mail!

And in some cases (Windows specifically) because it is actually the main browser engine that interprets the HTML and retrieves the graphic, the sending site has the opportunity to send your computer a "cookie" that continues to identify you if you should again visit the site with your normal browser, even months in the future.

Cookies

When you're just web browsing, one of the ways a web site tracks you as distinct from some other viewer, for a few minutes or forever, is by sending your web browser a unique series of characters (somewhat like the web bug above) that your browser stores for some time, possibly permanently. This "cookie" concept is valuable to you the viewer in some cases - such as when you're working with a web site you've had to log onto with a user ID and password. If it were not for cookies, the otherwise simplistic design of the Hypertext Transfer Protocol would mean you would have to re-logon for each page you wanted to view on the site - not something most would put up with.

The problem is that this viewer-helping web extension also can help the web site keep track of you and your travels through the site (or even across sites).

Unless you have told your web browser not to store cookies (see Pushing Back) a web site can deposit a cookie on your computer and later check to see if it is there. The cookie can contain either direct data or a key (like the one above on the image tag) that can be used to pull a record from a database and add more detail to it. At minimum, the cookie can be used to track which pages you've visited, in what order and for how long during the current viewing session with the site. In extreme cases, the cookie can allow the system to track your use of any web site that uses a common information database (and there are many such agglomerated site systems) and tie the information into answers you might give to seemingly innocuous "surveys" and questionnaires (see Verifications below) as well as purchases - eventually building up a wealth of data on your personal and financial life. In some cases enough is learned that the web site can tie their information to your credit record (even if you don't give them a credit card number or your SIN/SSN.)
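
The deposit-and-check cycle can be sketched in a few lines using Python's standard cookie handling. This is an illustration under my own assumptions, not any real site's code: the cookie name "visitor", the one-year lifetime, the page paths and the in-memory profile store are all made up.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory profile store: cookie value -> list of pages viewed.
PROFILES = {}

def handle_request(path, cookie_header=""):
    """Log the page against the visitor's ID; return a Set-Cookie value for new visitors."""
    cookie = SimpleCookie(cookie_header)
    set_cookie = None
    if "visitor" in cookie:
        # Returning visitor: the browser handed our key back to us.
        visitor_id = cookie["visitor"].value
    else:
        # New visitor: mint a random key and ask the browser to keep it for a year.
        visitor_id = secrets.token_hex(8)
        fresh = SimpleCookie()
        fresh["visitor"] = visitor_id
        fresh["visitor"]["max-age"] = 60 * 60 * 24 * 365
        set_cookie = fresh["visitor"].OutputString()
    PROFILES.setdefault(visitor_id, []).append(path)
    return visitor_id, set_cookie

first_id, header = handle_request("/tools/sale.html")
second_id, _ = handle_request("/tools/drills.html", f"visitor={first_id}")
print(header)
print(PROFILES[first_id])
```

The cookie holds nothing but a key; the pages viewed, and anything else learned along the way, pile up on the site's side under that key.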

One thing to note about this and many of the other methods used by legitimate companies to collect information on you: it is not looked at personally by anyone except in very extreme cases. The data is massaged and manipulated by programs which today bear a striking resemblance to Artificial Intelligence - with the goal of presenting you with advertising and offers as well as information that the system thinks is most likely to keep you coming back and hopefully to get you to part with some of your hard-earned cash - sell you things and services.

Verifications

I subscribe to a number of "free" magazines. Even though I've been around computers and the Internet for longer than most people my age, I still like to read from paper - a habit I'm working on breaking by adding screen real estate to my system, but which seems to be a losing battle as my eyesight deteriorates with age. For the techies out there, I run my main system with two 19" monitors, each running at 1600x1400 - problem is I have the font sizes cranked up to the point where I might just as well be running them at 800x600 when I'm actually reading.

Anyway, back to the free magazines. Every year or so, each of the magazines sends me a special issue wrapped in a verification questionnaire. Prior to the Internet I'd fill these in and either snail-mail them back or fax them back. Today however, all of them have fill-in web forms for this purpose; should be easier, right?

Well, yes it is easier. The problem is that the magazines get their advertising dollars based upon audited subscription statistics so they can't just print up thousands of copies and send them out to random people; they have to know that you "qualify" and are a real person. With the forms they send, there is a spot for a signature. Unfortunately, there is no way of signing a web fill-in form (at least not one they will accept) so the auditors (or the magazines' programmers maybe) came up with the concept of a "verification question" - something that is of a relatively personal nature that a random person probably would not know about you - kind of like asking your mother's maiden name when talking to the government about your passport or driver's license. (I have issues with this too but that's for another time)

The problem is that it seems that many/most of the magazines I get either have the same software for their questionnaires or use the same service provider to manage their subscriptions. Some of them even send me to the same web site but different sub-directory, although most have something under their own web name.

The curious thing is that all of these magazines have a similar set of questions they ask for "verification purposes". The questions seem to change every time I renew for a particular magazine but over all of them the questions in total remain fairly static:

- colour of your hair?
- birth city
- colour of eyes
- favourite colour
- favourite pet's name
- month of birth
- day of birth
- year of birth
- colour of vehicle
- etc. etc. etc.

Notice anything? Each of the questions in itself doesn't give any particularly private information, but all of them in total do - and these are just a sampling of the ones I get. I know for a fact that at least 5 of the magazines I get are from the same publisher - they cross advertise and the web site is the same for the renewals; yet each asks a different question each year so the total of the information they can gather is large.
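
To make the point concrete, here is a sketch of how a shared back end could pile separate answers into one profile. It is purely illustrative: the subscriber ID, the magazine names and every answer are invented, and I have no inside knowledge of how any publisher actually stores this.

```python
# Hypothetical answers collected by three renewal forms that share one back end.
# The subscriber ID, magazine names and answers are all invented for illustration.
renewals = [
    ("subscriber-1001", "magazine A", {"birth city": "Moose Jaw"}),
    ("subscriber-1001", "magazine B", {"month of birth": "March", "day of birth": 14}),
    ("subscriber-1001", "magazine C", {"year of birth": 1950, "colour of vehicle": "blue"}),
]

# Fold every form's answers into a single record per subscriber.
profiles = {}
for subscriber, _magazine, answers in renewals:
    profiles.setdefault(subscriber, {}).update(answers)

print(profiles["subscriber-1001"])
```

Three harmless-looking forms, one question or two apiece, and the record already holds a full date of birth and a birthplace.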

Of course I caught onto this years ago and have instituted my own "Privacy Policy" which I'll tell you about in another section. In general I have a set of answers that I use consistently but which are not even close to the "truth".

Surveys, Questionnaires and Stealth Questions

Several of the web sites I visit regularly have "informal polls", questionnaires, and other information gathering means. The magazine sites in the previous section all ask information about the kinds of business I do, including dollar volumes, projections, etc. In their case, this is to allow them to decide if I "qualify" as someone they want to send their "free" magazine to. At least the magazine publishers are fairly up front about it; other sites are not.

If you do any major browsing on the Web I'm sure you've come across sites that ask you questions in order to gain access to some of their areas. The questions can include personal information, even if cloaked as a range of values (Age: 18-25, 26-35, ...), but over time the data can become alarmingly precise. If you are asked the same question with slightly different ranges, the computer can narrow in on the exact answer by detecting when you move from one range to another (18-25, 19-30, 24-36, 26-35 - if you are 25 you'll end up in the first, second and third but not the fourth).
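
The narrowing trick is just a matter of intersecting the ranges you picked and striking out the ones you didn't. A minimal sketch follows, using the ranges from the example above; the function name and the 0-150 starting bounds are my own assumptions.

```python
def narrow(observations):
    """Combine (low, high, chosen) observations into the list of ages still possible.

    chosen=True means the visitor picked that range; False means they did not.
    """
    low, high = 0, 150                     # assumed outer bounds for a human age
    excluded = []
    for lo, hi, chosen in observations:
        if chosen:
            low, high = max(low, lo), min(high, hi)
        else:
            excluded.append((lo, hi))
    return [age for age in range(low, high + 1)
            if not any(lo <= age <= hi for lo, hi in excluded)]

# The ranges from the text, answered by someone who is 25.
answers = [(18, 25, True), (19, 30, True), (24, 36, True), (26, 35, False)]
print(narrow(answers))
```

With those four questions the site is already down to two candidate ages, and one more well-chosen range would settle it.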

The fact that you choose a particular button to go to the next page can be informative; [English] [French] being one of the most common in Canada. In fact, your choice of click-through advertising is probably kept along with the rest of your profile. Did you click on the ad for music videos or tools? The next time you're presented with a couple of ads they may be specifically placed to determine your preference in tool or music artist, depending upon which you chose first.

You should also know that the same things apply to the information you fill into the software registration forms on your computer when you add something new. You're asked similar things each time you get an upgrade in some cases and of course when the inevitable happens and you have to re-install everything again.

Against all of these techniques, what can you do? You want to use the services, and in many cases don't mind that they are going to try to sell you things. You just don't want to give away enough that "they" can be more than minorly annoying if you can possibly help it.

On the other hand, you don't want to get caught by the criminal side of the computer revolution either. Information you might actually be comfortable with giving to a company you know and trust might be just the thing an identity thief needs to get a new credit card issued with your name on it.
Somewhere you and the businesses and sites you deal with have to strike a balance that both can be comfortable with. The problem is that the guys at the other end of your Internet connection have all the tools and databases. 


 

Copyright© -2008 Bannerline