Part 1
Internet Marketing 101
The first thing to understand about the
world in general and the Internet in particular is TANSTAAFL, an
acronym made famous by Robert A. Heinlein in his book "The Moon is a
Harsh Mistress." (page 129 in my pocketbook version.) In the words of his
character, Mannie, "Oh, 'tanstaafl.' Means 'there ain't no such thing
as a free lunch'. And isn't," ...pointing to a
FREE LUNCH sign across the room, "or
these drinks would cost half as much. Was reminding (her) that anything
free costs twice as much in long run or turns out to be worthless."
In the pre-Internet/pre-computer world
most of us figured out how the marketing droids manipulate us into
telling them about ourselves so they can barrage us with advertising
meant to part us from our hard-won dollars.
We know that if we fill in that "free
draw" card at our local supermarket we're likely to get an offer from the
local fitness center or vacuum salesman, probably by phone and probably
at dinner time; but we fill it in anyway because there's a chance we
might win and we can always tell the phone-droid to "f$%&-off." Of course
they can also just send us "junk mail" to the address we've handily given
them because they know we wouldn't lie or we couldn't collect the prize
if we won.
Over the past ten or so years this "fill
in the free offer card" style of getting information about you has been
extended to, and in many ways perfected using, computers and the
Internet. Today most people don't even know that they are leaving
valuable information behind in their journey through Cyberspace.
Purchasing Habits for Sale
The massive growth in the capabilities of
computers and their storage systems has meant that records of every
single purchase transaction you've made using something that identifies
you can be (and has been) tracked. It started with the credit card
companies and the big chain stores with their automated cash register
systems. Prior to the automation, the transactions were only tracked for
cash audit purposes using the "audit" tape (second roll in the cash
register, kept for the tax and corporate auditors so they could track
fraud and theft) so were not easily analyzed for anything but the totals
and tax. Even at the beginning of the credit card revolution, the use of
the "flimsy" card slip meant difficulty in after-the-fact purchase
analysis other than dollar amounts vs. month, or at best week, since the
retailer deposit and card-issuer data entry cycle was up to 10 days.
Today however, you pass the clerk your card and in many cases either they
scan it solely with the store's cash register, which is hooked directly
to the credit card company, or they scan it twice - once for their own
records and once to actually deal with the money transaction (watch out
if they scan it three times - the extra may be for fraud). You don't
really think they need your card number in their system for security
purposes do you? Of course not - the card company indemnifies them as
soon as the card is validated online and they don't even need your
signature anymore the way the systems are set up (did you sign the last
gasoline purchase made at the pay-at-the-pump outlet?). Matched with the
record of SKUs (stock keeping units - the number on the item - the bar
code number, etc.) these make an incredibly informative record of what
you and the rest of their customers purchase.
Now admittedly, most of the credit card
companies don't like them tracking your name from your credit card
number, but just the fact that they know that 4503....... comes in each
month and spends an average of $100 in the tool department is useful. The
major department stores that run their own credit cards don't even have
to worry about tying the number to a name (and address, phone number,
etc.) since you gave them that and the right to use it when you signed up
for the card. Gee, how did they know I'm a tool junkie - they're always
sending me flyers for their next tool sale?
For those stores that don't run their own
credit cards (and even for those that do since many of them will accept
other cards as well and they want to track Everything!) the "affinity
card" was invented. It started out with those little "stamp" cards you'd
get from a retailer every time you purchased a pound of coffee or some
other commodity. You kept coming back so you'd eventually get your "free"
pound.
It progressed to things like the "Air
Miles" (www.airmiles.ca) card which the retailers who couldn't afford to
run their own credit cards buy into in return for accurate purchasing
statistics on the customers who use such a card. Oh Goody! We get "free"
air miles we can use to go for a holiday - eventually.
With the increasing use of standardized
computers and networking in stores, even small stores and chains could
afford to add their own affinity program - starting with the food stores
and working out to all the rest of the commodities. It is to the point
where personally I get a backache from the size of my wallet due to the
number of such cards I'm expected to carry. I'm pushing back, but that is
for another section.
Browsing/Viewing Habits for Sale
The same things done in the retail trade
apply in spades in the world of the Web. Not only do the e-commerce
vendors know what you bought (or thought about buying), they know which
pages you visited, how long you were there, and what advertising and
other stuff you had in front of you prior to your choice. It's kind of
like the local food store having a GPS system on your shopping cart
hooked to a TV camera that watches you as you shop - and tracking your
progress through the store. Note that you might in fact have been subject
to such a survey unobtrusively as someone watched you either in person or
over closed circuit TV. If matched to your credit card or affinity card
information at the checkout, they would even know who you were. Most
stores don't do this very often because it costs quite a bit - but web
sites keep the information as a matter of course since it is generated as
part of the process of handing you the pages you view!
host213-122-57-44.in-addr.btopenworld.com - - [28/Dec/2003:12:44:24
-0800] "GET /icons/camoglaze.jpg HTTP/1.1" 200 1443
"http://www.mystae.com/reflections/vietnam/proudmary2.html"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
{6A8348CA-A13E-4FFD-B10C-61D09B44A036})"
www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /restricted/streams/scripts/machine.html HTTP/1.1" 200 24433 "-"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR
1.0.3705)" www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /icons/bluebg.jpg HTTP/1.1" 200 4088 "-" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /icons/hr.jpg HTTP/1.1" 200 2542 "-" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /icons/zulubg.jpg HTTP/1.1" 200 3393 "-" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /icons/amazon7.gif HTTP/1.1" 200 2443 "-" "Mozilla/4.0
(compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
www.mystae.com
hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800]
"GET /icons/yline.gif HTTP/1.1" 200 419 "-" "Mozilla/4.0 (compatible;
MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
www.mystae.com
Section of the log from our web server,
generated as I write this.
It shows the address of the requestor, what they asked for, which browser
and even which operating system they're using, as well as the time and date.
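To make those fields concrete, here is a minimal Python sketch that splits one of the log lines above into its parts. The regular expression and the field names (host, time, request and so on) are my own labels for illustration, not anything defined by the server or by the analysis software. It assumes the layout seen in the excerpt, with the site name tacked on at the end of each entry.

```python
import re

# Pattern for the log layout seen in the excerpt, plus the trailing
# site-name field that this particular server appends (assumed optional).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?: (?P<vhost>\S+))?'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# One entry from the excerpt, joined back onto a single line.
sample = (
    'hsdbrg64-110-224-169.sasknet.sk.ca - - [28/Dec/2003:12:44:26 -0800] '
    '"GET /icons/hr.jpg HTTP/1.1" 200 2542 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" '
    'www.mystae.com'
)

fields = parse_log_line(sample)
print(fields["host"], fields["time"], fields["request"], fields["agent"])
```

Once each line is a set of named fields, counting and sorting them is only a few more lines of code, which is why this kind of reporting is so cheap to do.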
The logs can even track what site you
visited before you come to the one you're viewing. This can include what
search criteria you used at your favourite search engine. All of this can
be analyzed and served up as statistics in aggregate or even individual
by individual (although that's not typical on a busy site - just too much
detail). We do this for David's own site
www.centa.com so that we can judge what are the "hot" topics as time
goes by. Of course we don't know who you are unless you've actually
subscribed to the mail-list.
Search Query Report
This report lists which queries people used in search
engines to find the site.
Listing queries, sorted by the number of requests.
reqs: search term
----: -----------
3: income tax immigration
2: canadian tax rates
2: immigration department of sydney to canada
2: revenue canada race horse
2: canadians working in usa social security taxes
2: canadian citizen living in us need to pay tax in canada??
2: americans living in canada
2: canadian tax us rental
This analysis was produced by analog 5.32.
Running time: 1 second.
A piece of a daily
report - note that only the top 10 are shown.
There are actually several hundred such phrases in all on this report.
h24-80-116-254.sbm.shawcable.net - -
[27/Dec/2003:09:13:43 -0800] "GET / HTTP/1.1" 200 41273
"http://www.google.ca/search?hl=en&ie=UTF-8&oe=UTF-8&q=income+tax+immigration&btnG=Google+Search&meta="
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20030925"
www.centa.com
A log line showing
the key words used in searching - in this case using Google.
The portion showing the site the request came from is the "referrer"
section.
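As a rough illustration of how little work it takes to pull the search words out of that referrer, here is a short Python sketch using only the standard library. The function name search_terms is made up for this example, and it assumes the search engine carries the phrase in a "q" parameter as the Google URL in the log line does; other search engines may use a different parameter name.

```python
from urllib.parse import urlparse, parse_qs


def search_terms(referrer):
    """Return the search phrase carried in a referrer URL, or None."""
    parsed = urlparse(referrer)
    params = parse_qs(parsed.query)  # decodes '+' and %xx escapes
    values = params.get("q")         # "q" is an assumption, see above
    return values[0] if values else None


# The referrer from the log line above.
referrer = (
    "http://www.google.ca/search?hl=en&ie=UTF-8&oe=UTF-8"
    "&q=income+tax+immigration&btnG=Google+Search&meta="
)
print(search_terms(referrer))  # prints: income tax immigration
```

Run over every line in a day's log and tallied, this gives a table like the Search Query Report shown earlier.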
As you can see in the box above, lots of
interesting things can be read from the logs. As you can also see, even
the old AMD 850 that this site is hosted on (along with several hundred
more, some of which are MUCH larger) took less than a second to
produce the report; a report that runs to about 129k of text plus graphs
for this one day - you're only seeing one piece of one section. The same
report is done as a monthly and yearly aggregate too. We don't track
individual users' path through the site and we use "Open Source" log
analysis software so the report is pretty basic. You can bet that the
major sites collect far more data and do a far better job of analyzing
it.
Note that even after this analysis is done
the original log lines are still available for further analysis if
needed. The lines for this year for the CEN-TA site total to about 44
Megabytes of compressed files. Even our largest site which gets over a
million file views a day runs to only about 12 Gigabytes for the year.
With disk space at about $1/Gig these days, storing them online is
trivial.
The point is that the technology to track
literally everything you do when sitting in front of your computer and
interacting with it and the Internet's Web is available, and not all that
expensive. Even at the best, you leave tracks in various computers as you
browse; mostly "anonymous" but valuable none the less.
Taking Away the Mask of Anonymity
What David first asked me about - whether
or not I'd seen a picture from the web page he'd sent me - is all about
unmasking your anonymity. Much of what I've detailed in the previous
section can only tell what computer address you were at when you looked
at the pages. For most people this changes each day or so, so there is no
real correlation to a person (I have a fixed IP address which adds spice
to the problem as I'll tell you about below.)
In some cases this unmasking is subtle. In
others it is blatant. In Canada after January 1, 2004 it had better be
"by the book" or somebody could be in trouble; at least somebody other
than you, the page viewer. Of course my opinion is that you're
potentially in trouble no matter what you do.
I don't mean to sound completely paranoid,
I'm not. On the other hand, maybe I (and you) should be. The number of
incidents of identity theft and fraud is growing. So too is the number of
online scams, spam e-mails, bogus web sites and what have you. They're
not yet at the point where I'd call them a real epidemic - at least not
for people who know there is no Easter Bunny, Santa Claus, 80% return on
investment in a year or $200,000 bonus for getting "my" millions out of
Uganda or wherever; in other words for people who have even a modicum of
skepticism and common sense. All that is needed is a bit of education on
what to watch out for - the subject of this article.
Web bugs
The original reason David asked me to
write this article is an example of a "web bug" - a unique URL that is
embedded in a message sent to you in some fashion that, when you view the
message, confirms that you have done so.
The page David sent me (or caused the web
site to send me as if it were from David) was done up in HTML and
included a couple of unique image URLs, one of which ended with
"__tn_pers2790347040.jpg?BCmegAABvemnfj9H"
If my browser had been set as most of
yours are set, the first time this message appeared in my preview pane or
was opened by me, the image would have been loaded from the sending
website - leaving behind a log record including the full URL. Note that
after the image's name (__tn_pers2790347040.jpg) there is a trailing "?"
and something (BCmegAABvemnfj9H) that appears to be garbage characters.
In fact, the garbage is a unique key to a record in a database that
includes the fact that the page was mailed to both me and David,
including the time it was sent, and probably linking to all the things
that David had done in the session leading up to his sending it.
In this case the bug was attached to a
"real" picture. In some cases it is as little as a single pixel (picture
element - dot on the screen) so loads "instantly" and doesn't show you
anything - but its log record exists in the server none the less.
Freaky, eh?
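To show how the mechanism described above could work, here is a minimal Python sketch. I have no knowledge of how the site in question actually builds its keys or what it stores, so everything here is an assumption for illustration: the function names, the in-memory dictionary standing in for the database, and the example addresses are all hypothetical.

```python
import secrets
from urllib.parse import urlparse

# Hypothetical in-memory stand-in for the sending site's database.
sent_messages = {}


def make_tracked_image_url(base_url, sender, recipient):
    """Create a unique image URL and remember who it was issued for."""
    token = secrets.token_urlsafe(12)  # random, unguessable key
    sent_messages[token] = {
        "sender": sender,
        "recipient": recipient,
        "opened": False,
    }
    return f"{base_url}?{token}"


def record_image_request(requested_url):
    """Run when the image is fetched; returns the matching record, if any."""
    token = urlparse(requested_url).query  # everything after the '?'
    record = sent_messages.get(token)
    if record is not None:
        record["opened"] = True  # the message has been viewed
    return record


# The site embeds this URL in the HTML message it mails out...
url = make_tracked_image_url(
    "http://tracker.example/images/photo.jpg",
    "david@example.com",
    "reader@example.com",
)
# ...and learns who opened it when the mail program fetches the image.
print(record_image_request(url))
```

The image itself is the same for everyone; only the bit after the "?" differs, and that is all the server needs to know whose copy of the message was opened.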
And you thought you'd turned off
"acknowledge reply request" (which causes an automatic reply e-mail to be
sent which tells the original sender that you've read their message, but
which some mail agents don't support well and most people outside of
specific companies refuse to have turned on for privacy reasons if for no
other reason than to deter the spammers).
We know you've seen our mail!
And in some cases (Windows specifically)
because it is actually the main browser engine that interprets the HTML
and retrieves the graphic, the sending site has the opportunity to send
your computer a "cookie" that continues to identify you if you should
again visit the site with your normal browser, even months in the future.
Cookies
When you're just web browsing, one of the
ways a web site tracks you as distinct from some other viewer, for a few
minutes or forever, is by sending your web browser a unique series of
characters (somewhat like the web bug above) that your browser stores for
some time, possibly permanently. This "cookie" concept is valuable to you
the viewer in some cases - such as when you're working with a web site
you've had to log onto with a user ID and password. If it were not for
cookies, the otherwise simplistic design of the Hypertext Transfer
Protocol would mean you would have to re-logon for each page you wanted
to view on the site - not something most would put up with.
The problem is that this viewer-helping
web extension also can help the web site keep track of you and your
travels through the site (or even across sites).
Unless you have told your web browser not
to store cookies (see Pushing Back),
a web site can deposit a cookie on your computer and later check to see
if it is there. The cookie can contain either direct data or a key (like
the one above on the image tag) that can be used to pull a record from a
database and add more detail to it. At minimum, the cookie can be used to
track which pages you've visited, in what order and for how long during
the current viewing session with the site. In extreme cases, the cookie
can allow the system to track your use of any web site that uses a common
information database (and there are many such agglomerated site systems)
and tie the information into answers you might give to seemingly
innocuous "surveys" and questionnaires (see
Verifications below) as well as purchases - eventually building
up a wealth of data on your personal and financial life. In some cases
enough is learned that the web site can tie their information to your
credit record (even if you don't give them a credit card number or your
SIN/SSN.)
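The deposit-and-check cycle can be sketched in a few lines of Python using the standard library's cookie helper. This is an illustration under my own assumptions, not how any particular site does it: the cookie name "visitor", the handle_request function and the dictionary standing in for the site's database are all hypothetical.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical profile store: cookie value -> list of pages viewed.
profiles = {}


def handle_request(cookie_header, path):
    """Record a page view; return (visitor_id, Set-Cookie value or None)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "visitor" in cookie and cookie["visitor"].value in profiles:
        # Returning visitor: the key pulls up the existing record.
        visitor_id = cookie["visitor"].value
        set_cookie = None
    else:
        # First visit: issue a new key and ask the browser to keep it a year.
        visitor_id = secrets.token_hex(8)
        profiles[visitor_id] = []
        reply = SimpleCookie()
        reply["visitor"] = visitor_id
        reply["visitor"]["max-age"] = 60 * 60 * 24 * 365
        set_cookie = reply["visitor"].OutputString()
    profiles[visitor_id].append(path)
    return visitor_id, set_cookie


# First page view: no cookie yet, so the site hands one out.
visitor, header = handle_request(None, "/tools")
# Later page view: the browser sends the cookie back with the request.
handle_request("visitor=" + visitor, "/music")
print(profiles[visitor])  # prints: ['/tools', '/music']
```

Note that the cookie holds nothing but a key. Everything the site learns about you stays in its own database, where it can be added to with each visit.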
One thing to note with this and many of
the other methods used by legitimate companies to collect
information on you: the data is not looked at personally by anyone except in
very extreme cases. The data is massaged and manipulated by programs
which today bear a striking resemblance to Artificial Intelligence - with
the goal of presenting you with advertising and offers as well as
information that the system thinks is most likely to keep you coming back
and hopefully to get you to part with some of your hard-earned cash -
sell you things and services.
Verifications
I subscribe to a number of "free"
magazines. Even though I've been around computers and the Internet for
longer than most people my age, I still like to read from paper - a habit
I'm working on breaking by adding screen real estate to my system, but
which seems to be a losing battle as my eyesight deteriorates with age.
For the techies out there, I run my main system with two 19" monitors,
each running at 1600x1400 - problem is I have the font sizes cranked up
to the point where I might just as well be running them at 800x600 when
I'm actually reading.
Anyway, back to the free magazines. Every
year or so, each of the magazines sends me a special issue wrapped in a
verification questionnaire. Prior to the Internet I'd fill these in and
either snail-mail them back or fax them back. Today however, all of them
have fill-in web forms for this purpose; should be easier, right?
Well, yes it is easier. The problem is
that the magazines get their advertising dollars based upon audited
subscription statistics so they can't just print up thousands of copies
and send them out to random people; they have to know that you "qualify"
and are a real person. With the forms they send, there is a spot for a
signature. Unfortunately, there is no way of signing a web fill-in form
(at least not one they will accept) so the auditors (or the magazines'
programmers maybe) came up with the concept of a "verification question"
- something that is of a relatively personal nature that a random person
probably would not know about you - kind of like asking your mother's
maiden name when talking to the government about your passport or
driver's license. (I have issues with this too but that's for another
time)
The problem is that it seems that
many/most of the magazines I get either have the same software for their
questionnaires or use the same service provider to manage their
subscriptions. Some of them even send me to the same web site but
different sub-directory, although most have something under their own web
name.
The curious thing is that all of these
magazines have a similar set of questions they ask for "verification
purposes". The questions seem to change every time I renew for a
particular magazine but over all of them the questions in total remain
fairly static:
Notice anything? Each of the questions in
itself doesn't give any particularly private information, but all of them
in total do - and these are just a sampling of the ones I get. I know for
a fact that at least 5 of the magazines I get are from the same publisher
- they cross advertise and the web site is the same for the renewals; yet
each asks a different question each year so the total of the information
they can gather is large.
Of course I caught onto this years ago and
have instituted my own "Privacy Policy" which I'll tell you about in
another section. In general
I have a set of answers that I use consistently but which are not even
close to the "truth".
Surveys, Questionnaires and Stealth Questions
Several of the web sites I visit regularly
have "informal polls", questionnaires, and other information gathering
means. The magazine sites in the previous section all ask information
about the kinds of business I do, including dollar volumes, projections,
etc. In their case, this is to allow them to decide if I "qualify" as
someone they want to send their "free" magazine to. At least the magazine
publishers are fairly up front about it; other sites are not.
If you do any major browsing on the Web
I'm sure you've come across sites that ask you questions in order to gain
access to some of their areas. The questions can include personal
information, even if cloaked as a range of values (Age: 18-25, 26-35,
...) but over time the accuracy of the data can be alarmingly precise. If
you are asked the same question but with slightly different ranges the
computer can narrow down the exact answer by detecting when you move from
one range to another (18-25, 19-30, 24-36, 26-35 - if you are 25 you'll
end up in the first, second and third but not the fourth).
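The narrowing works by simple elimination, which a short Python sketch makes plain. The function name narrow_age and the assumed 0 to 120 starting span are mine, chosen only for illustration; the ranges are the ones from the example above.

```python
def narrow_age(observations):
    """observations: list of ((low, high), chosen) pairs.

    Returns the ages still consistent with every answer given.
    """
    possible = set(range(0, 121))  # assumed starting span of ages
    for (low, high), chosen in observations:
        in_range = set(range(low, high + 1))
        if chosen:
            possible &= in_range   # you ticked this range
        else:
            possible -= in_range   # you did not tick this range
    return sorted(possible)


# A 25-year-old's answers to the four ranges in the example.
answers = [
    ((18, 25), True),
    ((19, 30), True),
    ((24, 36), True),
    ((26, 35), False),
]
print(narrow_age(answers))  # prints: [24, 25]
```

Four innocent-looking questions already leave only two candidates, and one more differently cut range would settle it.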
The fact that you choose a particular
button to go to the next page can be informative; [English] [French]
being one of the most common in Canada. In fact, your choice of
click-through advertising is probably kept along with the rest of your
profile. Did you click on the ad for music videos or tools? The next time
you're presented with a couple of ads they may be specifically placed to
determine your preference in tool or music artist, depending upon which
you chose first.
You should also know that the same things
apply to the information you fill into the software registration forms on
your computer when you add something new. You're asked similar things
each time you get an upgrade in some cases and of course when the
inevitable happens and you have to re-install everything again.
Against all of these techniques, what can
you do? You want to use the services, and in many cases don't mind that
they are going to try to sell you things. You just don't want to give
away enough that "they" can be more than minorly annoying if you can
possibly help it.
On the other hand, you also don't want to
get caught by the criminal side of the computer revolution.
Information you might actually be comfortable with giving to a company
you know and trust might be just the thing an identity thief needs to get
a new credit card issued with your name on it.
Somewhere you and the businesses and sites you deal with have to strike a
balance that both can be comfortable with. The problem is that the guys
at the other end of your Internet connection have all the tools and
databases.