2.5TB, 53.5 Billion Clicks Dataset Available for Clickstream Analysis

To foster the study of the structure and dynamics of Web traffic networks, Indiana University has made available a large dataset (‘Click Dataset’) of about 53.5 billion HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows one to examine large volumes of…

Meetup for Tampa Analytics Professionals

I’ve started a meetup for local decision science professionals in the Tampa Bay area to come together and learn about what’s happening in our area. If you are a data science professional, come join us and be a part of making the Tampa-St. Petersburg metro area the Southeast center of excellence in…

Predicting the Best Parameters for Federal Business Capture using WEKA

Which contract parameters should I choose? What combination of features might I pursue to raise my probability of contract award? Open WEKA Explorer. On the Preprocess tab, find the government_contracts.arff file. Perform pre-processing: escape non-enclosure single and double quotes (\’, \”) if using a delimited text version. Check ‘UniqueTransactionID’ and click ‘Remove’. Stating the obvious, there is…
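
For the quote-escaping step, here is a minimal Python sketch, assuming the delimited text version is comma-separated and uses double quotes as the field enclosure. The function names (escape_field, escape_delimited) and the sample column names are hypothetical, not taken from the original post; the idea is to let the csv module strip the enclosure quotes first, then backslash-escape whatever single or double quotes remain inside each field.

```python
import csv
import io


def escape_field(value):
    """Backslash-escape single and double quotes found inside one field."""
    return value.replace("'", "\\'").replace('"', '\\"')


def escape_delimited(text, delimiter=","):
    """Return the delimited text with in-field quotes escaped.

    csv.reader consumes the enclosure quotes (and un-doubles any "" pairs),
    so only non-enclosure quotes are left to escape. Every field is then
    re-wrapped in double quotes. Assumption: no field already contains a
    backslash that needs its own escaping.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    lines = []
    for row in reader:
        lines.append(delimiter.join('"' + escape_field(f) + '"' for f in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample rows; the real file's columns may differ.
    sample = 'id,vendor\n1,"O\'Brien ""Best"" Co"\n'
    print(escape_delimited(sample))
```

If the file uses a different delimiter (a tab, say), pass it as the second argument. This only covers the escaping; removing ‘UniqueTransactionID’ is still done in the Explorer as described.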