
US Government Business Capture Data Mining in Microsoft Excel

View agency activity clustering on geography in Excel using Excel Data Mining Add-ins

By Don Krapohl

1.       Ensure you have downloaded the Excel Data Mining Add-ins from Microsoft at http://www.microsoft.com/en-us/download/details.aspx?id=35578 .  The article assumes you have a working version of the DM Addins and a default Analysis Services (SSAS) instance defined.  Search for getting started with SQL Server Data Mining Add-ins for Excel if you are not familiar with this process.

2.       Open the Excel sample file for Federal contract acquisitions in Wyoming (2012) from http://www.augmentedintel.com/content/datasets/government_contracts_data_mining_addins.xlsx

3.       On the wy_data_feed tab, select all the data.

4.       In the Home tab on the ribbon in the Styles section select “Format as Table”.  Pick any format you wish.

5.       A new tab will appear on the ribbon for Table Tools with menus for Analyze and Design as below.

Microsoft Excel Table Tools menu
Table tools to format data as a table in Excel


6.       On the Analyze menu, select “Detect Categories”.  This will group (cluster) your information on common attributes, particularly commonalities that are not obvious or immediately observable.

7.       Deselect all checkboxes except the following:

a.       Dollars Obligated

b.      Award Type

c.       Contract Pricing

d.      Funding Agency

e.      Product Or Service Code

f.        Category

8.       Click ‘Run’.

9.       The output will show you categories of rows whose attribute values have strong affinities.  Explore the model by filtering the charts and tables by the category or categories generated.  Do this by selecting the filter icon (funnel) next to Category on the table or the Category label at the lower left of the graph.

10.   The groups with fewer rows may show particularly interesting correlations for a targeted campaign.  For example, filter the table and chart on Category 6.  This group indicates an affinity for the attribute values ProductOrServiceCode = “REFRIGERATION AND AIR CONDITIONING COMPONENTS”, fundingAgency = “Veterans Affairs, Department Of”, and a contract award value of $61,148 to $1,173,695 as shown below:


Importance of data categories in Excel Data Mining Add-ins
Factor Analysis in Microsoft Excel

For my organization’s business development activities, if I am in the heating and air business I may elect to focus efforts on medium-sized contracts with Veterans Affairs.
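If you want to double-check what a category contains outside of Excel, the filtering in steps 9 and 10 can be approximated in a few lines of Python. This is only a sketch under assumptions: the rows below are made-up stand-ins for the wy_data_feed table after Detect Categories has added its Category column, and `profile_categories` is a hypothetical helper, not part of the add-ins. It does not perform the clustering itself; it only summarizes categories that already exist.

```python
from collections import Counter, defaultdict

# Hypothetical rows standing in for the table after "Detect Categories" has
# added a Category column. All values are illustrative, not the real data set.
rows = [
    {"Category": "Category 6", "FundingAgency": "Veterans Affairs, Department Of",
     "ProductOrServiceCode": "REFRIGERATION AND AIR CONDITIONING COMPONENTS",
     "DollarsObligated": 61148.0},
    {"Category": "Category 6", "FundingAgency": "Veterans Affairs, Department Of",
     "ProductOrServiceCode": "REFRIGERATION AND AIR CONDITIONING COMPONENTS",
     "DollarsObligated": 250000.0},
    {"Category": "Category 6", "FundingAgency": "Veterans Affairs, Department Of",
     "ProductOrServiceCode": "MAINTENANCE OF BUILDINGS",
     "DollarsObligated": 1173695.0},
    {"Category": "Category 1", "FundingAgency": "Example Agency A",
     "ProductOrServiceCode": "OFFICE SUPPLIES",
     "DollarsObligated": 900.0},
    {"Category": "Category 1", "FundingAgency": "Example Agency A",
     "ProductOrServiceCode": "OFFICE SUPPLIES",
     "DollarsObligated": 1500.0},
]


def profile_categories(rows, nominal_columns, amount_column):
    """Summarize each category: row count, most common value per nominal
    column, and the (min, max) range of the amount column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Category"]].append(row)

    profiles = {}
    for category, members in grouped.items():
        amounts = [m[amount_column] for m in members]
        profiles[category] = {
            "rows": len(members),
            "top_values": {
                col: Counter(m[col] for m in members).most_common(1)[0][0]
                for col in nominal_columns
            },
            "amount_range": (min(amounts), max(amounts)),
        }
    return profiles


profiles = profile_categories(
    rows, ["FundingAgency", "ProductOrServiceCode"], "DollarsObligated"
)

# List the smallest categories first, since those are the ones worth a closer look.
for category, profile in sorted(profiles.items(), key=lambda kv: kv[1]["rows"]):
    print(category, profile)
```

Sorting by row count puts the small groups at the top, which mirrors the advice in step 10 to look at the categories with fewer rows first.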


Predicting the Best Parameters for Federal Business Capture using WEKA

Which contract parameters should I choose?

What combination of features might I pursue to raise my probability of contract award?

  1. Open WEKA explorer
  2. On the Preprocess tab, open the government_contracts.arff file.
  3. Perform pre-processing
    1. Escape non-enclosure single and double quotes (\', \") if using a delimited text version.
    2. Check ‘UniqueTransactionID’ and click ‘Remove’.  A randomly assigned transaction ID has no predictive value, and discretizing or locally smoothing it can lead to overfitting.
    3. If you have saved the arff back into a csv, you will have to convert the ZIP code fields RecipientZipCode and PlaceOfPerformanceZipCode back to nominal with the unsupervised attribute filter StringToNominal, and DollarsObligated back to numeric.
  4. On the Associate tab, select the Apriori algorithm and click ‘Start’.  The results:


WEKA association rules for contract feature prediction
Predicting Award Parameters


This indicates that you may have an advantage in the acquisition if you pursue Firm Fixed Price contracts for the VA, are located in ZIP 83110, and the work will be performed within ZIP 83110.
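To make the numbers behind an association rule less of a black box, here is a minimal Python sketch of the pre-processing ideas above plus the two measures Apriori reports for a rule: support and confidence. It is not the Apriori search itself and it is not WEKA's implementation. It only scores one hand-picked candidate rule. The CSV excerpt is made up for illustration (the column names ContractPricing and FundingAgency and the agency "Example Agency A" are assumptions), and `load_rows` and `rule_metrics` are hypothetical helper names.

```python
import csv
import io

# Hypothetical CSV excerpt. Values are made up for illustration only.
RAW = """UniqueTransactionID,ContractPricing,FundingAgency,RecipientZipCode,PlaceOfPerformanceZipCode,DollarsObligated
t1,Firm Fixed Price,"Veterans Affairs, Department Of",83110,83110,12000
t2,Firm Fixed Price,"Veterans Affairs, Department Of",83110,83110,48000
t3,Firm Fixed Price,"Veterans Affairs, Department Of",83110,83110,7500
t4,Cost No Fee,"Veterans Affairs, Department Of",83110,83110,30000
t5,Cost No Fee,"Veterans Affairs, Department Of",82001,82001,15000
t6,Firm Fixed Price,Example Agency A,83110,82001,2200
"""


def load_rows(text):
    """Parse the CSV, drop the transaction ID, and fix column types."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # The transaction ID has no predictive value, so remove it.
        record.pop("UniqueTransactionID", None)
        # Dollar amounts are quantities, so convert them to numbers.
        record["DollarsObligated"] = float(record["DollarsObligated"])
        # ZIP codes stay as strings: they are labels (nominal), not quantities.
        rows.append(record)
    return rows


def rule_metrics(rows, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all rows matching both sides of the rule
    confidence = share of antecedent-matching rows that also match the consequent
    """
    def matches(row, conditions):
        return all(row[key] == value for key, value in conditions.items())

    n_antecedent = sum(1 for r in rows if matches(r, antecedent))
    n_both = sum(
        1 for r in rows if matches(r, antecedent) and matches(r, consequent)
    )
    support = n_both / len(rows)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


contracts = load_rows(RAW)
support, confidence = rule_metrics(
    contracts,
    antecedent={"RecipientZipCode": "83110", "PlaceOfPerformanceZipCode": "83110"},
    consequent={
        "ContractPricing": "Firm Fixed Price",
        "FundingAgency": "Veterans Affairs, Department Of",
    },
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

A rule with high confidence but very low support rests on only a handful of contracts, so it is worth checking both numbers before building a capture strategy around it.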


SOFs and Big Data – Not a Cultural Shift

NOTE: This is a repost by permission of an article by Mr. Richard Marshall. Mr. Marshall provides big data and analytics capabilities to the Special Operations community through his company, Blackstorm International. His website is http://blackstorm-int.com.

SOF Warriors


When you think of Special Operations Forces you think of the hard men that stormed the Osama bin Laden compound in the middle of the night, successfully delivering Justice and Honor. You do not think of a tall, thin kid, barely out of college, with a European man-bag and Converse shoes, drinking a vanilla latte, as the next warrior against the enemies of freedom.

Special Operations has always looked to gain the advantage in every action, seeking out especially adept groups as a source of competitive advantage. Too often these groups focus on the bleeding edge of operations and are scarce resources used for a limited purpose.

In a situation that is not unique to SOF, there is a condition where the supportive functions of the organization do not benefit from the same attention the primary mission holders receive. While this is to be expected, organizations also need to ensure that the supporting elements’ business systems and processes are improved over time to avoid organizational drag.

In essence, the scarcity of qualified data scientists at all levels of the organization results in inconsistent business practices, and a myopic focus on isolated business areas severely limits the value big data and analytics can bring to SOF.  What is needed is a set of practices and processes that are repeatable, can be expanded upon, and are easily translated across organizational boundaries. The potential for subordinate units to leverage Headquarters practices and resources, thereby lowering the barriers to successful analytics utilization, is not yet realized for most commands.

In fact, many consultants in this space will assert that commoditization is not possible within the discipline of BI/BA as every problem is different and that it takes different skills and approaches to solve the identified problems. This is a fallacy and is a stance usually designed to prolong consulting engagements and profitability.

It is a simple fact that much of the technology needed to develop an analytics program already exists within the organizations desiring analytics capability. There are benefits to purchasing scalable distributed storage solutions supporting big data applications; however, these need to be balanced against the benefits of license optimization within the current infrastructure. Seldom is scalability a driving issue in COCOMs the way it is for other industries such as banking. The data are simply not that large.

Eventually we will learn to use the additional deluge of data coming off our sensor platforms, which will require a scalable infrastructure; however, the practice of working with the data must come first. Most likely, big data sets that are available in DoD will be more focused on efficiencies and utilization (performance management) rather than finding a bad guy. In fact, much of the data that fits the big data profile will be platform-specific data that has little to do with SOF’s 8 primary mission areas.

So what will DoD organizations such as the Combatant Commands and their subordinate organizations need to change to take advantage of this emergent approach to competitive advantage? SOF only needs to do what it has always done, which is to operate outside its comfort zone:

  1. Realize that the contracting groups that are most likely to assist in this field will not come from their old ops buddies. The groups that will bring this success will have little or no knowledge of SOF Missions. They will have a deep knowledge of data, statistical analysis and presentation.
  2. Look to develop a set of business practices and policies that support decision making for the command that can be shared with subordinate units.
  3. Question Solutions. Look critically at the offerings within the community. Many organizations are trying to sell applications and hardware as bundled sets. Analyze the benefits of these platforms and what capability they will bring. Most organizations running a Microsoft infrastructure already have all the tools they need to develop an analytics capability.
  4. Focus on the practice. Build a framework and integrate the capability into every J-Code/staff section. Hire the personnel that can train and guide Command staff in asking the questions that will lead to analytics solutions.
  5. Focus on the data. The practice of working with data has academically been reserved for a small group of science majors and professionals. As the data sets expand, staff members can assist the command in being mindful of the importance of all data and ensure that the organization’s information is properly constructed and cared for.
  6. Knowledge Management. Knowledge Management offers a unique position for developing a global analytics solution due to the scope of its reach within CCMDs. Though underutilized now, KMs will mature into the focal point for future analytics operations, as keepers of the index.

There are plenty of opportunities for SOF warriors to squeeze more out of their data and current systems. The habit of consistently reaching outside existing comfort zones is a hallmark of the profession. What SOF needs is a practice and a framework that can be shared and grown, and a vehicle to deliver the tools needed by the new generation of leaders and operations specialists.  The nondescript, European man-bag-carrying warrior will be on point in our unconventional war against our enemies with enhanced, analytics-driven information as a key weapon in her arsenal.