JBoss Enterprise has a free data virtualization (NOT server virtualization) platform called Teiid. Its capabilities include serving data from multiple technologies (JDBC, ODBC, Thrift, REST, SOAP, etc.), merging and transforming data, fault tolerance, scalability, and the other capabilities one would require of an enterprise service. This can stand in the technology portfolio as part of… Continue reading Big Data Virtualization
Content on Knowledge Discovery in Databases (KDD), analytics, decision support, and data mining, ranging from the user-approachable to the technically focused.
Example Big Data dev cluster topology
Below is an example topology for a Big Data development cluster that I’ve actually used for some customers. It’s composed of six Amazon Web Services (AWS) servers, each with a particular purpose. We have been able to run a full lambda architecture using this topology along with Teiid (for data abstraction) on terabytes of data.… Continue reading Example Big Data dev cluster topology
The Structure of an OpenNLP NameFinder Model
Named Entity Models
Research labs and product teams intent on building upon OpenNLP and Solr (which can consume an OpenNLP NameFinder model) frequently find it important to write their own model parser or model builder classes. OpenNLP has built-in capabilities for this, but in the case of custom parsers the structure of the OpenNLP NameFinder model… Continue reading The Structure of an OpenNLP NameFinder Model