Tag Archives: Big Data

Actions & Facts Revisited

When we initially introduced our data fragment model, for reasons of simplicity, some details were not elaborated because they were not essential to the understanding of the principles the data integration approach is based upon. However, these details are necessary …

Posted in Data Modeling

A Hands-on Introduction to MapReduce in Python

MapReduce is simple. Some MapReduce algorithms can definitely be more difficult to write than others, but MapReduce as a programming approach is easy. However, people usually struggle the first time they are exposed to it. In our opinion, this comes …

Posted in Tutorial | 4 Comments

Data Integration, Part 4: Architectures

This post concludes our series on data integration by reviewing the architectural components necessary to support the data acquisition and integration approach presented earlier. Data Acquisition: Action Registry. Each data fragment in the fragment store was added as the result …

Posted in Data Integration

Data Integration, Part 3: Data Integration, Data Curation, and Data Views

In recent posts, we have introduced a data acquisition workflow during which data from multiple sources are collected and used to uniquely identify resources of interest and fragment their descriptions into normalized action-fact data fragments. Once this is done, data …

Posted in Data Integration

Data Integration, Part 2: Data Acquisition Workflow

In this post, we will look at the workflow that leads from the acquisition of raw data about resources to its storage as the action-fact data fragments introduced in part 1. As depicted on the diagram above, this workflow consists …

Posted in Data Integration

Data Integration, Part 1: Actions & Facts

We have already presented the faceted mechanism that allows us to describe resources by encapsulating in a precisely defined manner all the information necessary to describe a specific aspect of these resources so that this information can be easily consumed and …

Posted in Data Modeling

Inside MongoDB SF 2014

Earlier this month, I attended MongoDB SF, the last stop of the MongoDB Days 2014 tour. Usually, I am reluctant to attend vendor-organized conferences because they tend to be more marketing opportunities than venues for gaining in-depth understanding of …

Posted in Database