Author Archives: dmassart
Actions & Facts Revisited
When we initially introduced our data fragment model, we left out some details for the sake of simplicity because they were not essential to understanding the principles on which the data integration approach is based. However, these details are necessary …
A Hands-on Introduction to MapReduce in Python
MapReduce is simple. Some MapReduce algorithms can definitely be more difficult to write than others, but MapReduce as a programming approach is easy. However, people usually struggle the first time they are exposed to it. In our opinion, this comes …
Data Integration, Part 4: Architectures
This post concludes our series on data integration by reviewing the architectural components necessary to support the data acquisition and integration approach presented earlier. Data Acquisition: Action Registry. Each data fragment in the fragment store was added as the result …
Data Integration, Part 3: Data Integration, Data Curation, and Data Views
In recent posts, we have introduced a data acquisition workflow during which data from multiple sources are collected and used to uniquely identify resources of interest and fragment their descriptions into normalized action-fact data fragments. Once this is done, data …
Data Integration, Part 2: Data Acquisition Workflow
In this post, we will look at the workflow that leads from the acquisition of raw data about resources to its storage as the action-fact data fragments introduced in Part 1. As depicted in the diagram above, this workflow consists …
Data Integration, Part 1: Actions & Facts
We have already presented the faceted mechanism that allows us to describe resources by encapsulating, in a precisely defined manner, all the information necessary to describe a specific aspect of these resources so that this information can be easily consumed and …