Merrick Lex Berman

Researcher at Diamond Bay Research

Data centricity for organizations was clarified for me in conversations with Fabio Carrera, whose thesis on "City Knowledge" (MIT, 2004) [1] made a very strong case for capturing, indexing, and exposing data at the moment when it is first entered into the administrative functions of a city.

In Carrera's example, the creation of a property record would be annotated and linked to any subsequent actions related to it over time, such as assessments, taxes, repairs, and renovations. Further, Carrera's field work (mapping the canals, sculptures, bridges, and gondola routes of Venice) [2] reinforced the vital need for holistic data about the city, rather than each sector keeping its best data inside a self-contained silo. In many cases it may be impossible to avoid building a silo, but with a data-centric perspective you can avoid the worst interoperability and migration problems later on.
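As a rough illustration of that idea (not Carrera's actual schema), a record captured once and then linked to every later action might be sketched as follows. The class names, fields, and sample values are all hypothetical, chosen only to show the linking pattern.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    """One administrative action linked to a record (hypothetical schema)."""
    kind: str          # e.g. "assessment", "tax", "repair", "renovation"
    when: date
    note: str = ""


@dataclass
class PropertyRecord:
    """A property record captured once, at the moment of first entry."""
    record_id: str
    address: str
    created: date
    events: List[Event] = field(default_factory=list)

    def link(self, event: Event) -> None:
        """Attach a later action to this record rather than to a separate silo."""
        self.events.append(event)

    def history(self, kind: Optional[str] = None) -> List[Event]:
        """Return linked events in date order, optionally filtered by kind."""
        selected = [e for e in self.events if kind is None or e.kind == kind]
        return sorted(selected, key=lambda e: e.when)


# Illustrative data only: the identifier, address, and dates are made up.
record = PropertyRecord("P-0001", "1 Example Street", date(2004, 1, 15))
record.link(Event("renovation", date(2010, 6, 1), "roof replaced"))
record.link(Event("assessment", date(2005, 3, 10)))
record.link(Event("tax", date(2006, 4, 15)))

timeline = [e.kind for e in record.history()]
```

Because every action hangs off the same record, a question such as "what happened to this property, and in what order?" becomes a single lookup instead of a reconciliation across departmental systems.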

This led me to focus on channelizing communications and indexing content (both informally, as with Solr, and formally, with tags, metadata, or ontologies).

The holistic approach to data development was a big influence on the development of the China Historical GIS [3] and ChinaMap [4] resources. Similarly, the need for open linked data drove the development of the CHGIS XML Webservice (2006) and eventually the Temporal Gazetteer (2014) [5]. Free access and machine-actionable web services are essential for our data-centric future, though we should also realize that curated data that exposes its methodologies and operable code has long-lasting value that is not superseded by the generation of infinite streams of unclassified, unorganized, raw data.

Data itself, without proven methods for processing and interpretation, is a mountain of undifferentiated rock. It's up to us to find the jewels, and to keep our minds open to new perspectives during the search. It is often the interstitial and unanticipated relationships between datasets gathered for unrelated purposes that prove most interesting.

The data-centric approach, to my mind, is synonymous with focus, with avoiding redundancy, with compatibility, with convergence of efforts within an organization, and with finding modes of contact and support with collaborators outside it.


References:
1. Carrera, Fabio. "City Knowledge." Thesis, MIT, 2004. http://users.wpi.edu/~carrera/MIT/Dissertation/Final%20PhD%20Dissertation.pdf
2. Venice Project. Worcester Polytechnic Institute. http://www.veniceprojectcenter.org/vpc
3. China Historical GIS. https://sites.fas.harvard.edu/~chgis/
4. ChinaMap (Modern China GIS Archive). http://worldmap.harvard.edu/chinamap/
5. Temporal Gazetteer TGAZ. http://maps.cga.harvard.edu/tgaz/