George Kobak

Test Automation Lead

I have seen it proven time and again that a data-centric approach to software development becomes essential when it comes to automation. More abstractly, we can call it being information-centric and, generalizing further, logic-centric. This makes sense because code is simply the translation device for the logic, or at least it should be.

These concepts were well understood years ago, when I.T. was still called data processing. COBOL is one of the most data-centric programming languages there is, and it is no accident that COBOL programs are still running decades after they were written. Closely associated with the data-centric concept is the importance of set theory, which is being revisited even in pure mathematics research. Big Data has also forced the focus back onto what is really important. "Data, data everywhere and not a drop of meaningful information to process" teaches us that data without context is meaningless.

If the data is to be centralized, its context needs to be known. True, the data can describe that context itself, and where the data sits is simply the data container. But the context meant here is what provides structure at the conceptual and logical levels; in that sense, physical structure is not needed. It all boils down to the ability to recognize the data relationships, at the conceptual and logical levels, that are inherent in the information flow. Modeling the relationships at those levels is therefore vital.

Almost all application bugs occur because something gets lost in translation along this information flow. With a data-centric approach, an amazing thing happens: code refactoring becomes automatic. Whatever code is initially written, although it appears customized, is actually modular from the beginning, because it taps into the abstract information layers it is focused on.
Therefore, the code refactoring comes from the architectural level of the information itself, which is why it occurs automatically. To illustrate, it is like wool that keeps shrinking every time it is washed. A side benefit is that it works directly against software entropy. The only disadvantage of this approach is that it is much harder to do and takes much longer up front. In the long run, though, the result is much easier to maintain, more stable, and less costly than taking an application-centric, or what we can call a code-centric, approach to software development.
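
As one possible illustration of what data-centric means in test automation, here is a minimal Python sketch in which the test scenarios live in a data table and a single generic runner interprets them. Everything in it is an assumption made for illustration: the names DiscountCase, apply_discount, CASES and run_cases are hypothetical, and the discount rule is a toy stand-in for a real system under test.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical example: DiscountCase, apply_discount, CASES and run_cases
# are illustrative names, not taken from any real project.


@dataclass(frozen=True)
class DiscountCase:
    """One row of test data: the inputs plus the expected result."""
    description: str
    order_total: float
    is_member: bool
    expected: float


def apply_discount(order_total: float, is_member: bool) -> float:
    """Toy system under test: members get 10% off orders of 100 or more."""
    if is_member and order_total >= 100:
        return round(order_total * 0.9, 2)
    return order_total


# The test logic lives in the data. Adding a scenario means adding a row,
# not writing another test function.
CASES = [
    DiscountCase("member above threshold", 200.0, True, 180.0),
    DiscountCase("member below threshold", 50.0, True, 50.0),
    DiscountCase("non-member above threshold", 200.0, False, 200.0),
    DiscountCase("member exactly at threshold", 100.0, True, 90.0),
]


def run_cases(
    cases: list[DiscountCase],
    func: Callable[[float, bool], float],
) -> list[str]:
    """Generic runner: returns one message per failing case."""
    failures = []
    for case in cases:
        actual = func(case.order_total, case.is_member)
        if actual != case.expected:
            failures.append(
                f"{case.description}: expected {case.expected}, got {actual}"
            )
    return failures


if __name__ == "__main__":
    failed = run_cases(CASES, apply_discount)
    print("all cases passed" if not failed else "\n".join(failed))
```

In this sketch the runner never changes when the business rule grows a new scenario; only the table does. That is one small, concrete form of the modularity described above, though a real suite would model far richer relationships than a flat list of rows.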