Mario de Francisco from DAMA Spain and Anjana Data will demystify the concept of DataOps in his presentation at the Data Management Summit 2020
The “panel” of speakers concludes today with another professional who will speak at the most important Data Management event. For the first time, a forum aims to cover Data Management as a whole, from governance through security, cloud, machine learning, data virtualization, and much more. Today we present Mario de Francisco Ruiz, CEO of Anjana Data and Head of the Data Governance and Metadata Working Group of DAMA Spain.
Tell us a little about your professional career
I’m an engineer and entrepreneur from Zaragoza. I studied telecommunications engineering at the University of Zaragoza and spent the last year of my degree in Sweden, where I discovered how many years ahead of us they are in many things, but also all the good things that we have in Spain and do not usually value. I returned to Zaragoza and started my professional career in a newly founded telemedicine start-up where, while looking for a way into the world of bioengineering that I had discovered in Sweden, I unknowingly ended up having my first real experience working with data. Together with my battle partner, I spent a year developing a Matlab algorithm based on logistic regression that could predict obesity in adolescents quite accurately from only a few anthropometric measurements taken during the first 3 years of life.
Within this start-up, I gradually moved towards business development functions (without leaving technology aside, because it was our thing) and discovered that this was really what I wanted to do. So, after that first failed entrepreneurial attempt (the start-up ended up closing), I went to Madrid to work for a multinational consulting firm. Business consulting (suit included), the financial sector (until then I had gone to the ATM to withdraw money and little else), technology (which I always carry with me) and data (which, as it took me a while to discover, had always been present and was one of my passions) occupied my professional life during the following years in Spain and Argentina and finally took me to where I am now.
Currently, I manage Anjana Data, a Spanish company that a few of us data freaks set up a little over a year ago. It is totally focused on product development, and with it we have launched our own data governance solution, for now focused on the Iberian and Latin American markets (and with which, by the way, we are having a lot of success despite our short history as an independent company).
Additionally, my passion for data and my desire to share knowledge and experience about data strategy, management, and governance have continuously moved me to speak at events, write articles, hold webinars, and collaborate with business schools and associations. Last but not least, they have also led me to DAMA Spain, where I collaborate as Head of the Data Governance and Metadata Working Group.
What are your challenges? What do you like to do with data?
In my final degree project, I discovered that more than 80% of the time someone spends on a data project goes into tasks that come before working with the algorithms and obtaining models. The vast majority of the project is spent on this:
- Searching for data: It took us months to find, and be given access to, a longitudinal study with more than 30 measurements of different kinds that had been taken systematically from about 400 Aragonese children from birth to 18 years. In addition, we had to digitize them, because the data had been collected with pen and paper in hundreds of folders.
- Trying to understand the data: We took an advanced course in children’s medicine and spent many days in the pediatric area of the Miguel Servet Hospital in Zaragoza reading articles about a subject we knew nothing about.
- Analyzing the data: We computed a lot of statistics (means, medians, variances, deviations, …) to draw some first conclusions about the data we were going to face.
- Cleaning data: Despite the great effort the doctors had put into the study, in quite a few cases we found incomplete or empty fields and a high number of outliers and incoherent values. We had to reject complete records, apply different extrapolation techniques to infer missing data, etc.
- Preparing data: Starting from the datasets we were working with in Matlab, we had to model them as huge denormalized tables with a lot of new variables calculated from the original ones, so that the algorithms we were going to try later could work their magic on a supercomputer at the University of Zaragoza.
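As an illustration of the cleaning and preparation steps described above, here is a minimal Python sketch. It is not the code from the project (which was written in Matlab). The field names, the sample values, the outlier rule based on the median absolute deviation, the median fill for missing values, and the derived BMI column are all assumptions chosen to keep the example small.

```python
from statistics import median

# Hypothetical records; None marks a field left empty on the paper form.
records = [
    {"weight_kg": 9.8, "height_cm": 75.0},
    {"weight_kg": 10.4, "height_cm": None},
    {"weight_kg": None, "height_cm": None},   # nothing usable in this record
    {"weight_kg": 10.1, "height_cm": 76.5},
    {"weight_kg": 98.0, "height_cm": 74.0},   # probably a transcription error
    {"weight_kg": 9.5, "height_cm": 73.5},
]
FIELDS = ["weight_kg", "height_cm"]


def present(rows, field):
    """Return the non-missing values of one field."""
    return [r[field] for r in rows if r[field] is not None]


def clean(rows, fields, max_missing=1, k=10.0):
    """Reject sparse records, blank out outliers, then fill the gaps."""
    # 1. Reject records with too many empty fields (copies keep the input intact).
    kept = [dict(r) for r in rows
            if sum(r[f] is None for f in fields) <= max_missing]
    for f in fields:
        values = present(kept, f)
        if not values:
            continue
        med = median(values)
        mad = median(abs(v - med) for v in values)
        # 2. Treat values far from the median as missing (skipped when MAD is 0).
        if mad > 0:
            for r in kept:
                if r[f] is not None and abs(r[f] - med) > k * mad:
                    r[f] = None
        # 3. Fill missing values with the median of the values that remain.
        fill = median(present(kept, f))
        for r in kept:
            if r[f] is None:
                r[f] = fill
    return kept


def add_bmi(rows):
    """Add one derived variable, as an example of a calculated column."""
    return [{**r, "bmi": r["weight_kg"] / (r["height_cm"] / 100) ** 2}
            for r in rows]


cleaned = clean(records, FIELDS)
table = add_bmi(cleaned)
```

In a real study, the rejection threshold, the outlier rule, and the technique used to infer missing values would have to be agreed with the domain experts, since each of them changes the data the models eventually see.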
Many months later, we spent the last weeks tuning the parameters of the different algorithms, which were producing models with more or less success, iterating millions of times to see how the models improved or got worse depending on a lot of variables. Finally, we kept the one with the highest sensitivity and specificity, the one that best fit the data and had the soundest medical explanation.
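To make the selection criterion concrete, the sketch below compares candidate models by sensitivity and specificity. The labels, the predictions, and the model names are made up, and combining the two measures with Youden's J statistic (sensitivity + specificity - 1) is an assumption for illustration, not necessarily the rule used in the project.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive).

    Assumes both classes are present in y_true, so neither division is by zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def youden_j(y_true, y_pred):
    """Combine both measures into a single score: sensitivity + specificity - 1."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return sens + spec - 1


# Hypothetical ground truth and predictions from three candidate models.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
candidates = {
    "model_a": [1, 1, 0, 0, 0, 1, 0, 1],
    "model_b": [1, 1, 1, 0, 1, 0, 0, 1],
    "model_c": [1, 0, 0, 0, 0, 0, 0, 1],
}

# Keep the candidate with the best combined score.
best = max(candidates, key=lambda name: youden_j(y_true, candidates[name]))
```

A single score like this only ranks the candidates; whether a model also makes medical sense, as the answer above stresses, still has to be judged by people.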
Years later, I discovered that exactly what I had experienced was also happening, in a totally different environment, in large multinationals, even with the best data scientists and analysts, the most advanced technologies, and highly demanding regulatory environments in which they were accountable for the data they reported. So, little by little, I began to steer my career towards what I most like to do with data: govern and manage it to generate value by turning it into a strategic asset for organizations. Or, in my current case, and this is where my greatest challenge lies, offering organizations a technological solution that can help them achieve this.
Do you think that companies have an adequate culture to manage data in a different way?
I think that the data age, although it has been talked about for some time, has only just begun, and therefore there is still a long way to go. Raw materials that come to occupy an important position in our global society require time for society to adapt to the new status quo and, in this case, data is also a different kind of asset: it is not like gold or oil, but something much more powerful that, when converted into information, can move a lot with very little. Little by little we are all becoming aware of the importance of data in our daily lives, starting with our lives as individuals, and this, in turn, is permeating companies because, in the end, companies are nothing more than organizations of people. Having said that, there are organizations that obtain very large value (whether economic, social, or political) from the data they manage and, on the other hand, there are others that have barely enough to make a decision based on a quantitative analysis of the information they have. In my opinion, when we talk about managing data in a different way, rather than thinking only about obtaining value for organizations, we should think about whether we are doing it ethically. So I think the cultural change is under way, and I think we still have a long way to go.
What are the most important challenges for CIOs, CDOs, and CTOs in 2021?
- To serve as facilitators of change, helping their organizations adapt as quickly as possible to the situation the pandemic has brought about, adopt agility as a lever for growth, seek new business models, and promote digitalization, innovation, and transformation towards data-driven organizations
- To build bridges between Business and Technology, aligning the objectives of the different levels of the organization with the strategic objectives and favoring collaboration, bringing the data much closer to those who consume and understand it so that they can draw the right conclusions and reach the right decisions in the shortest time possible
- To end information silos and promote a collaborative data culture that favors the democratization of information and its ethical exploitation in governed and automated environments
The times of greatest crisis are the ones that bring the greatest innovations and disruptions and, at the moment we are living through, data is (if it was not already) the raw material that every organization must take care of, from its generation to its exploitation, in order to stay alive or even to stand out from the competition. So CIOs, CDOs, and CTOs have to be on the front line along with the rest of the C-levels and give their very best.
What do you want to talk about at the Data Management Summit?
My presentation will focus on a concept that began to emerge not long ago and on which there is still no consensus: DataOps. In the end, like almost everything, it is simply a new name for something that has been around for a long time and that should be inherent in technology: the automation of processes throughout the data life cycle.
First, I will try to demystify the concept, as I have just done in these first lines, and then I will explain why DataOps is nothing other than the natural evolution of implementing and operating a proactive and preventive Data Governance integrated with demand management, which serves as a lever for effective and efficient data management.
Finally, I will look at some practical cases of DataOps implementation and outline how to approach a DataOps initiative.
From the beginning, you have supported the event. Why?
The DMS is an event by data professionals for data professionals, a place where a lot of gray matter, with a lot of knowledge and experience from the world of data, is gathered in a very small space (virtual this year) and from which you can take away a lot of learning in a very short time. In addition, I think what makes it really different is that it is not a trade fair full of supplier stands, nor a closed club with no room for different profiles. It is a participatory event where the quality of the content is very high and carefully looked after, and which is open to all kinds of roles in the world of data, with no restrictions other than the capacity limits needed to keep the sessions dynamic.