The Data World – Who’s who in the zoo?
Who Are Data People?
There is a big misconception about what the data field is and who does what in the data world.
“IT” (Information Technology) is a very broad term. It is not one specific field that everyone fits into. IT branches into many categories, one of which is Data.
Diving further into the data analytics and data development world, I have realised that data roles are not always clear to those outside the data field, and that there is a lack of awareness of how important data is.
Although most technological roles get referred to as “IT”, it is crucial to understand the different roles involved in the data field.
Roles such as:
📐 Data Architect
⚙️ Data Engineer
🔎 Data Analyst
📊 BI Developer
🧮 Data Scientist
… are all different roles that require different skillsets. To get the best out of your data, the right data team is vital.
Before explaining what these roles are, it is imperative to understand your own environment, so that the correct roles are put in place and have the right tools to achieve their tasks.
Data Architects are the cornerstones of the data operation. They design the database structure and how the data flows through it. They are in charge of the database blueprints, diagrams and documentation specifying how the data moves from system to system. They ensure the data follows business rules (this can also be done through Microsoft’s Dataverse) and that the data meets the required security measures.
Data Engineers are the ones who model and build the data. They lay the foundation for the rest of the team, such as Data Analysts, BI Developers and Data Scientists. Data Engineers manage production data and model it so that trends can be identified, using languages and formats such as T-SQL, XML and more. Most Data Engineers utilise systems such as MS SQL Server, Oracle, MySQL and so forth. Using methods such as ETL (Extract, Transform, Load) and other tools, they model the data to make it “readable” for other data roles and usually load it into Data Lakes, Warehouses or Cubes.
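To make the ETL idea a little more concrete, here is a minimal sketch in Python. It is only an illustration: the sample records, the function names and the in-memory SQLite database are all assumptions of mine, standing in for a real source system and a real warehouse.

```python
import sqlite3

# Hypothetical raw export from a production system: inconsistent casing,
# stray whitespace and amounts stored as text.
RAW_ORDERS = [
    {"customer": "  alice ", "amount": "120.50", "region": "north"},
    {"customer": "BOB", "amount": "80", "region": "South"},
    {"customer": "alice", "amount": "n/a", "region": "North"},
]


def extract():
    """Extract: in practice this would read from a source system."""
    return list(RAW_ORDERS)


def transform(rows):
    """Transform: standardise text fields and drop rows with unusable amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        cleaned.append(
            (row["customer"].strip().title(), row["region"].strip().title(), amount)
        )
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into a warehouse-style table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders (customer TEXT, region TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


def run_pipeline():
    """Run extract, transform and load against an in-memory database."""
    connection = sqlite3.connect(":memory:")
    load(transform(extract()), connection)
    return connection


if __name__ == "__main__":
    conn = run_pipeline()
    for record in conn.execute("SELECT customer, region, amount FROM orders"):
        print(record)
```

The point is the shape rather than the tooling: messy data comes in, gets standardised, and lands in a table that Analysts and BI Developers can query without cleaning it again.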
It is the Data Analyst’s responsibility to extract the data provided by Data Engineers and to identify business trends and the questions that need answering. They analyse the data using multiple methods such as:
- Descriptive Analytics
- Predictive Analytics
- Prescriptive Analytics
Skills such as Python, R, SQL and ETL are used to gather the data, organise it and present it in visual form, allowing decision makers to analyse what is being presented and make informed business decisions.
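As a rough illustration of how descriptive, predictive and prescriptive analytics differ, the sketch below runs all three over the same tiny dataset. The sales figures, the stock capacity and the function names are hypothetical, and the "prediction" is just a straight-line trend, not what a real forecasting model would look like.

```python
from statistics import mean

# Hypothetical figures, for illustration only.
MONTHLY_SALES = [100.0, 110.0, 120.0, 130.0]  # units sold in months 1 to 4
STOCK_CAPACITY = 135.0  # units that can currently be supplied per month


def describe(sales):
    """Descriptive: summarise what has already happened."""
    return {
        "total": sum(sales),
        "average": mean(sales),
        "best_month": sales.index(max(sales)) + 1,
    }


def fit_trend(sales):
    """Fit a straight line through the monthly figures using least squares."""
    months = list(range(1, len(sales) + 1))
    month_avg = mean(months)
    sales_avg = mean(sales)
    slope = sum(
        (m - month_avg) * (s - sales_avg) for m, s in zip(months, sales)
    ) / sum((m - month_avg) ** 2 for m in months)
    intercept = sales_avg - slope * month_avg
    return slope, intercept


def predict_next(sales):
    """Predictive: estimate next month's sales from the fitted trend."""
    slope, intercept = fit_trend(sales)
    return slope * (len(sales) + 1) + intercept


def prescribe(forecast, capacity):
    """Prescriptive: turn the forecast into a recommended action."""
    if forecast > capacity:
        return "increase stock"
    return "keep stock level"


if __name__ == "__main__":
    print(describe(MONTHLY_SALES))
    forecast = predict_next(MONTHLY_SALES)
    print(forecast)
    print(prescribe(forecast, STOCK_CAPACITY))
```

Descriptive analytics answers "what happened?", predictive answers "what is likely to happen?", and prescriptive answers "what should we do about it?".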
BI Developers play a somewhat similar role to Data Analysts. The slight difference between the two is due to BI slowly shifting towards AI. BI Developers also extract data and model it into analytical reports using Python, R, SQL etc.; however, they develop further in applications such as Power BI, Tableau and other visualisation tools. Additional languages such as DAX are then used within these reporting apps to model and filter data accordingly. BI Developers manage these reports all the way from development through to security and sharing. This includes automated reports, access to modelled datasets and an understanding of OLAP and ETL.
The way I see it, Data Scientists are a jack of all trades. As I’m sure you are well aware, Data Science is a recent craze and is one of the most sought-after positions right now. Data Scientists use mathematical algorithms and statistics, embedded in code such as Python and R, to take the structuring and modelling of data to the next level. This includes building data models, analysing trends in data, identifying data errors or data anomalies, cleansing data and so much more. In addition to all the data modelling, Data Scientists also shift into the realm of Machine Learning and Artificial Intelligence. They build scalable machine learning models and personalise business data to analyse business needs. Ironically, it is often said that Data Scientists spend around 80% of their time finding data, cleaning it and modelling it before they even get to the analysis side.
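To give a feel for the cleansing and anomaly-spotting side of that work, here is a small sketch. The readings are made up, and the method (a modified z-score based on the median, with the commonly used 3.5 cut-off) is just one simple option I have picked for illustration, not the way every Data Scientist would do it.

```python
from statistics import median

# Hypothetical sensor readings with a missing value and one suspicious spike.
READINGS = [10, 11, None, 9, 10, 95, 10]


def cleanse(values):
    """Drop missing entries and convert everything to float."""
    return [float(v) for v in values if v is not None]


def find_anomalies(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not distort it.
    """
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [v for v in values if abs(0.6745 * (v - centre) / mad) > threshold]


if __name__ == "__main__":
    cleaned = cleanse(READINGS)
    print(cleaned)
    print(find_anomalies(cleaned))
```

Even in a toy example like this, most of the code deals with tidying the data rather than analysing it.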
What Else Is There?
There are many other Data Roles, such as Business Analysts, Database Administrators, ETL Developers and so forth. I personally feel that the roles I touched on today are the starting point for building a successful Data Environment in a company.
I recently wrote a post on LinkedIn about these various roles and other Data factors within a business. One of my topics was Data Culture. You can have the best Data team in your city, but if your data is messy and you have no defined culture and business rules, as much as 80% of your Data resources will be spent on cleaning data rather than analysing it. No out-of-the-box system will fit every company. Of course, you are going to have to modify it according to your business needs. But when you are building your environment and establishing your business workflows, keep in mind how your data will flow, so that your environment is built right from the start and you can minimise the time spent on data fixes and cleanses. By building your best-suited Data Team and defining your data flows, you can make your data work for you!