In a recent project, we were tasked with improving the data management and analysis for a chicken test farm in the province of Antwerp. The farm houses over 60,000 chickens in large stables, with a section for chickens that produce eggs and another for chickens that are meant for meat consumption. The stables are split into compartments, with a selection of chickens being held under different testing circumstances, including environmental and nutritional factors.
Previously, the data collected from these tests is sent in CSV files to a central server and processed in a relational SQL database with many nested tables and foreign keys. However, this method of data management is not scalable and requires a significant amount of manual preprocessing. This makes it difficult to generate thorough reports and conduct accurate analysis of the data.
To address these issues, we decided to construct a data warehouse using an SSIS pipeline in Visual Studio and Python. The data is imported into a dimensional model with facts and dimensions, which allows for more efficient querying and reporting. This structure also makes it easier to add new data or modify existing data without affecting the entire system.
The data warehouse also enables us to separate the data into facts and dimensions, which are the fundamental components of a dimensional data model. Facts are measures of the business process, such as the chicken weight, fluid and food consumption. Dimensions provide context to the facts, such as the time, location, and environment in which the measurements were taken. This separation allows for more flexible reporting and analysis.
Once the data warehouse was in place, we utilized PowerBI for reporting and analysis. PowerBI is a business intelligence tool that allows us to easily connect to the data warehouse and create interactive visualizations and reports. With PowerBI, we can quickly identify patterns and trends in the data that would have been difficult to discern with traditional methods of data analysis.
For example, we were able to create a report that shows the average food consumption of chickens by compartment over time. This report helped us identify any fluctuations in consumption and make adjustments to the testing conditions as necessary. We also created a report that shows the loss of chickens (due to death or selection) by compartment, which helped us to identify trends and/or outliers.
Overall, the project was a success in improving the data management and analysis for the chicken test farm. The use of a data warehouse and PowerBI allowed us to efficiently process and analyze large amounts of data, resulting in valuable insights that can be used to optimize the testing conditions for the chickens. Additionally, the assistance of senior data engineers with extensive experience in the field played a crucial role in the success of the project.