I'd like to share my personal view on this. There are 4 principles for building engineering data lakes:

1. Prefer unstructured data to no data at all
- It's hard to know in advance what kinds of questions you will be able to answer with your data. Take your CI log data, for instance: it can show you server utilization, even if you don't need that information right now.
2. Join all data records via connections
- Connecting your Jira issues with data from the HR system lets you filter not only by name but also by position.
3. Data should be accessible to everyone
4. Use the Goal Question Metric approach
- This approach saves a ton of time. What I see is that we tend to create a metric first and only then connect it to the team's performance. Always start from why.

You can read more about metrics in Lean Analytics (https://www.amazon.com/Lean-Analytics-Better-Startup-Faster/dp/1449335675).
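
To make principle 2 a bit more concrete, here is a minimal sketch of joining Jira issues with HR data so you can filter by position. Everything in it is an assumption for illustration: the sample records, the field names (`assignee`, `username`, `position`), and the helper functions are hypothetical and do not reflect the real Jira or HR system schemas.

```python
# Hypothetical sample records; the field names are assumptions for illustration,
# not the actual export format of Jira or of any HR system.
jira_issues = [
    {"key": "ENG-1", "assignee": "alice", "status": "Done"},
    {"key": "ENG-2", "assignee": "bob", "status": "In Progress"},
    {"key": "ENG-3", "assignee": "alice", "status": "In Progress"},
]

hr_records = [
    {"username": "alice", "position": "Senior Engineer"},
    {"username": "bob", "position": "QA Engineer"},
]


def join_issues_with_hr(issues, employees):
    """Attach each assignee's position to their issues (a left join on username)."""
    position_by_user = {e["username"]: e["position"] for e in employees}
    return [
        # Issues whose assignee is missing from the HR data get position None.
        {**issue, "position": position_by_user.get(issue["assignee"])}
        for issue in issues
    ]


def issues_by_position(joined, position):
    """Return the keys of issues assigned to people holding the given position."""
    return [row["key"] for row in joined if row["position"] == position]


joined = join_issues_with_hr(jira_issues, hr_records)
senior_issues = issues_by_position(joined, "Senior Engineer")
```

The point is the shared key: once both sources carry the same user identifier, any attribute from one system becomes a filter for the other.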