Talk: From Legacy Systems To Data Science
Speaker:
Ron Ballard
Talk description
Title:
From Legacy Systems To Data Science
Short synopsis:
What Data Science needs most of all is good data. Very often that good data is buried in the operational systems of our organisations, along with some not-so-good data. This talk is about an agile approach to building the best data we can from what we have, and making it available in a convenient form so that Data Scientists can draw trustworthy conclusions from it.
Long synopsis (optional):
My talk is about getting more out of the data that each organisation keeps. In most organisations, even small ones, this data is scattered across different systems. These "systems" may be spreadsheets, documents and data feeds, or may include commercial applications such as Customer Relationship Management and Accounting software. To use this data effectively we need to pull it together, make it consistent and find out how much we can trust it. The resulting database is usually called a Data Warehouse. Unfortunately the term Data Warehouse too often implies heavyweight methodologies and huge cost. Based on my own experience of many data warehouse projects, I know that we can build a good data warehouse at lower cost, with benefits starting just a few weeks into the project and growing over the coming years. We use an agile, incremental approach, driven by the users' most pressing needs. We use the disciplines of database design, data profiling and data quality as part of our everyday work, rather than as separate phases that delay delivery. Once we have access to accurate and reliable data from our existing systems, we may supplement the data warehouse with data from external sources. The aim is always to serve our users' needs for data, and those needs grow as users see the benefits they can get from their own data warehouse.