Dell EMC: We built this city on a data lake
Smart cities are rapidly moving from concept to reality across Australia and New Zealand, with a number of cities hiring chief digital officers, the recent launch of a local smart cities council, and a region-wide race to be the first city to go “smart”.
With such a futuristic concept, it’s easy to get caught up in all the hype and excitement. But as with any new technology in a mostly conceptual phase, it’s important for us to take a moment to stop and think about what implementation will look like.
The United Nations predicts that by 2050 around 70 per cent of the world’s population will be concentrated in urban areas, a whopping 16 percentage point increase from 2014. Couple this with Cisco’s expectation that each person will have an average of 25 connected devices in 2050, and the amount of data generated by smart cities of the future is almost incomprehensible.
A city built around trillions of 0s and 1s rather than physical infrastructure is a foreign concept, so naturally several questions spring to mind. Where will all that data live? How will we be able to quickly convert data into valuable insights? Will it all be worth the investment?
The answers to those questions? Well, it depends on the technology behind the city. For a city to be truly smart, its systems need to be able to access and process vast amounts of data on the fly. With roughly 80 per cent of today’s data growth ‘unstructured’, traditional storage approaches and analytics tools simply won’t work. What is needed is a modern data centre that keeps pace with rapid data growth, simplifies management, lowers costs and enables easy access to analytics.
This is where data lakes come in. A modernised version of the data warehouse, a data lake is scale-out storage for data consolidation. It spans from edge locations to the core data centre and right up to the cloud, and stores, manages and protects unstructured data. So, what would a smart city built on a data lake look like? And why are data lakes looking like the way for us to get smart with our cities?
Hosting diverse data
In today’s data-driven world, we’re dealing with data from many sources and of many types. There’s structured data, such as corporate databases, whose well-defined structure makes estimating future storage needs a fairly straightforward process.
Then we have unstructured data, which includes documents, videos, images, internet data and telemetry from myriad IoT devices. Such data can range from a few bytes to terabytes, meaning predicting when or how much of this data will need to be stored can be almost impossible.
Think about all the different types of data generated across a single city. Take Sydney, with its population of around five million people. You’ve got data being collected from millions of individuals across infrastructure, transport, communications, and urban planning. Or consider just the CCTV surveillance system in Brisbane, which generates data across 200 cameras, 200 access points, 12 monitors and 130 channels of analytics. This data will take multiple forms and won’t be of equal importance.
Data lakes can address this challenge by automatically moving data to its most appropriate storage tier based on parameters set around frequency of data access, age of data, type of application supported and file size.
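To make the idea concrete, here is a minimal sketch of what such a tiering policy could look like, using the four parameters mentioned above. It is illustrative only and does not describe any particular product: the tier names, thresholds, application names and the `choose_tier` function are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; a real deployment would tune these per workload.
HOT_APPLICATIONS = {"emergency-dispatch"}   # applications that always need fast storage
FREQUENT_READS = 10                         # reads in the last 30 days that count as "hot"
ARCHIVE_AGE_DAYS = 365                      # untouched data older than this is archived
LARGE_FILE_BYTES = 1024 ** 3                # 1 GiB


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    age_days: int
    reads_last_30_days: int
    application: str


def choose_tier(f: FileInfo) -> str:
    """Pick a storage tier from application type, age, access frequency and size."""
    # Type of application supported: critical services stay on the fastest tier.
    if f.application in HOT_APPLICATIONS:
        return "performance"
    # Age of data: old files that nobody has read recently go to the archive.
    if f.age_days >= ARCHIVE_AGE_DAYS and f.reads_last_30_days == 0:
        return "archive"
    # Frequency of access and file size: small, frequently read files earn fast storage.
    if f.reads_last_30_days >= FREQUENT_READS and f.size_bytes < LARGE_FILE_BYTES:
        return "performance"
    # Everything else sits on cheaper, high-capacity storage.
    return "capacity"


if __name__ == "__main__":
    files = [
        FileInfo("cctv-2015.mp4", 5 * LARGE_FILE_BYTES, 800, 0, "cctv"),
        FileInfo("incident.json", 2048, 1, 50, "emergency-dispatch"),
        FileInfo("traffic.csv", 10_000_000, 5, 40, "traffic-analytics"),
    ]
    for f in files:
        print(f.name, "->", choose_tier(f))
```

In this sketch, years-old CCTV footage that nobody has viewed drifts to the archive tier, while a small incident record used by emergency dispatch stays on the fastest storage regardless of its age.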
Keeping things secure
That smart cities will be a target for hackers is not a difficult conclusion to draw. With CCTV cameras, power grids, water supply and several other vital city services at risk of malicious attack, security must and will be a paramount consideration in the implementation of smart city initiatives. With such sensitive data at stake, our smart cities will demand robust and secure IT built around more sophisticated storage technologies such as data lakes.
In the case of storage silos, it is more difficult to maintain control of security as the varying silos force duplication and inconsistent application of governance and security policies. By consolidating data and eliminating silos, data lakes afford better oversight and control of data protection and security policies.
Faster time to insights
Smart cities need to do more than just collect massive amounts of data. To deliver real value for our society, they must be smart with it. There will need to be a huge focus on generating insight as quickly as possible to allow infrastructure and services to react.
Consider the potential for emergency services in a smart city. In the case of a road incident, emergency services will be alerted and receive the necessary information, traffic will be intelligently redirected to avoid the affected area, communications networks will be flexible in directing capacity to where it is needed, and information will be collated to inform citizens about the incident.
Data lakes offer in-place analytics, which means faster time to insight: there is no need to lose time making multiple copies of data or moving large data sets before analysing them.
The scale, complexity and demands that will be placed on urban infrastructure and services in the future are vast. Smart initiatives will drastically alter the distribution of cities, with technologies such as driverless cars, automation and augmented reality reducing the need for citizens to physically be in a city centre, and allowing the freedom to live and work anywhere. When this unfolds, people will need technology to connect them on a level we can’t yet imagine.
Data lakes can underpin the remote workforce and smart city of the future. But as with any new technology, we need to ensure our heads aren’t stuck in the clouds, and answer the base-level questions on which successful implementation and return on investment depend.
Article by Matt Zwolenski, chief technology officer, Dell EMC ANZ.