- The energy management system had to handle petabytes of data. Initially, it was to collect electricity consumption data from devices and appliances in 1,000 houses, which would add 50,000 records to the database every 15 seconds.
- Each record was to be stored and processed to allow for power consumption analysis by room, house, or device.
- The company was planning to add 4,000 houses to the system in the near future. This would increase the number of records sent to the database every 15 seconds to 200,000. The system was intended for use by electricity companies across the US, so it had to offer near-limitless scalability.
- The servers were to be located in a single time zone with the customers spread across different parts of the country. Despite this, the application was to provide accurate data for any location and time.
- The NoSQL development team chose Cassandra for large-scale data storage, as it can write to a data store 2,500 times faster than MySQL solutions. Cassandra allows power consumption data to be gathered and aggregated in a fraction of a second.
- Cassandra also makes it possible to distribute data across multiple nodes that can be scaled easily, with no single point of failure.
- Hadoop was implemented to cope with the big data management challenge. It uses clusters of computers and distributed processing to analyze petabytes of power consumption records.
- To solve the localization issue, the team proposed storing the location of each house and user in the database. Query time is calculated from the location of the user’s computer and its time difference with the server. This approach ensures that the data returned at the time of the query is accurate.
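
The time zone handling described in the last point can be illustrated with a minimal Python sketch. It is not the team's actual implementation. It assumes readings are stored with UTC timestamps and that each house record carries an IANA time zone name. The `HOUSE_TIME_ZONES` table, the house ids, and both function names are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9; needs system tz data

# Hypothetical lookup table: house id -> time zone of the house's location.
# In the system described above, this would come from the location stored
# in the database for each house.
HOUSE_TIME_ZONES = {
    "house-ny": ZoneInfo("America/New_York"),
    "house-la": ZoneInfo("America/Los_Angeles"),
}


def local_window_to_utc(house_id, local_start, local_end):
    """Convert a naive local-time query window for a house into UTC bounds."""
    tz = HOUSE_TIME_ZONES[house_id]
    start = local_start.replace(tzinfo=tz).astimezone(timezone.utc)
    end = local_end.replace(tzinfo=tz).astimezone(timezone.utc)
    return start, end


def to_house_local(house_id, utc_timestamp):
    """Convert a stored UTC timestamp back to the house's local time."""
    return utc_timestamp.astimezone(HOUSE_TIME_ZONES[house_id])


if __name__ == "__main__":
    # A user asks for consumption between 18:00 and 19:00 local time.
    start, end = local_window_to_utc(
        "house-ny", datetime(2024, 1, 15, 18, 0), datetime(2024, 1, 15, 19, 0)
    )
    print(start.isoformat(), end.isoformat())
```

With this approach the server's own time zone never enters the calculation: the query window is translated to UTC before the lookup, and results are translated back to local time for display.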
The system can scale almost without limit as the number of connected houses grows.
The solution helps the customer achieve its Green IT goals by continuously reducing the environmental impact of its products and operations.
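
To give a sense of the load behind the scaling requirement, the short Python sketch below works through the arithmetic implied by the figures quoted in this case study (50,000 and 200,000 records every 15 seconds). The function names are illustrative only, and the per-house rate assumes every house reports the same number of records.

```python
INTERVAL_SECONDS = 15  # reporting interval quoted in the case study


def records_per_house(records_per_interval, houses):
    """Average number of records one house sends per interval."""
    return records_per_interval / houses


def records_per_second(records_per_interval):
    """Sustained write rate the database has to absorb."""
    return records_per_interval / INTERVAL_SECONDS


def records_per_day(records_per_interval):
    """Total records written in 24 hours."""
    intervals_per_day = 24 * 60 * 60 // INTERVAL_SECONDS  # 5,760 intervals
    return records_per_interval * intervals_per_day


if __name__ == "__main__":
    for records in (50_000, 200_000):
        print(
            f"{records:,} records per interval: "
            f"{records_per_second(records):,.0f} writes/s, "
            f"{records_per_day(records):,} records/day"
        )
```

At the initial volume this comes to 50 records per house per interval and 288 million records per day. At 200,000 records per interval, the daily total exceeds 1.15 billion.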