Case Study
Overview
A company currently works with the local transportation sector. The company is responsible for building solutions that will help in improvement of traffic and safety.
Current Requirements
The company is currently managing the following systems
System |
Description |
Company_Response |
This system is used to monitor the traffic flow. It would be used to detect sudden changes in traffic flow patterns. It would detect any sudden stops or slowdowns. It would make use of PySaprk script to detect changes in traffic flow. Whenever there are any sudden changes in the traffic, a response team needs to be immediately notified. |
Company_track |
This system would be used to report changes in real-time based on traffic data |
Company_planning |
This system would be used to analyze data. Here, business users would create reports which would be used to analyze traffic load, correlation events, historical traffic data. |
Traffic sensors have been installed at different road junctions to monitor traffic. The sensors capture the following data
• Time
• Location in terms of latitude and longitude
• Speed in km per hour
• The license plate number
• The Length of the vehicle in meters
A sample of the data sent is shown below
{
“time” : “2019-10-15T22:15:25.72511745”,
“location” : {
“type” : “Point”,
“coordinates”:[ 22.4, 2.4] },
“speed” : 44.5,
“plate_number” : “22333W”,
“length” : 5.2
}
The traffic sensors will also capture images of the vehicle from time to time
Requirements
Below are the various requirements for the various systems
The Traffic Sensor data
• The sensors must only be able to add items to a collection named “CompanySensorDt”
• The insertion rate of the data must be maximized
• Once every 2 months, traffic sensor data must be analyzed to check for any malfunctions
• The collection “CompanySensorDt” must reside in Cosmos DB
• The impact of storing the vehicle images on the collection throughput must be minimized
The Company_track system
• This system should be able to report on data that is related to specific plate number.
• The report data would come from the “CompanySensorDt” collection
• Business users must be able to filter on vehicles that drive over a specific speed limit
The Company_planning application
• The data for this application must be stored in an Azure SQL database
• The data in the database would be sharded
• The data would be loaded from the “CompanySensorDt” collection once a week using Azure Data Factory
• A manual trigger would be used to transfer the data
Security
• Azure Active Directory should be used for all services wherever possible
• The license plate information must not be visible in the “Company_planning” database
• If there is an unauthorized usage of the “Company_planning” database, the security team must be notified immediately
• Data must be stored for a maximum of 5 years
Other requirements
• Costs should be minimized wherever possible
• The reports for the “Company_track” application must execute as quickly as possible
• An SLA of 75% is permitted for the “Company_planning” application
• It must be ensured that all data is replicated to multiple geographic locations to prevent any sort of data loss.
You have to design the “Company_planning” database. You decide to include a non-clustered columnstore index to improve performance. Is this an ideal implementation?