About & Help

Everything you need to know about the Smart Connected Vehicles system

Smart Connected Vehicles

An AI-powered road hazard detection platform that automatically identifies bumps and potholes from vehicle sensor data. Upload a CSV from your vehicle's sensors and get instant, GPS-mapped predictions — no manual labelling needed.

Research & Datasets

Training Dataset

The models were trained on the dataset introduced in:
arxiv.org/abs/2510.25211

External Validation Dataset

Model generalisation was externally validated using:
nature.com/articles/s41597-024-04193-0

Key Features

4 Specialist Models

Automatically routes to the right model based on the columns present in your CSV

GPS Mapping

Hazards are pinned on an interactive Leaflet map using GPS coordinates from your data

Analytics Dashboard

Confidence charts, prediction distribution, timeline, and class-level insights

Downloads Page

Ready-made sample CSVs for each model available from the navbar and left panel

Sample Generator

Generate synthetic data for any of the 4 CSV types directly in the browser

Fast Inference

Processes thousands of sensor readings in seconds on CPU or GPU

Prediction Classes

Normal, Bump, Pothole

Overall Performance

4 Specialist Models
97.68% Top Accuracy
3 Prediction Classes
108 Max Features

The Four Models

The system automatically detects which columns are present in your uploaded CSV and routes to the most appropriate model — no configuration required.

Automatic Routing Logic

Weather Cols | GIS Elevation | Model Used | CSV Columns
✅ Yes | ✅ Yes | 🌦️ sensor_gis_weather.pth | 43
❌ No | ✅ Yes | 📡 sensor_gis.pth | 39
✅ Yes | ❌ No | 🌤️ sensor_weather.pth | 42
❌ No | ❌ No | 📟 sensor.pth | 38

Weather Columns (4)

Temperature (°C), Humidity (%), Cloud Cover (%), Wind Speed (km/h)

GIS Column (1)

GIS_Elevation

Architecture — TCN-KAN-LSTM Fusion

All four models share the same architecture but are trained on different feature sets:

TCN

Temporal Convolutional Network with dilated convolutions captures local temporal patterns

KAN

Kolmogorov-Arnold Network models complex non-linear sensor relationships

LSTM

Long Short-Term Memory captures long-range sequence dependencies

Fusion

Learned attention weights blend TCN, KAN and LSTM outputs for final prediction
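
As a rough illustration of the fusion idea, the sketch below blends three branch outputs using softmax-normalised weights. The function names, the single score per branch, and the example numbers are assumptions made for illustration; the actual layer shapes and the way the weights are learned are not described on this page.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(branch_outputs, attention_scores):
    """Blend equal-length branch vectors with softmax-normalised weights."""
    weights = softmax(attention_scores)
    length = len(branch_outputs[0])
    return [
        sum(w * branch[i] for w, branch in zip(weights, branch_outputs))
        for i in range(length)
    ]

# Hypothetical per-class outputs from the TCN, KAN and LSTM branches.
tcn = [2.0, 0.5, 0.1]
kan = [1.5, 0.7, 0.2]
lstm = [1.8, 0.4, 0.3]

# Hypothetical attention scores; in a trained model these would be learned.
fused = fuse([tcn, kan, lstm], attention_scores=[0.0, 0.0, 0.0])
```

With equal scores the blend reduces to a plain average of the three branches; unequal scores shift the result towards the branch with the highest score.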

💡 Tip: The model badge shown after uploading tells you exactly which model processed your data — blue for sensor_gis_weather, green for sensor_gis, orange for sensor_weather, purple for sensor.

How It Works

1. Upload Detection

When you upload a CSV the system instantly checks for the presence of weather and GIS columns in the header row — before any processing begins — and selects the correct model automatically.
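
The routing check can be sketched as follows. The column names and model file names are the ones listed on this page; the function names, and the rule that all four weather columns must be present for a file to count as having weather data, are assumptions for illustration.

```python
import csv
import io

# Column names as listed in the FAQ on this page.
WEATHER_COLUMNS = frozenset({
    "Temperature (°C)",
    "Humidity (%)",
    "Cloud Cover (%)",
    "Wind Speed (km/h)",
})
GIS_COLUMN = "GIS_Elevation"

def read_header(csv_text):
    """Return the column names from the first row of a CSV string."""
    return next(csv.reader(io.StringIO(csv_text)))

def select_model(header):
    """Pick a model file from the columns present in the header row."""
    columns = {name.strip() for name in header}
    # Assumption: weather counts as present only if all four columns exist.
    has_weather = WEATHER_COLUMNS <= columns
    has_gis = GIS_COLUMN in columns
    if has_weather and has_gis:
        return "sensor_gis_weather.pth"
    if has_gis:
        return "sensor_gis.pth"
    if has_weather:
        return "sensor_weather.pth"
    return "sensor.pth"

# Only the header row is needed, so routing can happen before any processing.
sample = "seconds_elapsed,accelerometer_x,GIS_Elevation\n0.0,0.1,12.5\n"
model_file = select_model(read_header(sample))
```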

2. Column Validation

The file is validated against the required column set for the chosen model. Missing critical sensor columns will produce a clear error listing exactly what's absent.
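
A minimal sketch of this check, assuming the x/y/z shorthand in the minimum required column list under Getting Started expands to names such as accelerometer_x and orientation_roll. The helper names and the wording of the error message are hypothetical.

```python
def expand(prefix, suffixes):
    """Expand shorthand such as accelerometer_x/y/z into full column names."""
    return [f"{prefix}_{suffix}" for suffix in suffixes]

XYZ = ("x", "y", "z")

# The 38 base sensor columns required by every model.
REQUIRED_COLUMNS = (
    ["seconds_elapsed"]
    + expand("accelerometer", XYZ)
    + expand("gravity", XYZ)
    + expand("gyroscope", XYZ)
    + expand("orientation", ("qx", "qy", "qz", "qw", "roll", "pitch", "yaw"))
    + expand("magnetometer", XYZ)
    + [
        "compass_magneticBearing",
        "barometer_relativeAltitude",
        "barometer_pressure",
        "location_verticalAccuracy",
        "location_longitude",
        "location_latitude",
    ]
    + expand("totalAcceleration", XYZ)
    + expand("magnetometerUncalibrated", XYZ)
    + expand("gyroscopeUncalibrated", XYZ)
    + expand("accelerometerUncalibrated", XYZ)
)

def missing_columns(header, required=REQUIRED_COLUMNS):
    """Return the required columns absent from the header, in listed order."""
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]

def validate(header):
    """Raise an error naming every missing column; return None if all exist."""
    missing = missing_columns(header)
    if missing:
        raise ValueError("Missing required columns: " + ", ".join(missing))
```

For the GIS and weather models, the extra columns would be appended to the required list before the check.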

3. Preprocessing

4. Feature Extraction

Each raw sensor reading is expanded into a rich feature vector:

5. Model Inference

The selected TCN-KAN-LSTM model outputs a probability over 3 classes (Normal, Bump, Pothole). The class with the highest probability is the prediction; its probability is the confidence score.
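
The last step can be sketched in plain Python. This assumes the model emits one raw score (logit) per class and that the classes are ordered Normal, Bump, Pothole; both are assumptions, and the example scores are made up.

```python
import math

# Assumed class order; the real index-to-label mapping may differ.
CLASSES = ("Normal", "Bump", "Pothole")

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return (label, confidence) for the most probable class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

# Hypothetical scores for a single sensor reading.
label, confidence = predict([0.2, 2.9, 0.4])
```

Here the second score is the largest, so the label is Bump and the confidence is the probability assigned to that class.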

6. Results

Each prediction is paired with the GPS coordinates of its sensor reading. Detected hazards are saved to the database, and the results are returned to the frontend, where they are plotted on the interactive map and summarised in the analytics dashboard.
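
A minimal sketch of the pairing step, assuming one prediction per CSV row and that only Bump and Pothole rows are kept as hazards. The function name and the output field names are hypothetical; the coordinate column names come from the required column list.

```python
def hazards_with_location(rows, predictions):
    """Pair each non-Normal prediction with the GPS fix of its row."""
    hazards = []
    for row, (label, confidence) in zip(rows, predictions):
        if label == "Normal":
            continue  # only hazards are kept for the map
        hazards.append({
            "latitude": float(row["location_latitude"]),
            "longitude": float(row["location_longitude"]),
            "label": label,
            "confidence": confidence,
        })
    return hazards

# Hypothetical rows as csv.DictReader would yield them (values are strings).
rows = [
    {"location_latitude": "51.5007", "location_longitude": "-0.1246"},
    {"location_latitude": "51.5010", "location_longitude": "-0.1250"},
]
predictions = [("Normal", 0.98), ("Pothole", 0.93)]
hazards = hazards_with_location(rows, predictions)
```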

Getting Started

Minimum Required Columns (all models)

seconds_elapsed, accelerometer_x/y/z, gravity_x/y/z, gyroscope_x/y/z, orientation_qx/qy/qz/qw/roll/pitch/yaw, magnetometer_x/y/z, compass_magneticBearing, barometer_relativeAltitude, barometer_pressure, location_verticalAccuracy, location_longitude, location_latitude, totalAcceleration_x/y/z, magnetometerUncalibrated_x/y/z, gyroscopeUncalibrated_x/y/z, accelerometerUncalibrated_x/y/z
💡 Tip: Not sure which CSV to use? Start with predict_sensor_gis_weather.csv from the Downloads page — it works with the most complete model.

Video Tutorial

Watch the step-by-step walkthrough of the full system:

Topics Covered

Frequently Asked Questions

How does the system know which model to use?
It reads the CSV header row and checks for the 4 weather columns (Temperature (°C), Humidity (%), Cloud Cover (%), Wind Speed (km/h)) and the GIS_Elevation column. The combination determines the model: weather and GIS → sensor_gis_weather.pth, GIS only → sensor_gis.pth, weather only → sensor_weather.pth, neither → sensor.pth.
What do the coloured badges mean?
The badge in the file info box (before upload) and the result badge (after upload) both show which model was selected: 🔵 Blue = sensor_gis_weather, 🟢 Green = sensor_gis, 🟠 Orange = sensor_weather, 🟣 Purple = sensor.
Where can I get sample CSV files?
From the Downloads button in the navbar or the Download Sample CSV button in the left panel. Both lead to the /downloads page, which has all 4 files with descriptions. You can also generate synthetic data with the Sample Data Generator.
How accurate are the models?
The top model (sensor_gis_weather.pth) achieves 97.68% validation accuracy. Accuracy may vary depending on sensor calibration and road conditions. The confidence score on each prediction indicates how certain the model is.
What does the confidence score mean?
A value from 0 to 1: the probability the model assigns to the predicted class. For example, 0.95 means the model is 95% confident. Predictions above 0.90 are generally very reliable; those below 0.70 should be treated with caution.
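
If you post-process exported results, these thresholds could be applied along the following lines. The function name and the three tier labels are hypothetical, and how to treat the exact boundary values 0.90 and 0.70 is an assumption.

```python
def reliability(confidence):
    """Map a confidence score to a rough reliability tier."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > 0.90:
        return "high"
    if confidence < 0.70:
        return "low"
    return "medium"  # 0.70 to 0.90 inclusive

tiers = [reliability(c) for c in (0.95, 0.80, 0.65)]
```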
Is my data stored on the server?
Uploaded CSV files are processed in memory and not saved. Only detected hazard coordinates and predictions are stored in the local SQLite database for the map and history views.
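
For illustration only, storing hazards in SQLite might look like the sketch below. The table name, the schema, and the function name are hypothetical and are not taken from the application; an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

def save_hazards(conn, hazards):
    """Insert hazard records into a (hypothetical) hazards table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS hazards (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               latitude REAL NOT NULL,
               longitude REAL NOT NULL,
               label TEXT NOT NULL,
               confidence REAL NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO hazards (latitude, longitude, label, confidence) "
        "VALUES (?, ?, ?, ?)",
        [
            (h["latitude"], h["longitude"], h["label"], h["confidence"])
            for h in hazards
        ],
    )
    conn.commit()

# Only the hazard location and prediction are stored, not the raw CSV.
conn = sqlite3.connect(":memory:")
save_hazards(conn, [
    {"latitude": 51.501, "longitude": -0.125, "label": "Pothole", "confidence": 0.93},
])
stored = conn.execute("SELECT label, confidence FROM hazards").fetchall()
```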
Can I generate sample data without real sensors?
Yes. The Sample Data Generator lets you create synthetic sensor readings for any of the 4 CSV types. You can control the number of samples, hazard percentage, bump/pothole ratio, location, and (for applicable types) weather values.
How do I export my results?
After running a prediction, scroll down to the Detailed Results table. Use the Export buttons to download results as CSV or JSON.
What datasets were used to train and validate the models?
The models were trained on the dataset described in arxiv.org/abs/2510.25211. External validation was performed using an independent dataset published in Scientific Data (nature.com/articles/s41597-024-04193-0).