Air Support explores the prediction of air quality in office environments and represents the practical part of my master’s thesis.
Poor indoor air quality (IAQ) is one of the greatest environmental health risks and also causes billions of dollars in financial losses through non-fatal health effects. The problem is that people today spend about 90% of their time indoors, where pollutant levels are often high.
Air Support therefore explores the following research question, with a view to developing a digital assistant that supports manual window ventilation:
Is it possible to successfully predict indoor air quality parameters using data from a real use case in an office environment?
Research Design and Outcome
The insights of this research are also available as an academic presentation.
Air Support, inspired by CRISP and DASC, followed a five-step research design consisting of Business Understanding, Data Understanding, Data Preparation, Modeling, and Evaluation. According to the business understanding developed in the first step, indoor CO2 and PM (Particulate Matter) are crucial parameters of indoor air quality in office environments and were chosen as the prediction targets. Window opening, humidity, temperature, wind speed and wind direction were considered as further influencing factors. The parameters were measured for the indoor and outdoor environment using sensors and third-party APIs. To record the data, an IoT infrastructure covering 10 test rooms was implemented in the office building of the cooperation partner artiso solutions GmbH. A cloud architecture was used to process the data. Air Support collected 2,000,000 data points over the course of the project. Based on the data understanding, the Data Preparation step produced a high-quality dataset. The following figure shows a snippet of the dataset.
The Modeling step defined the forecast target as the absolute value of each target parameter 20 minutes ahead. It identified recurrent neural networks, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, in combination with a direct multi-step forecasting strategy, as suitable approaches. These were tested with different combinations of the features that correlate most strongly with the target parameters: CO2 and Particulate Matter of the indoor and outdoor environment. The following table lists the architectures that were used during modeling.
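To illustrate how a direct forecast target of this kind can be framed as a supervised learning problem, here is a minimal sketch using NumPy only. The function name `make_direct_windows`, the window length, and the 5-minute sampling interval (which makes 20 minutes equal to 4 steps) are assumptions for illustration and are not taken from the project.

```python
import numpy as np


def make_direct_windows(series, n_lags, horizon_steps):
    """Build (X, y) pairs for direct forecasting.

    X[i] holds n_lags consecutive observations; y[i] is the absolute value
    observed horizon_steps after the last observation in X[i].
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags - horizon_steps + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(n_samples)])
    first_target = n_lags + horizon_steps - 1
    y = series[first_target:first_target + n_samples]
    return X, y


# Assumed example: one reading every 5 minutes, so 20 minutes ahead = 4 steps.
co2 = np.arange(10)  # placeholder values standing in for a CO2 series
X, y = make_direct_windows(co2, n_lags=3, horizon_steps=4)
```

Each row of `X` would be fed to the recurrent network as one input sequence, and the network is trained to output the matching entry of `y` directly instead of predicting step by step.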
Although the errors are low, the developed models are rated negatively in the evaluation phase. The forecasts are not effective predictions of the future but delayed representations of the real values. The delay in the forecasts could not be eliminated despite numerous experiments. The following figure contrasts a CO2 prediction with the true values.
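One way to make such a delay measurable is to shift the forecast against the true series and look for the shift with the lowest error. The sketch below is not the evaluation procedure used in the thesis; the function name `best_matching_lag` and the synthetic series are assumptions for illustration.

```python
import numpy as np


def best_matching_lag(y_true, y_pred, max_lag):
    """Return the shift (in steps) at which y_pred best matches y_true.

    A lag of k compares y_pred[t] with y_true[t - k]. A best lag greater
    than zero suggests the forecast trails the real values by k steps.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = []
    for lag in range(max_lag + 1):
        if lag == 0:
            errors.append(float(np.mean(np.abs(y_pred - y_true))))
        else:
            errors.append(float(np.mean(np.abs(y_pred[lag:] - y_true[:-lag]))))
    return int(np.argmin(errors)), errors


# Synthetic check: a "forecast" that simply repeats the value from 4 steps ago.
y_true = 600.0 + 100.0 * np.sin(np.arange(200) / 7.0)
y_pred = np.roll(y_true, 4)
y_pred[:4] = y_true[0]
lag, errors = best_matching_lag(y_true, y_pred, max_lag=8)
```

If the best lag is close to the forecast horizon, the model is doing little more than repeating the last observed value, which is consistent with low error values despite poor predictive use.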
Key Data of the Project
- Start: 02/08/2021
- End: 12/06/2021
- Status: finished
- PlatformIO: Programming of the microcontrollers
- NodeJS (TypeScript): Development of the Azure Functions
- Python: Data Science
  - JupyterLab: Data Science Notebooks
  - matplotlib: Data Analysis
  - tensorflow: Deep Learning
- Docker: Development (VSCode devcontainer)
- Microsoft Azure: Cloud infrastructure
  - IoT Hub: Authentication and management of IoT devices
  - Service Bus: Decoupling of IoT Hub and downstream Azure Function
  - Functions: Event-Driven Serverless Compute
    - Service Bus Trigger for processing sensor data
    - Timer Function to retrieve data from third-party APIs
    - HTTP Trigger for processing a Microsoft Teams Outgoing Webhook
  - Key Vault: Storage of Secrets
  - Database for PostgreSQL with TimescaleDB: Serverless Time Series Database
- alembic: Database Migrations with SQLAlchemy
- Terraform: Infrastructure As Code for managing Azure cloud infrastructure
- Microsoft Teams: Webhooks
The IoT gateway acts as the central entry point of the IoT infrastructure to the cloud. It provides security for the connection of IoT devices and routes all data packets to a queue.
The queue between the IoT gateway and the IAQ Data & Notification Service decouples the ingress of data from its processing. The queue can hold messages for longer periods of time, which enables sequential and asynchronous processing. Besides lowering the performance requirements on the consumer, this mitigates temporary consumer failures: messages remain in the queue until the consumer is available again.
The IAQ Data Service is responsible for storing the sensor data in the database. If the IAQ Data Service is available and there is a new message in the queue, the contained measured values are saved to the database.
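The decoupling described above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-process `queue.Queue` stands in for the cloud queue, an in-memory SQLite table stands in for the time series database, and the function `drain_queue` as well as the message fields `source_id`, `recorded_at` and `co2` are hypothetical.

```python
import json
import queue
import sqlite3


def drain_queue(messages, connection):
    """Store every waiting message in the database, one at a time."""
    stored = 0
    while True:
        try:
            raw = messages.get_nowait()
        except queue.Empty:
            return stored
        record = json.loads(raw)
        connection.execute(
            "INSERT INTO measurement (source_id, recorded_at, co2) VALUES (?, ?, ?)",
            (record["source_id"], record["recorded_at"], record.get("co2")),
        )
        connection.commit()
        stored += 1


# Stand-ins (assumptions): queue.Queue for the cloud queue, SQLite for the database.
messages = queue.Queue()
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE measurement (source_id TEXT, recorded_at TEXT, co2 REAL)"
)

# Messages arrive while the consumer is offline and simply wait in the queue.
messages.put(json.dumps(
    {"source_id": "room-1", "recorded_at": "2021-06-01T12:00:00", "co2": 640.0}
))
messages.put(json.dumps(
    {"source_id": "room-2", "recorded_at": "2021-06-01T12:00:05"}
))

# Once the consumer is available again, it works through the backlog in order.
stored = drain_queue(messages, connection)
row_count = connection.execute("SELECT COUNT(*) FROM measurement").fetchone()[0]
```

Unlike this sketch, which removes a message before it is written, a production consumer would normally acknowledge a message only after the insert has succeeded, so that a failed write does not lose data.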
The IAQ Notification Service and the IAQ Request Service provide assistance functions in Microsoft Teams. The two services are integrated into channels via webhooks and can publish messages in them. The IAQ Notification Service is a heuristic assistant that alerts room occupants to problematic air quality based on threshold values. The IAQ Request Service makes it possible to retrieve the current values of all rooms ad hoc via the chat function in Microsoft Teams.
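A threshold-based heuristic of this kind can be sketched as follows. The limit values, the function name `build_alert`, and the message wording are assumptions for illustration; the thresholds actually used by the IAQ Notification Service are not stated here.

```python
# Hypothetical limits; the thresholds used in the project are not documented here.
CO2_LIMIT_PPM = 1000.0
PM_LIMIT_UG_M3 = 25.0


def build_alert(room, co2_ppm=None, pm_ug_m3=None):
    """Return an alert text if any available reading exceeds its limit, else None."""
    reasons = []
    if co2_ppm is not None and co2_ppm > CO2_LIMIT_PPM:
        reasons.append(f"CO2 at {co2_ppm:.0f} ppm")
    if pm_ug_m3 is not None and pm_ug_m3 > PM_LIMIT_UG_M3:
        reasons.append(f"PM at {pm_ug_m3:.0f} ug/m3")
    if not reasons:
        return None
    return f"{room}: air quality is problematic ({', '.join(reasons)}). Please ventilate."


quiet = build_alert("Room 1", co2_ppm=650.0, pm_ug_m3=8.0)
alert = build_alert("Room 2", co2_ppm=1450.0, pm_ug_m3=8.0)
```

The returned text would then be posted to the room's channel through the webhook; missing readings are skipped so that a sensor module without a PM sensor does not trigger or block an alert.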
The data structure consists of 4 fundamental entities:
- Measurement contains the values of all parameters that can be measured. The optional attributes, marked with NULL, provide the flexibility to store only those values that a specific sensor module provides.
- Location is the place of a measurement. This can be a test room or the outdoor area. Through the simple association to Measurement, a measurement can be traced back to a location. Location also has multiple associations to DataSource, so that multiple data sources can be installed in one location.
- DataSource is a source of sensor data. It can be a sensor module or a third-party API. A DataSource can be associated with any number of Measurements. This means that a measurement can also be traced back to a sensor module and can be uniquely located.
- Notification is used to coordinate the sending of notifications to a Location and must not be misunderstood as an explicit notification. There is a simple association between Notification and Location.
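The four entities and their associations can be sketched with plain dataclasses. This is an illustrative model, not the project's actual schema: all attribute names (for example `recorded_at`, `pm`, `last_sent_at`) are assumptions, and only the entity names and relationships come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Location:
    id: int
    name: str  # a test room or the outdoor area


@dataclass
class DataSource:
    id: int
    name: str  # a sensor module or a third-party API
    location_id: int  # several DataSources may point to the same Location


@dataclass
class Measurement:
    id: int
    recorded_at: datetime
    location_id: int
    data_source_id: int
    # Optional (nullable) values: a sensor module fills only what it measures.
    co2: Optional[float] = None
    pm: Optional[float] = None
    temperature: Optional[float] = None
    humidity: Optional[float] = None


@dataclass
class Notification:
    id: int
    location_id: int
    # Hypothetical coordination field, e.g. to avoid repeated alerts.
    last_sent_at: Optional[datetime] = None


office = Location(id=1, name="Room 1")
sensor = DataSource(id=7, name="sensor-module-a", location_id=office.id)
reading = Measurement(
    id=100,
    recorded_at=datetime(2021, 6, 1, 12, 0),
    location_id=office.id,
    data_source_id=sensor.id,
    co2=640.0,
)
```

Because every Measurement carries both a `location_id` and a `data_source_id`, a reading can be traced back to the room and to the module that produced it, while unset attributes stay `None`.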