
S7 Data Modification Attacks using an Industrial Control System Testbed

Kellerer, Nicolai 1; Sánchez Collado, Gustavo 1; Alberto, Hermenegildo 1; Hagenmeyer, Veit 1; Elbez, Ghada 1
1 Institut für Automation und angewandte Informatik (IAI), Karlsruher Institut für Technologie (KIT)


Original publication
DOI: 10.5281/zenodo.15373938
Affiliated institution(s) at KIT: Institut für Automation und angewandte Informatik (IAI)
Publication type: Research data
Publication date: 22 May 2025
Identifier: KITopen-ID: 1000181992
HGF program: 46.23.02 (POF IV, LK 01) Engineering Security for Energy Systems
License: Creative Commons Attribution 4.0 International
Readme

This dataset contains data modification attacks on the S7 protocol. Three data sources are provided:

Network packets

Logs from the Siemens PLC 1512 (Battery power plant) and the PLC 1516 (PV power plant)

Process data from the Control Center

The Apache Parquet files are compressed with Gzip. Use pandas with the PyArrow backend to read them:

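import pandas as pd

# engine="pyarrow" requires the pyarrow package; Gzip decompression is handled automatically.
logs_df = pd.read_parquet("logs_dataset.parquet", engine="pyarrow")
packets_df = pd.read_parquet("packets_dataset.parquet", engine="pyarrow")
process_df = pd.read_parquet("process_dataset.parquet", engine="pyarrow")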

The attack scripts and a prototype Intrusion Detection System (IDS) are provided in the S7 Attacks GitHub repository.

Packet Data Source

Features:

timestamp

packet: binary representation of the network packet after the potential data modification attack

label: normal or data_modification

datamod_changes: list of changes to the original packet performed by the attack script

s7_datablock: parsed user-defined section of the S7 BSEND/BRCV packet

The S7 Attacks repository includes the datablock definitions, which are needed to extract the process values from the user-defined section of the S7 BSEND/BRCV packet.
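As a starting point for working with the label feature, the following minimal sketch splits the packet records into normal and attack subsets. The helper name split_by_label and the demo values are hypothetical; only the column names timestamp, packet and label come from the feature list above, and the real dataset may use different dtypes.

```python
import pandas as pd


def split_by_label(packets_df: pd.DataFrame) -> dict[str, pd.DataFrame]:
    """Return one sub-frame per distinct value of the `label` column."""
    return {label: group for label, group in packets_df.groupby("label")}


# Tiny synthetic stand-in for packets_dataset.parquet (values are made up).
demo = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2025-01-01 00:00:00", "2025-01-01 00:00:01", "2025-01-01 00:00:02"]
        ),
        "packet": [b"\x03\x00", b"\x03\x01", b"\x03\x02"],
        "label": ["normal", "data_modification", "normal"],
    }
)

parts = split_by_label(demo)
# Fraction of records that were modified by the attack script.
attack_share = len(parts["data_modification"]) / len(demo)
```

With the real data, the same call would be split_by_label(packets_df), assuming packets_df was loaded as shown earlier.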

Process Data Source

Features:

archive_recv_ts

in_pv_temp_air

in_pv_wind_speed

in_pv_poa_direct

in_pv_poa_diffuse

in_pv_cell_temperature

in_pv_inverter_ac_power

in_pv_inverter_dc_power

in_batt_state_of_charge

in_batt_voltage

in_batt_current

in_batt_actual_charge_power

in_batt_temperature

state_batt_voltage

state_batt_current

state_batt_state_of_charge

state_batt_stored_energy

out_pv_on_off

out_pv_target_power

out_batt_on_off

out_batt_target_power

The process data was collected from the control center at 1-second intervals. archive_recv_ts is the timestamp at which the record was saved in the database.
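Because archive_recv_ts records when a row was saved rather than when it was measured, the spacing between rows may deviate slightly from one second. The sketch below shows one way to flag larger gaps. The helper name find_gaps and the 500 ms tolerance are assumptions chosen for illustration, not part of the dataset, and the column is assumed to be convertible with pd.to_datetime.

```python
import pandas as pd


def find_gaps(
    ts: pd.Series,
    expected: pd.Timedelta = pd.Timedelta(seconds=1),
    tolerance: pd.Timedelta = pd.Timedelta(milliseconds=500),
) -> pd.Series:
    """Return the time differences that exceed expected + tolerance."""
    ordered = pd.to_datetime(ts).sort_values()
    delta = ordered.diff()  # first entry is NaT and never counts as a gap
    return delta[delta > expected + tolerance]


# Synthetic timestamps with one missing stretch between 00:00:02 and 00:00:05.
demo_ts = pd.Series(
    [
        "2025-01-01 00:00:00",
        "2025-01-01 00:00:01",
        "2025-01-01 00:00:02",
        "2025-01-01 00:00:05",
    ]
)
gaps = find_gaps(demo_ts)
```

On the real data this would be called as find_gaps(process_df["archive_recv_ts"]).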

Every signal is prefixed with in, out or state:

in: Monitoring signal sent from power plant to control center

out: Control signal sent from control center to power plant

state: Values calculated by the power flow algorithm, derived from the in_batt_* values
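The prefix convention makes it easy to separate monitoring, control and calculated signals programmatically. The following is a minimal sketch; the helper name group_signals is hypothetical, and columns without one of the three prefixes (such as archive_recv_ts) are simply skipped.

```python
from collections import defaultdict

PREFIXES = ("in", "out", "state")


def group_signals(columns):
    """Group column names by their in/out/state prefix."""
    groups = defaultdict(list)
    for name in columns:
        prefix = name.split("_", 1)[0]
        if prefix in PREFIXES:
            groups[prefix].append(name)
    return dict(groups)


# A subset of the feature names listed above.
demo_columns = [
    "archive_recv_ts",
    "in_pv_temp_air",
    "in_batt_voltage",
    "state_batt_voltage",
    "out_pv_on_off",
]
groups = group_signals(demo_columns)
```

With the loaded data, group_signals(process_df.columns) would give the full grouping, and process_df[groups["out"]] would select only the control signals.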

Logs Data Source

Features:

server_recv_ts: timestamp when log message was received by rsyslog

device_ts: timestamp when log message was sent by the PLC (starts at 2015-05-18)

hostname: name of the PLC

field_name: Actual_Charge_Power (Battery), Target_Charge_Power (Battery), OnOff (PV), Poa_direct (PV), Poa_diffuse (PV), Inverter_ac_power (PV), Inverter_dc_power (PV), Cell_Temperature (PV), Temp_Air (PV)

old_value: value before tripping the threshold 

new_value: value after tripping the threshold

The PLCs for the battery and the PV power plants emit a log message only when a process value goes above or below a defined threshold. The process values were extracted by parsing the log messages with regex rules.
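Since device_ts starts at 2015-05-18 while server_recv_ts reflects the time of reception, the two clocks have to be aligned before the logs can be correlated with the other data sources. The sketch below estimates a single offset as the median difference between the two columns. This assumes that the offset is roughly constant and that both columns can be converted with pd.to_datetime; neither is confirmed by the dataset description, and the helper name estimate_clock_offset is hypothetical.

```python
import pandas as pd


def estimate_clock_offset(logs_df: pd.DataFrame) -> pd.Timedelta:
    """Median of server_recv_ts - device_ts, used as a constant offset."""
    server = pd.to_datetime(logs_df["server_recv_ts"])
    device = pd.to_datetime(logs_df["device_ts"])
    return (server - device).median()


# Synthetic log records (timestamps are made up for illustration).
demo_logs = pd.DataFrame(
    {
        "server_recv_ts": [
            "2025-01-01 00:00:10",
            "2025-01-01 00:00:20",
            "2025-01-01 00:00:30",
        ],
        "device_ts": [
            "2015-05-18 00:00:00",
            "2015-05-18 00:00:10",
            "2015-05-18 00:00:20",
        ],
    }
)

offset = estimate_clock_offset(demo_logs)
# Shift the PLC timestamps onto the server's time base.
aligned_device_ts = pd.to_datetime(demo_logs["device_ts"]) + offset
```

The median is used here because it is less sensitive than the mean to individual messages that were delayed in transit.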

Funding

This research is supported in part by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs (structure 46.23.02).

Type of research data: Dataset