nilspodlib.session.SyncedSession#
- class nilspodlib.session.SyncedSession(datasets: Iterable[Dataset])[source]#
Object representing a collection of Datasets recorded with active synchronisation.
A session can access all the same attributes and most of the methods provided by a dataset. However, instead of returning a single value or acting only on a single dataset, it will return a tuple of values (one for each dataset) or modify all datasets of the session. You can also use the self.info object to access header information of all datasets at the same time. All return values will be in the same order as self.datasets.
To synchronise a dataset, you usually want to call cut_to_syncregion on the session. The resulting session is considered fully synchronised (depending on the parameters chosen). This means that all datasets have the same length and the exact same counter. However, note that the header information of the individual datasets will not be updated to reflect the sync. This means that header values like number_of_samples or the start and stop times will not match the data anymore. As a substitute, you can use a set of direct attributes on the session (e.g. session_utc_start, session_duration, etc.).
- Attributes:
- datasets
A tuple of the datasets belonging to the session
info
Get metainfo for all datasets.
master
Get the dataset belonging to the sync-server sensor.
slaves
Get the datasets belonging to all the sync-clients sensors.
session_utc_start
Start time of the session as utc timestamp.
session_utc_stop
Stop time of the session as utc timestamp.
session_duration
Duration of the session in seconds.
session_utc_datetime_start
Start time of the session as utc datetime.
session_utc_datetime_stop
Stop time of the session as utc datetime.
session_local_datetime_start
Start time of the session in the specified timezone of the session.
session_local_datetime_stop
Stop time of the session in the specified timezone of the session.
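The "tuple of values, one per dataset" behaviour described above can be sketched with a minimal proxy class (the stand-in datasets and their fields here are hypothetical; the real class is nilspodlib.dataset.Dataset):

```python
from types import SimpleNamespace


class MiniSession:
    """Minimal sketch of SyncedSession attribute broadcasting:
    reading an attribute returns a tuple with one entry per dataset,
    in the same order as self.datasets."""

    def __init__(self, datasets):
        self.datasets = tuple(datasets)

    def __getattr__(self, name):
        # Fan the attribute access out to every dataset.
        return tuple(getattr(d, name) for d in self.datasets)


# Two hypothetical datasets standing in for real recordings.
session = MiniSession([
    SimpleNamespace(sampling_rate_hz=102.4, sensor_id="ab12"),
    SimpleNamespace(sampling_rate_hz=102.4, sensor_id="cd34"),
])

print(session.sensor_id)         # ('ab12', 'cd34')
print(session.sampling_rate_hz)  # (102.4, 102.4)
```

The real session implements considerably more (validation, dedicated methods), but the ordering guarantee — results follow self.datasets — is the part this sketch captures.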
- VALIDATE_ON_INIT
If True, all synced sessions will be checked on init. These checks include testing whether all datasets are really part of a single measurement. In rare cases, it might be useful to deactivate these checks and force the creation of a synced session. In this case you need to set this class attribute to False before loading the session:
>>> SyncedSession.VALIDATE_ON_INIT = False
>>> SyncedSession.from_folder_path('./my/path')  # No validation will be performed
Methods
align_to_syncregion([cut_start, cut_end, ...])
Align all datasets based on regions where they were synchronised to the master.
calibrate_imu(calibrations[, inplace])
Calibrate the IMUs of all datasets by providing a list of calibration infos.
cut([start, stop, step, inplace])
Apply Dataset.cut to all datasets of the session.
cut_counter_val([start, stop, step, inplace])
Apply Dataset.cut_counter_val to all datasets of the session.
cut_to_syncregion([start, end, warn_thres, ...])
Apply Dataset.cut_to_syncregion to all datasets of the session.
data_as_df([datastreams, index, ...])
Export all datasets of the session in a list of (or a single) pandas DataFrame.
downsample(factor[, inplace])
Apply Dataset.downsample to all datasets of the session.
find_calibrations([folder, recursive, ...])
Apply Dataset.find_calibrations to all datasets of the session.
find_closest_calibration([folder, ...])
Apply Dataset.find_closest_calibration to all datasets of the session.
from_file_paths(paths[, legacy_support, ...])
Create a new session from a list of files pointing to valid .bin files.
from_folder_path(base_path[, ...])
Create a new session from a folder path containing valid .bin files.
get_dataset_by_id(sensor_id)
Get a specific dataset by its sensor_type id.
imu_data_as_df([index, include_units, concat_df])
Export the acc and gyro datastreams of all datasets in a list of (or a single) pandas DataFrame.
validate()
Check if basic properties of a synced session are fulfilled.
- __init__(datasets: Iterable[Dataset])[source]#
Create new synced session.
Instead of this init you can also use the factory methods from_file_paths and from_folder_path.
This init performs basic validation on the datasets. See validate for details.
- Parameters:
- datasets
List of nilspodlib.dataset.Dataset instances, which should be grouped into a session.
- align_to_syncregion(cut_start: bool = False, cut_end: bool = False, inplace: bool = False, warn_thres: Optional[int] = 30) Self [source]#
Align all datasets based on regions where they were synchronised to the master.
At the end all datasets are cut to the same length, so that the maximum overlap between all datasets is preserved.
- Parameters:
- cut_start
Whether the dataset should be cut at info.sync_index_start. If False, a new corrected counter value will be calculated for all packages before the first sync package. Usually it can be assumed that this extrapolation is valid for multiple seconds before the first package.
- cut_end
Whether the dataset should be cut at info.sync_index_stop. Usually it can be assumed that the data will be synchronous for multiple seconds after the last sync package. Therefore, it might be acceptable to just ignore the last sync package and only cut the start of the dataset.
- warn_thres
Threshold in seconds from the end of a dataset. If the last sync package occurred more than warn_thres seconds before the end of the dataset, a warning is emitted. Use warn_thres = None to silence it. This is not relevant if the end of the dataset is cut (i.e. cut_end=True).
- inplace
If the operation should be performed on the current Session object or on a copy.
- Warns:
- UserWarning
If a sync package occurred far before the last sample in any of the datasets. See the warn_thres argument.
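The alignment idea can be sketched with toy counters (invented values, not the library's implementation): each dataset is cut so that only the counter range shared by all datasets remains, preserving the maximum overlap.

```python
# Toy counters of two hypothetical datasets after sync-correction.
counter_a = list(range(100, 260))   # samples 100..259
counter_b = list(range(140, 300))   # samples 140..299

# Maximum overlap is the intersection of the counter ranges.
start = max(counter_a[0], counter_b[0])
stop = min(counter_a[-1], counter_b[-1]) + 1

cut_a = [c for c in counter_a if start <= c < stop]
cut_b = [c for c in counter_b if start <= c < stop]

# After alignment both datasets have the same length and identical counters.
assert cut_a == cut_b
print(len(cut_a))  # 120 shared samples
```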
- calibrate_imu(calibrations: Iterable[Union[CalibrationInfo, path_t]], inplace: bool = False) Self [source]#
Calibrate the IMUs of all datasets by providing a list of calibration infos.
If you do not want to calibrate a specific IMU, you can pass None for its position.
- Parameters:
- calibrations
List of calibration infos in the same order as self.datasets
- inplace
If True this method modifies the current session object. If False, a copy of the session and all dataset objects is created
- cut(start: Optional[int] = None, stop: Optional[int] = None, step: Optional[int] = None, inplace: bool = False) Self [source]#
Apply Dataset.cut to all datasets of the session.
See nilspodlib.dataset.Dataset.cut for more details. The docstring of this method is included below:
Cut all datastreams of the dataset.
This is equivalent to applying the following slicing to all datastreams and the counter: array[start:stop:step]
Warning
This will not modify any values in the header/info of the dataset, i.e. the number of samples in the header or the sync index values. Using methods that rely on these values might result in unexpected behaviour. For example, cut_to_syncregion will not work correctly if cut or cut_counter_val was used before.
- Parameters:
- start
Start index
- stop
Stop index
- step
Step size of the cut
- inplace
If True this method modifies the current dataset object. If False, a copy of the dataset and all datastream objects is created
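The stated slicing equivalence can be checked directly, with plain Python lists standing in for datastreams and the counter:

```python
counter = list(range(1000, 1100))    # hypothetical 100-sample counter
acc = [i * 0.1 for i in range(100)]  # hypothetical datastream

start, stop, step = 10, 50, 2

# cut is documented as array[start:stop:step], applied identically to
# every datastream and to the counter.
cut_counter = counter[start:stop:step]
cut_acc = acc[start:stop:step]

assert cut_counter[0] == 1010
assert len(cut_counter) == len(cut_acc) == 20
```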
- cut_counter_val(start: Optional[int] = None, stop: Optional[int] = None, step: Optional[int] = None, inplace: bool = False) Self [source]#
Apply Dataset.cut_counter_val to all datasets of the session.
See nilspodlib.dataset.Dataset.cut_counter_val for more details. The docstring of this method is included below:
Cut the dataset based on values in the counter and not the index.
Instead of just cutting the datastream based on its index, it is cut based on the counter value. This is equivalent to applying the following pandas-style slicing to all datastreams and the counter: array.loc[start:stop:step]
Warning
This will not modify any values in the header/info of the dataset, i.e. the number of samples in the header or the sync index values. Using methods that rely on these values might result in unexpected behaviour. For example, cut_to_syncregion will not work correctly if cut or cut_counter_val was used before.
- Parameters:
- start
Start value in counter
- stop
Stop value in counter
- step
Step size of the cut
- inplace
If True this method modifies the current dataset object. If False, a copy of the dataset and all datastream objects is created
Notes
The method searches the respective indices for the start and the stop value in the counter and calls cut with these values. The step size is passed through unmodified (i.e. it will not respect downsampling or similar operations done beforehand).
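The note above — find the indices of the start/stop counter values, then slice by index — can be sketched with the standard library (toy counter; note that pandas-style .loc slicing is stop-inclusive, which the bisect_right call mimics):

```python
import bisect

# Hypothetical counter that does not start at zero.
counter = list(range(5000, 5200))


def cut_counter_val(counter, start_val, stop_val, step=1):
    """Sketch: translate counter values to indices, then slice
    (the real method delegates these indices to Dataset.cut)."""
    start_idx = bisect.bisect_left(counter, start_val)
    stop_idx = bisect.bisect_right(counter, stop_val)  # .loc-style: stop inclusive
    return counter[start_idx:stop_idx:step]


cut = cut_counter_val(counter, 5010, 5050)
assert cut[0] == 5010 and cut[-1] == 5050
assert len(cut) == 41
```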
- cut_to_syncregion(start: bool = True, end: bool = False, warn_thres: Optional[int] = 30, inplace: bool = False) Self [source]#
Apply Dataset.cut_to_syncregion to all datasets of the session.
See nilspodlib.dataset.Dataset.cut_to_syncregion for more details. The docstring of this method is included below:
Cut the dataset to the region indicated by the first and last sync package received from the master.
This cuts the dataset to the values indicated by info.sync_index_start and info.sync_index_stop. In case the dataset was a sync-master (info.sync_role = 'master') this will have no effect and the dataset will be returned unmodified.
Warning
This function should not be used after any other methods that can modify the counter (e.g. cut or downsample).
Warning
This will not modify any values in the header/info of the dataset, i.e. the number of samples in the header or the sync index values. Using methods that rely on these values might result in unexpected behaviour.
- Parameters:
- start
Whether the dataset should be cut at info.sync_index_start. If this is False, a jump in the counter will remain. The only use case for not cutting at the start is when the counters are already perfectly aligned.
- end
Whether the dataset should be cut at info.sync_index_stop. Usually it can be assumed that the data will be synchronous for multiple seconds after the last sync package. Therefore, it might be acceptable to just ignore the last sync package and only cut the start of the dataset.
- warn_thres
Threshold in seconds from the end of a dataset. If the last sync package occurred more than warn_thres seconds before the end of the dataset, a warning is emitted. Use warn_thres = None to silence it. This is not relevant if the end of the dataset is cut (i.e. end=True).
- inplace
If True this method modifies the current dataset object. If False, a copy of the dataset and all datastream objects is created
- Raises:
- ValueError
If the dataset does not have any sync infos
- Warns:
- UserWarning
If a sync package occurred far before the last sample in the dataset. See the warn_thres argument.
Notes
Usually, to work with multiple synchronised datasets, a SyncedSession should be used instead of cutting the datasets manually. SyncedSession.cut_to_syncregion will cover multiple edge cases involving multiple datasets, which cannot be handled by this method.
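In spirit, cutting to the sync region is plain slicing between info.sync_index_start and info.sync_index_stop (the indices below are invented for illustration; real values come from the sync packages recorded in the header):

```python
# Hypothetical slave dataset: 1000 samples, sync packages seen
# at sample 37 and sample 951.
samples = list(range(1000))
sync_index_start, sync_index_stop = 37, 951

# start=True, end=True: keep only the region bounded by sync packages.
synced = samples[sync_index_start:sync_index_stop]
assert len(synced) == 951 - 37

# start=True, end=False (the default trade-off described above):
# cut only the start and accept the extrapolated tail.
synced_tail = samples[sync_index_start:]
assert synced_tail[0] == 37
```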
- data_as_df(datastreams: Optional[Sequence[str]] = None, index: Optional[str] = None, include_units: Optional[bool] = False, concat_df: Optional[bool] = False) Union[Tuple[pd.DataFrame], pd.DataFrame] [source]#
Export all datasets of the session in a list of (or a single) pandas DataFrame.
- Parameters:
- datastreams
Optional list of datastream names, if only specific ones should be included. Datastreams that are not part of the current dataset will be silently ignored.
- index
Specify which index should be used for each dataset. The options are:
"counter": for the actual counter
"time": for the time in seconds since the first sample
"utc": for the utc time stamp of each sample
"utc_datetime": for a pandas DateTime index in UTC time
"local_datetime": for a pandas DateTime index in the timezone set for the session
None: for a simple index (0...N)
- include_units
If True the column names will have the unit of the datastream concatenated with an _
- concat_df
If True the individual dfs from each dataset will be concatenated. This is only supported if the session is properly cut to the sync region and all the datasets have the same counter.
Notes
This method calls the data_as_df methods of each Datastream object and then concats the results. Therefore, it will use the column information of each datastream.
- Returns:
- Session as single or multiple dataframes
Tuple of pd.DataFrames (one for each Dataset) or a single DataFrame if concat_df is set to True
- Raises:
- ValueError
If any other than the allowed index values are used.
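One plausible derivation of the index variants from a counter and a sampling rate (toy numbers; the real method reads start time and rate from the dataset header, and the exact formulas here are an assumption for illustration):

```python
from datetime import datetime, timezone

sampling_rate_hz = 100.0
utc_start = 1_600_000_000          # hypothetical session start (utc seconds)
counter = list(range(1000, 1005))  # hypothetical counter snippet

# "time": seconds since the first sample
time_index = [(c - counter[0]) / sampling_rate_hz for c in counter]

# "utc": utc timestamp of each sample
utc_index = [utc_start + c / sampling_rate_hz for c in counter]

# "utc_datetime": timezone-aware datetimes in UTC
utc_dt_index = [datetime.fromtimestamp(t, tz=timezone.utc) for t in utc_index]

assert all(abs(a - b) < 1e-9
           for a, b in zip(time_index, [0.0, 0.01, 0.02, 0.03, 0.04]))
assert utc_dt_index[0].tzinfo is timezone.utc
```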
- downsample(factor: int, inplace: bool = False) Self [source]#
Apply Dataset.downsample to all datasets of the session.
See nilspodlib.dataset.Dataset.downsample for more details. The docstring of this method is included below:
Downsample all datastreams by a factor.
This applies scipy.signal.decimate to all datastreams and the counter of the dataset. See nilspodlib.datastream.Datastream.downsample for details.
Warning
This will not modify any values in the header/info of the dataset, i.e. the number of samples in the header or the sync index values. Using methods that rely on these values might result in unexpected behaviour. For example, cut_to_syncregion will not work correctly if cut, cut_counter_val, or downsample was used before.
- Parameters:
- factor
Factor by which the dataset should be downsampled.
- inplace
If True this method modifies the current dataset object. If False, a copy of the dataset and all datastream objects is created
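A rough sketch of the effect on sample count (scipy.signal.decimate low-pass filters before keeping every factor-th sample; plain striding here only mimics the resulting length, not the filtering):

```python
def downsampled_length(n_samples: int, factor: int) -> int:
    # Ceiling division: decimating by q keeps ceil(n / q) samples,
    # the same count as simple striding array[::q].
    return -(-n_samples // factor)


counter = list(range(1024))
factor = 4

strided = counter[::factor]
assert len(strided) == downsampled_length(len(counter), factor) == 256
```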
- find_calibrations(folder: Optional[path_t] = None, recursive: bool = True, filter_cal_type: Optional[str] = None, ignore_file_not_found: Optional[bool] = False)[source]#
Apply Dataset.find_calibrations to all datasets of the session.
See nilspodlib.dataset.Dataset.find_calibrations for more details. The docstring of this method is included below:
Find all calibration infos that belong to a given sensor_type.
As this only checks the filenames, this might return false positives depending on your folder structure and naming.
- Parameters:
- folder
Basepath of the folder to search. If None, tries to find a default calibration
- recursive
If the folder should be searched recursively or not.
- filter_cal_type
Whether only files obtained with a certain calibration type should be found. This will look for the CalType inside the json file and hence can cause performance problems. If None, all found files will be returned. For possible values, see the imucal library.
- ignore_file_not_found
If True this function will not raise an error, but rather return an empty list, if no calibration files were found for the specific sensor_type.
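Because the search is filename-based (as warned above), its core can be sketched with pathlib alone — the file layout and naming scheme below are hypothetical, not the library's actual convention:

```python
import tempfile
from pathlib import Path

# Build a throwaway folder with hypothetical calibration files.
root = Path(tempfile.mkdtemp())
(root / "sub").mkdir()
(root / "ab12_2023-01-01.json").touch()
(root / "sub" / "ab12_2023-06-01.json").touch()
(root / "cd34_2023-01-01.json").touch()


def find_calibrations(folder: Path, sensor_id: str, recursive: bool = True):
    """Sketch: match calibration files for one sensor by filename only."""
    pattern = f"{sensor_id}_*.json"
    files = folder.rglob(pattern) if recursive else folder.glob(pattern)
    return sorted(files)


hits = find_calibrations(root, "ab12")
assert len(hits) == 2  # recursive search also finds the nested file
assert len(find_calibrations(root, "ab12", recursive=False)) == 1
```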
- find_closest_calibration(folder: Optional[path_t] = None, recursive: bool = True, filter_cal_type: Optional[str] = None, before_after: Optional[str] = None, ignore_file_not_found: Optional[bool] = False)[source]#
Apply Dataset.find_closest_calibration to all datasets of the session.
See nilspodlib.dataset.Dataset.find_closest_calibration for more details. The docstring of this method is included below:
Find the closest calibration info to the start of the measurement.
As this only checks the filenames, this might return false positives depending on your folder structure and naming.
- Parameters:
- folder
Basepath of the folder to search. If None, tries to find a default calibration
- recursive
If the folder should be searched recursively or not.
- filter_cal_type
Whether only files obtained with a certain calibration type should be found. This will look for the CalType inside the json file and hence can cause performance problems. If None, all found files will be returned. For possible values, see the imucal library.
- before_after
Can either be 'before' or 'after', if the search should be limited to calibrations that were either before or after the specified date.
- warn_thres
If the distance to the closest calibration is larger than this threshold, a warning is emitted
- ignore_file_not_found
If True this function will not raise an error, but rather return None, if no calibration files were found for the specific sensor_type.
- classmethod from_file_paths(paths: Iterable[path_t], legacy_support: str = 'error', force_version: Optional[Version] = None, tz: Optional[str] = None) Self [source]#
Create a new session from a list of files pointing to valid .bin files.
- Parameters:
- paths
List of paths pointing to files to be included
- legacy_support
This indicates how to deal with old firmware versions:
"error": an error is raised if an unsupported version is detected.
"warn": a warning is raised, but the file is parsed without modification.
"resolve": a legacy conversion is performed to load old files. If no suitable conversion is found, an error is raised.
See the legacy package and the README to learn more about available conversions.
- force_version
Instead of relying on the version provided in the session header, the legacy support will be determined based on the version provided here. This is only used if legacy_support="resolve". This option can be helpful when testing with development firmware images that don't have official version numbers.
- tz
Optional timezone str of the recording. This can be used to localize the start and end time. Note, this should not be the timezone of your current PC, but the timezone relevant for the specific recording.
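The three legacy_support modes follow a common dispatch pattern, sketched here with a stand-in version check (the minimum version and conversion step are invented; the real checks live in the nilspodlib legacy package):

```python
import warnings

SUPPORTED = (0, 14, 0)  # hypothetical minimum firmware version


def check_legacy(version: tuple, legacy_support: str = "error") -> tuple:
    """Sketch of the error/warn/resolve dispatch for old firmware."""
    if version >= SUPPORTED:
        return version
    if legacy_support == "error":
        raise ValueError(f"Unsupported firmware version: {version}")
    if legacy_support == "warn":
        warnings.warn(f"Old firmware version {version}; parsing unmodified")
        return version
    if legacy_support == "resolve":
        # A real implementation would look up and apply a conversion here.
        return SUPPORTED
    raise ValueError(f"Unknown legacy_support value: {legacy_support!r}")


assert check_legacy((0, 15, 0)) == (0, 15, 0)
assert check_legacy((0, 11, 0), legacy_support="resolve") == SUPPORTED
```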
- classmethod from_folder_path(base_path: path_t, filter_pattern: str = '*.bin', legacy_support: str = 'error', force_version: Optional[Version] = None, tz: Optional[str] = None) Self [source]#
Create a new session from a folder path containing valid .bin files.
- Parameters:
- base_path
Path to the folder
- filter_pattern
Glob pattern that can be used to filter the files in the folder. This is passed to pathlib.Path.glob().
- legacy_support
This indicates how to deal with old firmware versions:
"error": an error is raised if an unsupported version is detected.
"warn": a warning is raised, but the file is parsed without modification.
"resolve": a legacy conversion is performed to load old files. If no suitable conversion is found, an error is raised.
See the legacy package and the README to learn more about available conversions.
- force_version
Instead of relying on the version provided in the session header, the legacy support will be determined based on the version provided here. This is only used if legacy_support="resolve". This option can be helpful when testing with development firmware images that don't have official version numbers.
- tz
Optional timezone str of the recording. This can be used to localize the start and end time. Note, this should not be the timezone of your current PC, but the timezone relevant for the specific recording.
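The tz parameter only changes how timestamps are displayed, not the instant they describe — the same behaviour as localizing with the standard zoneinfo module (the timestamp and zone below are hypothetical):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

utc_start = 1_600_000_000  # hypothetical session_utc_start (2020-09-13 12:26:40 UTC)

utc_dt = datetime.fromtimestamp(utc_start, tz=timezone.utc)
local_dt = utc_dt.astimezone(ZoneInfo("Europe/Berlin"))

# Same instant, different wall-clock representation (CEST = UTC+2).
assert utc_dt == local_dt
assert local_dt.utcoffset().total_seconds() == 7200
print(utc_dt.isoformat(), "->", local_dt.isoformat())
```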
- get_dataset_by_id(sensor_id: str) Dataset [source]#
Get a specific dataset by its sensor_type id.
- Parameters:
- sensor_id
Four letter/digit unique id of the sensor
- imu_data_as_df(index: Optional[str] = None, include_units: Optional[bool] = False, concat_df: Optional[bool] = False) Union[Tuple[pd.DataFrame], pd.DataFrame] [source]#
Export the acc and gyro datastreams of all datasets in a list of (or a single) pandas DataFrame.
- Parameters:
- index
Specify which index should be used for each dataset. The options are:
"counter": for the actual counter
"time": for the time in seconds since the first sample
"utc": for the utc time stamp of each sample
"utc_datetime": for a pandas DateTime index in UTC time
"local_datetime": for a pandas DateTime index in the timezone set for the session
None: for a simple index (0...N)
- include_units
If True the column names will have the unit of the datastream concatenated with an _
- concat_df
If True the individual dfs from each dataset will be concatenated. This is only supported if the session is properly cut to the sync region and all the datasets have the same counter.
Notes
This method calls the data_as_df methods of each Datastream object and then concats the results. Therefore, it will use the column information of each datastream.
- Returns:
- Imu data as single or multiple dataframes
Tuple of pd.DataFrames (one for each Dataset) or a single DataFrame if concat_df is set to True
- Raises:
- ValueError
If any other than the allowed index values are used.
- validate() None [source]#
Check if basic properties of a synced session are fulfilled.
- Raises:
- ValueError
This raises a ValueError in the following cases:
- One or more of the datasets are not part of the same syncgroup/same sync channel
- Multiple datasets are marked as "master"
- One or more datasets indicate that they are not synchronised
- One or more datasets have a different sampling rate than the others
- The recording times of the provided datasets do not have any overlap
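The listed checks can be sketched over plain metadata records — the field names below (sync_group, sync_role, is_synchronised, utc_start, utc_stop) are hypothetical stand-ins for the real header attributes:

```python
from types import SimpleNamespace


def validate(datasets) -> None:
    """Sketch of the documented checks; raises ValueError on the
    first violated property."""
    if len({d.sync_group for d in datasets}) != 1:
        raise ValueError("Datasets are not part of the same sync group")
    if sum(d.sync_role == "master" for d in datasets) > 1:
        raise ValueError("Multiple datasets are marked as master")
    if any(not d.is_synchronised for d in datasets):
        raise ValueError("One or more datasets are not synchronised")
    if len({d.sampling_rate_hz for d in datasets}) != 1:
        raise ValueError("Datasets have different sampling rates")
    if max(d.utc_start for d in datasets) >= min(d.utc_stop for d in datasets):
        raise ValueError("Recording times do not overlap")


good = [
    SimpleNamespace(sync_group=1, sync_role="master", is_synchronised=True,
                    sampling_rate_hz=102.4, utc_start=0, utc_stop=100),
    SimpleNamespace(sync_group=1, sync_role="slave", is_synchronised=True,
                    sampling_rate_hz=102.4, utc_start=10, utc_stop=110),
]
validate(good)  # passes silently
```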