Core Functions and Classes#

This section documents the core functions and classes that form the foundation of eegprep.

Main Pipeline#

eegprep.bids_preproc(root, *, ApplyChanlocs=None, ApplyEvents=None, ApplyMetadata=None, EventColumn=None, Subjects=None, Sessions=None, Runs=None, Tasks=None, SkipIfPresent=True, NumJobs=None, ReservePerJob='', UseHashes=False, ReturnData=False, OutputDir=None, SamplingRate=None, OnlyChannelsWithPosition=True, OnlyModalities=(), WithInterp=False, WithICA=False, WithPicard=False, ICAAlgorithm='runica', AmicaArgs=None, WithICLabel=False, WithReport=True, CommonAverageReference=True, ChannelCriterion=0.8, LineNoiseCriterion=4.0, BurstCriterion=5.0, WindowCriterion=0.25, Highpass=(0.25, 0.75), ChannelCriterionMaxBadTime=0.5, BurstCriterionRefMaxBadChns=0.075, BurstCriterionRefTolerances=(-inf, 5.5), BurstRejection='off', WindowCriterionTolerances=(-inf, 7), FlatlineCriterion=5.0, NumSamples=50, NoLocsChannelCriterion=0.45, NoLocsChannelCriterionExcluded=0.1, MaxMem=64, Distance='euclidian', Channels=None, Channels_ignore=None, availableRAM_GB=None, EpochEvents=None, EpochLimits=(-1, 2), EpochBaseline=None, StageNames=('desc-cleaned', 'desc-picard', 'desc-iclabel', 'desc-epoch'), FinalDesc=None, ReportDir=None, MinimizeDiskUsage=True, SaveIntermediateStages=False, IntermediateDir=None, bidschanloc=None, bidsevent=None, bidsmetadata=None, eventtype=None, subjects=None, sessions=None, runs=None, tasks=None, outputdir=None, _lock=<contextlib.nullcontext object>, _n_skipped=None, _k=0, _n_total=1, _n_jobs=1, _t0=1781298326.2115788)

Apply data cleaning to EEG files in a BIDS dataset.

Parameters#

rootstr

The root directory containing BIDS data or a single EEG file path.

ApplyMetadatabool

Whether to apply metadata from BIDS sidecar files when loading raw EEG data. (default True)

ApplyEventsbool

Whether to apply events from BIDS sidecar files when loading raw EEG data. (default False)

ApplyChanlocsbool

Whether to apply channel locations from BIDS sidecar files when loading raw EEG data. (default True)

EventColumnstr

Optionally the column name in the BIDS events file to use for event types; if not set, will be inferred heuristically.

SubjectsSequence[str | int], optional

A sequence of subject identifiers or (zero-based) indices to filter the files by. If empty, all subjects are included.

SessionsSequence[str | int], optional

A sequence of session identifiers or (zero-based) indices to filter the files by. If empty, all sessions are included.

RunsSequence[str | int], optional

A sequence of run numbers or identifiers to filter the files by. If empty, all runs are included. Unlike subjects and sessions, runs are not selected by zero-based index, since run identifiers are already integers.

TasksSequence[str] | str, optional

A sequence of task names or single task to filter the files by. If empty, all tasks are included (default is an empty sequence).
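The filter semantics described for Subjects, Sessions, and Runs can be sketched in plain Python. This is only an illustration of the documented behavior, not eegprep's implementation; the helper name select_entities, its allow_index flag, and the exact label-matching rule are assumptions.

```python
from typing import List, Sequence, Union


def select_entities(available: Sequence[str],
                    wanted: Sequence[Union[str, int]],
                    allow_index: bool = True) -> List[str]:
    """Filter entity labels by identifier or (optionally) zero-based index.

    Hypothetical helper: mirrors the documented rules, nothing more.
    """
    if not wanted:
        # An empty filter keeps every entity.
        return list(available)
    chosen = []
    for item in wanted:
        if isinstance(item, int) and allow_index:
            # Integers are positions into the sorted list of entities.
            chosen.append(available[item])
        else:
            # Otherwise the value is treated as an identifier.
            label = str(item)
            if label in available:
                chosen.append(label)
    return chosen


subjects = ["01", "02", "05"]
print(select_entities(subjects, [0, "05"]))            # first subject plus "05"
print(select_entities(["1", "2", "3"], [2], allow_index=False))  # run number 2
```

For runs, allow_index=False models the note that integers are run numbers rather than positions.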

OutputDirstr

The name of the subdirectory where cleaned files will be saved. This can start with the placeholder ‘{root}’ which will be replaced with the root path of the BIDS dataset. Defaults to ‘{root}/derivatives/eegprep’ if not specified.
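As a minimal sketch of the ‘{root}’ placeholder described above: the helper below is hypothetical and only models the prefix substitution and the documented default. How eegprep treats values that do not start with the placeholder is not modeled here.

```python
def resolve_output_dir(root, output_dir=None):
    """Expand a leading '{root}' placeholder (illustrative helper only)."""
    # Documented default when OutputDir is not specified.
    template = output_dir or "{root}/derivatives/eegprep"
    if template.startswith("{root}"):
        return root + template[len("{root}"):]
    # Anything else is returned unchanged in this sketch.
    return template


print(resolve_output_dir("/data/ds001"))
print(resolve_output_dir("/data/ds001", "{root}/derivatives/eegprep-strict"))
```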

SkipIfPresentbool

Whether to skip processing files that already have a cleaned version present.

NumJobsint, optional

The number of jobs to run in parallel. If set to -1, this will default to the number of logical cores on the system. If ReservePerJob is also specified, this will be treated as a maximum, otherwise as the total. If neither of the two parameters is specified, a single job will run. Note: as usual when running multiple processes in Python, you need to use the if __name__ == "__main__": guard pattern in your main processing script.
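A processing script using the guard pattern might look like the sketch below. The keyword names come from the signature above; the chosen values, the helper names, and the EEGPREP_DEMO_ROOT environment variable are assumptions made for illustration only.

```python
import os


def preproc_settings():
    """Keyword arguments for eegprep.bids_preproc (illustrative values)."""
    return {
        "NumJobs": -1,          # -1: default to the number of logical cores
        "SkipIfPresent": True,  # reuse cleaned files that already exist
        "ReturnData": False,    # avoid holding every dataset in memory
    }


def main(root):
    # Imported lazily so that worker processes re-importing this module
    # do not trigger any processing on import.
    import eegprep
    return eegprep.bids_preproc(root, **preproc_settings())


if __name__ == "__main__":
    # EEGPREP_DEMO_ROOT is a hypothetical variable used only in this sketch.
    root = os.environ.get("EEGPREP_DEMO_ROOT")
    if root:
        main(root)
```

Keeping all work inside main() means that importing the script, as child processes may do, has no side effects.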

ReservePerJobstr

Optionally the resource amount and type to reserve per job, e.g. ‘4GB’ or ‘2CPU’; the run will then use as many jobs as fit within the system resources of the specified type. You can add a margin after a colon, as in ‘4GB:10GB’ or ‘2CPU:10%’. You can also specify a total or maximum number of jobs, such as ‘10total’ or ‘10max’. Multiple criteria can be provided as a comma-separated list, for example ‘4GB:20%, 2CPU, 5max’. If neither ReservePerJob nor NumJobs is specified, a single job will run. The system also runs serially in debug mode and on platforms that do not cleanly support multiprocessing. Tip: a good way to size this is to perform a serial run, monitor how much peak RAM a single job takes, and then set this to <PeakUsage>GB:<YourMargin>GB, where YourMargin is however much you want to leave to other programs, e.g., 5GB (this will depend on what else you expect to be running on the machine).
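To make the ReservePerJob syntax concrete, here is a small parser for the forms listed above. It is a sketch of the grammar as described in this entry, not the parser eegprep uses; the function name, the returned dictionary layout, and the set of accepted margin units are assumptions.

```python
import re
from typing import Dict, List, Optional

# amount + kind, optionally followed by ':' + margin + margin unit
_CRITERION = re.compile(
    r"^(?P<amount>\d+(?:\.\d+)?)\s*(?P<kind>GB|CPU|total|max)"
    r"(?::\s*(?P<margin>\d+(?:\.\d+)?)\s*(?P<margin_unit>GB|CPU|%))?$",
    re.IGNORECASE,
)


def parse_reserve_per_job(spec: str) -> List[Dict[str, Optional[object]]]:
    """Split a comma-separated ReservePerJob string into criteria."""
    criteria = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty specs and trailing commas
        match = _CRITERION.match(part)
        if match is None:
            raise ValueError(f"unrecognized criterion: {part!r}")
        margin = match.group("margin")
        criteria.append({
            "amount": float(match.group("amount")),
            "kind": match.group("kind"),
            "margin": float(margin) if margin else None,
            "margin_unit": match.group("margin_unit"),
        })
    return criteria


for criterion in parse_reserve_per_job("4GB:20%, 2CPU, 5max"):
    print(criterion)
```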

UseHashesbool

Whether to bake hashes into intermediate file names; if you experiment with alternative preprocessing settings, it is recommended to enable this or disable the SkipIfPresent option since otherwise the routine may pick up a stale result.

ReturnDatabool

Whether to return the final EEG data objects as a list. Note that this can use quite a lot of memory for large studies and it may be better to iterate over the preprocessed files in downstream analyses.

OnlyChannelsWithPositionbool

Whether to retain only channels for which positions were recorded or could be inferred. If this is not set, then OnlyModalities should be set so as to retain only modalities that should be preprocessed together.

OnlyModalitiesSequence[str], optional

If set, retain only channels that have the associated modalities. If enabled, this is typically set to [‘EEG’] but may also include other ExG modalities such as EOG or EMG that have the same unit and scale as EEG. If non-electrophysiological modalities are included, some artifact removal steps may not function correctly.

SamplingRatefloat

Desired sampling rate for the preprocessed data. If not specified, will retain the original sampling rate.

WithInterpbool

Whether to reinterpolate dropped channels, thus retaining the same channel count as the raw data.

WithICAbool

Whether to apply PICARD ICA decomposition after cleaning.

AmicaArgsdict or None

Additional keyword arguments for AMICA when ICAAlgorithm='amica', e.g. {'num_models': 2, 'max_iter': 500}.

WithICLabelbool

Whether to apply ICLabel classification after ICA. Normally requires WithICA=True.

CommonAverageReferencebool

Whether to transform the EEG data to a common average referencing scheme; recommended for cross-study processing.

ChannelCriterionfloat or ‘off’

Minimum channel correlation threshold for channel cleaning; channels below this value are considered bad. Pass ‘off’ to skip channel criterion. Default 0.8.

LineNoiseCriterionfloat or ‘off’

Z-score threshold for line-noise contamination; channels exceeding this are considered bad. ‘off’ disables line-noise check. Default 4.0.

BurstCriterionfloat or ‘off’

ASR standard-deviation cutoff for high-amplitude bursts; values above this relative to calibration data are repaired (or removed if BurstRejection=’on’). ‘off’ skips ASR. Default 5.0.

WindowCriterionfloat or ‘off’

Fraction (0-1) or count of channels allowed to be bad per window; windows with more bad channels are removed. ‘off’ disables final window removal. Default 0.25.

Highpasstuple(float, float) or ‘off’

Transition band [low, high] in Hz for initial high-pass filtering. ‘off’ skips drift removal. Default (0.25, 0.75).

ChannelCriterionMaxBadTimefloat

Maximum tolerated time (seconds or fraction of recording) a channel may be flagged bad before being removed. Default 0.5.

BurstCriterionRefMaxBadChnsfloat or ‘off’

Maximum fraction of bad channels tolerated when selecting calibration data for ASR. ‘off’ uses all data for calibration. Default 0.075.

BurstCriterionRefTolerancestuple(float, float) or ‘off’

Power Z-score tolerances for selecting calibration windows in ASR. ‘off’ uses all data. Default (-inf, 5.5).

BurstRejectionstr

‘on’ to reject (drop) burst segments instead of reconstructing with ASR, ‘off’ to apply ASR repair. Default ‘off’.

WindowCriterionTolerancestuple(float, float) or ‘off’

Power Z-score bounds for final window removal. ‘off’ disables this stage. Default (-inf, 7).

FlatlineCriterionfloat or ‘off’

Maximum flatline duration in seconds; channels exceeding this are removed. ‘off’ disables flatline removal. Default 5.0.

NumSamplesint

Number of RANSAC samples for channel cleaning. Default 50.

NoLocsChannelCriterionfloat

Correlation threshold for fallback channel cleaning when no channel locations. Default 0.45.

NoLocsChannelCriterionExcludedfloat

Fraction of channels excluded when assessing correlation in nolocs cleaning. Default 0.1.

MaxMemint

Maximum memory in MB for ASR processing. Default 64.

Distancestr

Distance metric for ASR processing (‘euclidian’). Default ‘euclidian’.

ChannelsSequence[str] or None

List of channel labels to include before cleaning (pop_select). Default None.

Channels_ignoreSequence[str] or None

List of channel labels to exclude before cleaning. Default None.

availableRAM_GBfloat or None

Available system RAM in GB to adjust MaxMem. Default None.

EpochEventsstr or Sequence[str] or None

Optionally a list of event types or a regular expression matching the event types at which to time-lock epochs. If None (default), no epoching is done. If [], epochs are time-locked to every event in the data (warning: this can greatly increase the amount of data if epochs overlap!)

EpochLimitsSequence[float]

The time limits in seconds relative to the event markers for epoching. Default (-1, 2).

EpochBaselineSequence[float] or None

Optionally a time range in seconds relative to the event markers for baseline correction. If None (default), no baseline correction is applied. Within a range, an endpoint of None refers to the respective end of the epoch limits, as in (None, 0).
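The interplay between EpochLimits and a None endpoint in EpochBaseline can be sketched as follows. The helper is hypothetical and only restates the rule given above.

```python
def resolve_baseline(limits, baseline):
    """Fill None endpoints of a baseline range from the epoch limits."""
    if baseline is None:
        # No baseline correction requested.
        return None
    start, end = baseline
    return (limits[0] if start is None else start,
            limits[1] if end is None else end)


# With the default limits (-1, 2), (None, 0) covers the pre-event interval.
print(resolve_baseline((-1, 2), (None, 0)))
```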

StageNamesSequence[str]

List of file name parts for the preprocessing stages, in the order cleaning, ICA, ICLabel, epoching (matching the four default entries); these can be adjusted when working with different preprocessed versions (e.g., using different parameters for cleaning). It is recommended that these start with ‘desc-’.

FinalDescstr or None

Optional desc- label for the final output file. If None (default), uses the last stage name from StageNames. If empty string ‘’, the output file has no desc- label (e.g., sub-01_task-rest_eeg.set instead of sub-01_task-rest_desc-cleaned_eeg.set).
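The effect of FinalDesc on the output name can be illustrated with a small helper. This is a sketch under assumptions: the helper name is hypothetical, FinalDesc is taken to be the full ‘desc-…’ part, and the None case literally uses the last entry of StageNames, whereas the real routine may depend on which stages actually ran.

```python
DEFAULT_STAGES = ("desc-cleaned", "desc-picard", "desc-iclabel", "desc-epoch")


def final_file_name(prefix, final_desc=None, stage_names=DEFAULT_STAGES,
                    suffix="eeg.set"):
    """Assemble an output file name from its BIDS-style parts."""
    # None falls back to the last stage name; '' drops the label entirely.
    desc = stage_names[-1] if final_desc is None else final_desc
    parts = [prefix] + ([desc] if desc else []) + [suffix]
    return "_".join(parts)


print(final_file_name("sub-01_task-rest", ""))
print(final_file_name("sub-01_task-rest", "desc-cleaned"))
```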

ReportDirstr or None

Optional directory for report JSON files. If None (default), reports are saved alongside the data files. If set (e.g., ‘code/reports’), reports are saved there relative to the output directory.

MinimizeDiskUsagebool

Whether to minimize disk usage by not saving some intermediate files (specifically the PICARD output if WithICLabel=False). Default True.

bidsmetadatabool

alias for ApplyMetadata

bidseventbool

alias for ApplyEvents

bidschanlocbool

alias for ApplyChanlocs

eventtypestr

alias for EventColumn

subjectsSequence[str | int], optional

alias for Subjects

sessionsSequence[str | int], optional

alias for Sessions

runsSequence[str | int], optional

alias for Runs

tasksSequence[str] | str, optional

alias for Tasks

outputdirstr

alias for OutputDir

Returns#

resultDict[str,Any] | List[Dict[str, Any]] | None

If ReturnData is enabled, either a list of EEG objects (if a BIDS root folder was specified) or a single EEG object (if a single file was specified); otherwise None.

Return type:

Dict[str, Any] | List[Dict[str, Any]] | None
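Because the return value can be a list, a single EEG dictionary, or None, downstream code may want to normalize it. The helper below is a hypothetical convenience that relies only on the return type documented above.

```python
def as_eeg_list(result):
    """Normalize a bids_preproc-style return value to a list of EEG dicts."""
    if result is None:
        # ReturnData was not enabled: nothing was returned.
        return []
    if isinstance(result, dict):
        # A single file was processed.
        return [result]
    # A BIDS root was processed: already a sequence of EEG dicts.
    return list(result)


# Made-up stand-ins for EEG dictionaries, for demonstration only.
print(len(as_eeg_list({"srate": 250})))
print(len(as_eeg_list([{"srate": 250}, {"srate": 500}])))
```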

eegprep.bids_list_eeg_files(root, subjects=(), sessions=(), runs=(), tasks=())

Return a list of all EEG raw-data files in a BIDS dataset.

Parameters#

rootstr

The root directory containing BIDS data.

subjectsSequence[str | int], optional

A sequence of subject identifiers or (zero-based) indices to filter the files by. If empty, all subjects are included.

sessionsSequence[str | int], optional

A sequence of session identifiers or (zero-based) indices to filter the files by. If empty, all sessions are included.

runsSequence[str | int], optional

A sequence of run numbers or identifiers to filter the files by. If empty, all runs are included. Unlike subjects and sessions, runs are not selected by zero-based index, since run identifiers are already integers.

tasksSequence[str] | str, optional

A sequence of task names or single task to filter the files by. If empty, all tasks are included (default is an empty sequence).

Returns#

List[str]

A list of file paths to EEG files in the BIDS dataset.

Return type:

List[str]
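Since the result is a flat list of paths, a common follow-up is to group the files by subject. The sketch below relies only on the BIDS convention that file names carry a sub-<label> entity; the helper name and the example paths are made up for illustration.

```python
import re
from collections import defaultdict

_SUBJECT = re.compile(r"sub-([A-Za-z0-9]+)")


def group_by_subject(paths):
    """Group file paths by the first sub-<label> entity found in each path."""
    groups = defaultdict(list)
    for path in paths:
        match = _SUBJECT.search(path)
        key = match.group(1) if match else "unknown"
        groups[key].append(path)
    return dict(groups)


# Example paths as bids_list_eeg_files might return them (made up here).
files = [
    "/data/ds/sub-01/eeg/sub-01_task-rest_eeg.set",
    "/data/ds/sub-01/eeg/sub-01_task-oddball_eeg.set",
    "/data/ds/sub-02/eeg/sub-02_task-rest_eeg.set",
]
for subject, subject_files in group_by_subject(files).items():
    print(subject, len(subject_files))
```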

Data Validation#

eegprep.eeg_checkset(EEG, *checks, load_data=True)

Validate and set up EEG dataset structure.

Ensures EEG dict has required fields with correct types, computes ICA activations if possible, and loads data from file if specified.

Interactive Session#

class eegprep.EEGPrepSession(EEG=<factory>, ALLEEG=<factory>, CURRENTSET=<factory>, ALLCOM=<factory>, LASTCOM='', STUDY=None, CURRENTSTUDY=0, PLUGINLIST=<factory>)

Bases: object

EEGLAB-like GUI state without module globals.

Parameters:
EEG: dict[str, Any] | list[dict[str, Any]]
ALLEEG: list[dict[str, Any]]
CURRENTSET: list[int]
ALLCOM: list[str]
LASTCOM: str = ''
STUDY: dict[str, Any] | None = None
CURRENTSTUDY: int = 0
PLUGINLIST: list[dict[str, Any]]
add_change_listener(listener)

Register a callback that runs after session state changes.

Parameters:

listener (Callable[[EEGPrepSession], None])

Return type:

None

remove_change_listener(listener)

Remove a previously registered session change callback.

Parameters:

listener (Callable[[EEGPrepSession], None])

Return type:

None

add_command_echo_listener(listener)

Register a callback for GUI commands to display in the console.

Parameters:

listener (Callable[[str], None])

Return type:

None

remove_command_echo_listener(listener)

Remove a previously registered command echo callback.

Parameters:

listener (Callable[[str], None])

Return type:

None

add_gui_action_listener(listener)

Register a callback for GUI action start/end notifications.

Parameters:

listener (Callable[[str, str], None])

Return type:

None

remove_gui_action_listener(listener)

Remove a previously registered GUI action callback.

Parameters:

listener (Callable[[str, str], None])

Return type:

None

begin_gui_action(action)

Notify listeners that a GUI action is about to run.

Parameters:

action (str)

Return type:

None

end_gui_action(action)

Notify listeners that a GUI action has finished.

Parameters:

action (str)

Return type:

None

gui_action(action)

Wrap a user-triggered GUI action for console/output synchronization.

Parameters:

action (str)

Return type:

Iterator[None]
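The relationship between begin_gui_action, end_gui_action, and gui_action can be illustrated with a stand-in class. This is not EEGPrepSession: the Iterator[None] return type merely suggests a generator-based context manager, and both that reading and the (phase, action) argument order passed to listeners are assumptions.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


class DemoSession:
    """Stand-in that mirrors only the documented method names."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str, str], None]] = []

    def add_gui_action_listener(self, listener) -> None:
        self._listeners.append(listener)

    def remove_gui_action_listener(self, listener) -> None:
        self._listeners.remove(listener)

    def begin_gui_action(self, action: str) -> None:
        for listener in list(self._listeners):
            listener("start", action)  # assumed argument order

    def end_gui_action(self, action: str) -> None:
        for listener in list(self._listeners):
            listener("end", action)

    @contextmanager
    def gui_action(self, action: str) -> Iterator[None]:
        # Bracket the body with start/end notifications, even on error.
        self.begin_gui_action(action)
        try:
            yield
        finally:
            self.end_gui_action(action)


log = []
session = DemoSession()
session.add_gui_action_listener(lambda phase, action: log.append((phase, action)))
with session.gui_action("filter"):
    log.append(("body", "filter"))
print(log)
```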

echo_command(command)

Display a GUI command without mutating session history.

Parameters:

command (str | None)

Return type:

None

notify_changed()

Notify listeners that session-backed state changed.

Return type:

None

current_eeg()

Return the current EEG selection.

Return type:

dict[str, Any] | list[dict[str, Any]]

current_set_value()

Return EEGLAB-style CURRENTSET scalar/list value.

Return type:

int | list[int]

selected_dataset_indices()

Return the selected EEGLAB-facing dataset indices in order.

Return type:

list[int]

store_current(eeg, *, new=False, command='', mark_saved=False, index=None)

Store eeg in ALLEEG and select it.

Return type:

int | list[int]

retrieve(indices)

Select dataset(s) from ALLEEG using 1-based indices.

Parameters:

indices (int | list[int])

Return type:

dict[str, Any] | list[dict[str, Any]]

apply_workspace_state(*, eeg=<object object>, alleeg=<object object>, currentset=<object object>, allcom=<object object>, lastcom=<object object>, study=<object object>, currentstudy=<object object>, command='', append_dataset_history=False)

Apply a GUI/console workspace update as one session transaction.

Parameters:
  • eeg (Any)

  • alleeg (Any)

  • currentset (Any)

  • allcom (Any)

  • lastcom (Any)

  • study (Any)

  • currentstudy (Any)

  • command (str)

  • append_dataset_history (bool)

Return type:

None

delete_current()

Delete the current dataset selection from memory.

Return type:

None

clear_all()

Clear all datasets and study state.

Return type:

None

set_study(study, alleeg=None, *, command='')

Set STUDY/CURRENTSTUDY and optionally replace loaded datasets.

Return type:

None

select_study(*, command='CURRENTSTUDY = 1')

Select the current STUDY set in the shared workspace.

Parameters:

command (str)

Return type:

None

add_history(command, *, notify=True)

Append an EEGLAB-style command to session history.

Parameters:
  • command (str | None)

  • notify (bool)

Return type:

None

clear_history(*, notify=True)

Clear command history and LASTCOM as one session mutation.

Parameters:

notify (bool)

Return type:

None

remove_history(count, *, notify=True)

Remove the most recent count command-history entries.

Return type:

None

history_command_at(index)

Return the command at the given 1-based index, counting from the most recent history entry.

Parameters:

index (int)

Return type:

str

clear_last_command(*, notify=True)

Clear LASTCOM without deleting ALLCOM.

Parameters:

notify (bool)

Return type:

None

mark_current_saved()

Mark the current dataset selection as saved in EEG and ALLEEG.

Return type:

None

menu_statuses()

Return EEGLAB-style menu status tokens for the current state.

Return type:

set[str]

dataset_summaries()

Return (index, label, selected) tuples for the Datasets menu.

Return type:

list[tuple[int, str, bool]]

clone_current()

Return a deep copy of the current EEG selection.

Return type:

dict[str, Any] | list[dict[str, Any]]

class eegprep.EEGPrepConsoleWorkspace(session, *, window=None, refresh=None, command_echo=None, exports=None, extension_runtime=None)

Bases: object

Synchronize an IPython namespace with an EEGPrepSession.

Parameters:
  • session (EEGPrepSession)

  • window (Any | None)

  • refresh (Callable[[], None] | None)

  • command_echo (Callable[[str], Any] | None)

  • exports (Mapping[str, Any] | None)

  • extension_runtime (ExtensionRuntime | None)

close()

Detach this workspace from session notifications.

Return type:

None

pull_from_session()

Mirror session state into the console namespace.

Return type:

None

after_execute(source, *, success=True)

Push console-side workspace edits back into the session.

Return type:

None

accept_pop_result(result, args, kwargs=None)

Store a pop_* result in the current session when appropriate.

Return type:

Any

pop_wrapper(name)

Return the console-aware wrapper for a public pop_* function.

Parameters:

name (str)

Return type:

ConsolePopFunction

execute_history_command(command)

Execute an EEGLAB history command through the console namespace.

Parameters:

command (str)

Return type:

None

eegprep.plugin_menu(pluginlist=None, *, parent=None, session=None, show=True, registry=None, catalog=None, catalog_path=None, include_bundled=True, include_entry_points=True, disabled_extensions=None)

Show or return the EEGPrep Extension Manager inventory.

Parameters:
  • pluginlist (list[dict[str, Any]] | tuple[dict[str, Any], ...] | None) – Optional extension inventory to display. Defaults to the extension registry merged with the curated metadata catalog.

  • parent (Any | None) – Optional Qt parent widget for the dialog.

  • session (Any | None) – Optional EEGPrepSession; its PLUGINLIST mirror is updated with the displayed inventory.

  • show (bool) – Show the Qt dialog when True. Use False for scripts, examples, tests, or console inventory checks.

  • registry (ExtensionRegistry | None) – Optional discovered registry for tests or explicit control.

  • catalog (ExtensionCatalog | None) – Optional loaded catalog. Defaults to the packaged/local catalog.

  • catalog_path (str | None) – Optional JSON catalog path.

  • include_bundled (bool) – Include bundled EEGPrep plugin ports in default discovery.

  • include_entry_points (bool) – Include installed entry-point extensions in default discovery.

  • disabled_extensions (set[str] | list[str] | tuple[str, ...] | None) – Registry names to mark disabled during default discovery.

Returns:

The normalized extension inventory as a mutable list of dictionaries. Records include install/update command strings but never execute them.

Return type:

list[dict[str, Any]]

eegprep.plugin_status(pluginname, *, exactmatch=False, pluginlist=None, registry=None, catalog=None, catalog_path=None, include_bundled=True, include_entry_points=True, disabled_extensions=None)

Return EEGLAB-style installed status for EEGPrep extensions.

Parameters:
  • pluginname (str) – Plugin or extension name, package name, or substring to search.

  • exactmatch (bool) – Require exact case-insensitive name matching.

  • pluginlist (list[dict[str, Any]] | tuple[dict[str, Any], ...] | None) – Optional precomputed extension inventory. Defaults to the registry plus the curated catalog.

  • registry (ExtensionRegistry | None) – Optional discovered registry for tests or callers that need explicit discovery control.

  • catalog (ExtensionCatalog | None) – Optional loaded catalog. Defaults to the packaged/local catalog.

  • catalog_path (str | None) – Optional JSON catalog path.

  • include_bundled (bool) – Include bundled EEGPrep plugin ports in default discovery.

  • include_entry_points (bool) – Include installed entry-point extensions in default discovery.

  • disabled_extensions (set[str] | list[str] | tuple[str, ...] | None) – Registry names to mark disabled during default discovery.

Returns:

A tuple (status, names, pluginstruct) where status values are 1 for active installed/bundled extensions and 0 for curated-only, disabled, incompatible, failed, or missing-dependency matches.

Return type:

tuple[list[int], list[str], list[dict[str, Any]]]
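The returned tuple holds parallel lists, so the status flags can be paired with the names to find active extensions. The helper name and the sample values below are made up for illustration; only the tuple layout and the meaning of 1 and 0 come from the description above.

```python
def active_extensions(status, names):
    """Return the names whose status flag is 1 (active installed/bundled)."""
    return [name for flag, name in zip(status, names) if flag == 1]


# Made-up stand-in for a (status, names, pluginstruct) result.
status, names, pluginstruct = (
    [1, 0],
    ["example-active", "example-disabled"],
    [{"name": "example-active"}, {"name": "example-disabled"}],
)
print(active_extensions(status, names))
```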

Object-Oriented Interface#

class eegprep.EEGobj(EEG_or_path)

Bases: object

Wrapper class for EEG datasets stored as dictionaries.

Provides attribute access to EEG fields and method calls to eegprep functions.

__init__(EEG_or_path)

Initialize from an EEG dict or a file path string.

  • If string: loads dataset with pop_loadset(path).

  • If dict: uses it directly.

__getattr__(name)

Access EEG fields or eegprep functions.

  • If ‘name’ is a key in EEG, return EEG[name] (convenience).

  • If ‘name’ resolves to a function in eegprep, return a wrapper that sets self.EEG = func(deepcopy(self.EEG), …) and returns the updated EEG for convenience.

  • Otherwise raise AttributeError so field-name typos fail fast instead of silently returning a no-op callable.
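The three lookup rules above can be mimicked with a toy wrapper. This is an illustration of the dispatch logic only, not the EEGobj implementation: the class, the FUNCTIONS table standing in for the eegprep namespace, and the rename_set function are all hypothetical.

```python
from copy import deepcopy


def rename_set(EEG, setname):
    """Hypothetical function that takes and returns an EEG dict."""
    EEG["setname"] = setname
    return EEG


FUNCTIONS = {"rename_set": rename_set}  # stand-in for the eegprep namespace


class DemoEEGobj:
    def __init__(self, EEG):
        # Write to __dict__ directly so the lookup below stays simple.
        self.__dict__["EEG"] = EEG

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        EEG = self.__dict__["EEG"]
        if name in EEG:
            return EEG[name]                      # rule 1: field access
        if name in FUNCTIONS:
            func = FUNCTIONS[name]

            def wrapper(*args, **kwargs):         # rule 2: function wrapper
                current = self.__dict__["EEG"]
                self.__dict__["EEG"] = func(deepcopy(current), *args, **kwargs)
                return self.__dict__["EEG"]

            return wrapper
        raise AttributeError(name)                # rule 3: fail fast on typos


obj = DemoEEGobj({"srate": 250, "setname": "raw"})
print(obj.srate)
print(obj.rename_set("cleaned")["setname"])
```

Because the wrapper passes a deep copy to the function, a dictionary held elsewhere is not modified by the call.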

__setattr__(name, value)

Set attributes on the underlying EEG dict when possible, else on the wrapper.

__repr__()

Multi-line, MNE-like summary of the EEG object.

Shows key metadata, data shape, sampling info, time span, and brief events/channels info.

__str__()

Multi-line, MNE-like summary of the EEG object.

Shows key metadata, data shape, sampling info, time span, and brief events/channels info.