EEG Preprocessing Steps with EEGLAB

Post by **elif.ozgur** » Sun Feb 23, 2025 1:28 pm

Hi everyone,

I am currently trying to analyze the EEG data that I collected for my master's thesis using EEGLAB, and I have A LOT of questions regarding each step of the way. If you could guide me in any of the steps, I would be so grateful.

My sampling rate is 2048 Hz, and I used a 64 Channel BioSemi cap (Active - 10/20).

What I want to do after preprocessing is run an FFT and look at the power at my frequencies of interest.

1. IMPORT --> I have BioSemi data (.bdf format), and when I import the raw data, EEGLAB warns that I should provide a reference channel, otherwise 40 dB of unnecessary noise will remain in the data. I have often read that I should just select a channel like Cz during raw data import (since I didn't place electrodes on the mastoids) and re-reference to the average later. Then, when I load the channel location file, EEGLAB shows the reference as "unknown", but when I scroll through the channels, I can see that Cz is flat.
Question: Is this what I should be doing?
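
To make the "re-reference to average later" part concrete, here is a minimal sketch in plain Python (not EEGLAB code) of what an average reference does arithmetically: at every time point, the mean across channels is subtracted from each channel. The function name `average_reference` and the toy numbers are made up for illustration; the flat last channel stands in for Cz after import.

```python
def average_reference(data):
    """Subtract the across-channel mean from every sample.

    data: list of channels, each a list of samples of equal length.
    Returns a new list of channels in the same layout.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    # Mean over channels, computed separately for each time point.
    means = [sum(ch[t] for ch in data) / n_channels for t in range(n_samples)]
    return [[ch[t] - means[t] for t in range(n_samples)] for ch in data]


# Toy data: 3 channels x 2 samples; the last channel is flat,
# like Cz when it was chosen as the reference at import (assumption).
toy = [[2.0, 4.0], [4.0, -1.0], [0.0, 0.0]]
reref = average_reference(toy)
```

After this step the channels sum to zero at every sample, and the formerly flat channel is no longer flat because it now carries the negative of the across-channel mean.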

2. FILTERING --> Most of the papers that I use as references only applied a high-pass filter (0.1 Hz Butterworth, 2nd/4th order) and don't really report any other artefact rejection steps before ICA, which I find extremely puzzling. So, I also want to apply a 20 Hz low-pass filter (rather conservative, as I really want to clean the data and am not interested in frequencies above that), and then downsample to 256 Hz to reduce computation time, considering how long ICA takes.
Question: Any comments?
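
For reference, the arithmetic behind this step can be sketched as follows, in plain Python rather than EEGLAB. It assumes the low-pass filter has already been applied, so simply keeping every n-th sample does not alias; `decimation_plan` and `naive_decimate` are hypothetical helper names, not functions from any toolbox.

```python
def decimation_plan(fs_in, fs_out, lowpass_hz):
    """Check that fs_in -> fs_out is an integer decimation and that the
    low-pass cutoff lies below the new Nyquist frequency."""
    if fs_in % fs_out != 0:
        raise ValueError("fs_out must divide fs_in for integer decimation")
    nyquist_out = fs_out / 2
    return {
        "factor": fs_in // fs_out,
        "nyquist_out": nyquist_out,
        "lowpass_below_nyquist": lowpass_hz < nyquist_out,
    }


def naive_decimate(samples, factor):
    """Keep every `factor`-th sample (only safe after low-pass filtering)."""
    return samples[::factor]


plan = decimation_plan(2048, 256, 20)
short = naive_decimate(list(range(16)), plan["factor"])
```

With the numbers from this post, the decimation factor is 8 and the new Nyquist frequency is 128 Hz, so a 20 Hz cutoff sits well below it.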

3. PREPARING DATA FOR ICA --> After downsampling comes artefact rejection. I know that there are several approaches:

a.) Run ICA on manually or automatically cleaned continuous data?
a.1.) If this is the suggestion, HOW? When I use the Clean Raw Data with ASR approach, it removes about 60% or more of the data, which puzzles me, because in the channel scroll it doesn't look THAT terrible. And if I try to do it manually, I am afraid I won't do a good job, as this is my first time properly analyzing EEG data; plus, it won't be replicable.

b.) Cut the data into dummy epochs (like 1-second epochs), set a threshold like ±100 µV or 3 SDs, and reject the epochs that exceed the threshold. And then, run ICA. However, what should the approach be here?
b.1.) Run ICA on 1-sec-epoched data, get the ICA weights and project them onto the raw, continuous data?
b.2.) Run ICA on 1-sec-epoched data, get the ICA weights and project them onto similarly preprocessed but not ICA'd data, OR
b.3.) Concatenate the epoched and cleaned data into the continuous form again, and then run ICA?
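
The dummy-epoch idea in b.) can be sketched like this for a single channel, in plain Python rather than EEGLAB. It assumes an absolute amplitude threshold in µV and simply drops any leftover samples that do not fill a whole epoch; `reject_epochs` is a hypothetical name, and the last line shows the "concatenate again" variant from b.3.

```python
def reject_epochs(signal, fs, epoch_sec=1.0, threshold_uv=100.0):
    """Split one channel into fixed-length epochs and drop those whose
    absolute amplitude exceeds the threshold.

    Returns (kept_epochs, rejected_indices).
    """
    n = int(round(fs * epoch_sec))   # samples per epoch
    n_epochs = len(signal) // n      # trailing partial epoch is ignored
    kept, rejected = [], []
    for i in range(n_epochs):
        epoch = signal[i * n:(i + 1) * n]
        if max(abs(x) for x in epoch) > threshold_uv:
            rejected.append(i)
        else:
            kept.append(epoch)
    return kept, rejected


# Toy channel at 4 Hz (unrealistic, just to keep the example readable):
# the second 1-second epoch contains a 150 µV excursion.
fs = 4
signal = [10, -20, 30, 5, 50, 150, -40, 0, -5, 15, 25, -35]
kept, rejected = reject_epochs(signal, fs)

# Variant b.3: glue the surviving epochs back into one continuous stream.
cleaned = [x for epoch in kept for x in epoch]
```

In real multi-channel data an epoch would typically be rejected if any channel crosses the threshold; the single-channel version only illustrates the logic.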

4. EPOCHING --> I had trials that lasted around 2.5 minutes, each made up of rhythmic patterns that always followed one another. However, the patterns are different (accented vs isochronous), so their trigger types are different (50 & 200 always together in one trial, OR 150 & 250 always together in one trial). Each pattern lasts 4.2 seconds.

I have two questions:

1.) I know that in neural entrainment studies people often loop the same pattern (for example, a pattern lasting 2.4 seconds looped 25 times) and then cut the data into 60-second epochs. However, as my event types change one right after the other, I have to(?) cut my data into epochs that run from exactly 0 to 4.2 seconds. Is this okay? I feel like it then becomes more like an ERP study than a neural entrainment study?

2.) I don't have a baseline! Because the patterns follow each other without breaks, there is no baseline period, right? I know that baseline correction is a huge step in plotting ERPs, so isn't it a problem if I can't do it?
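
One thing that differs between the 4.2-second and 60-second epochs from question 1 is the frequency resolution of the FFT, which is the sampling rate divided by the number of samples in the epoch. A small sketch in plain Python, assuming the data have been downsampled to 256 Hz as planned in step 2 (`fft_resolution` is a hypothetical helper name):

```python
def fft_resolution(fs, epoch_sec):
    """Return (samples per epoch, spacing between FFT bins in Hz).

    The sample count is rounded because fs * epoch_sec need not be an
    integer: 4.2 s at 256 Hz is 1075.2 samples.
    """
    n_samples = round(fs * epoch_sec)
    return n_samples, fs / n_samples


n_short, df_short = fft_resolution(256, 4.2)   # one 4.2 s pattern
n_long, df_long = fft_resolution(256, 60)      # a 60 s looped epoch
```

The short epoch gives bins roughly 0.24 Hz apart, while the 60-second epoch gives bins about 0.017 Hz apart, so frequencies of interest that lie close together are much harder to separate in the short epochs.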

I tried running the steps in different ways for one participant many times, and I always ended up with extremely noisy-looking ERP plots.

I would genuinely appreciate any help with any of these steps.

Thanks already!