Is it ok to remove noisy data?



Here in North America, we're moving into the colder months of the year. While some of us will brave the elements this season and run cold-weather surveys, many of us will be indoors looking through the data we collected over the summer. So it’s a good time to talk about the kind of work that's done off the field, like data inversion. In this post, we want to discuss data filtering in particular.


Over the course of our 30 years in business, we’ve heard differing opinions on the matter. We've had customers who didn’t feel comfortable removing any data—even noisy data. However, there are those who believe it’s ok. We tend to agree with the latter, but with caveats. 


And by the way, you can find a lot of this information (including step-by-step tutorials) in the AGI Help Desk.



What kind of data is ok to remove and why?

First and foremost, it has to be noisy data that you’re removing. "Noisy" does not mean data that you simply don’t like; removing data for that reason is indefensible.


So you may be asking, “Why is it ok to remove any data? Won’t it affect my model’s integrity?” These are completely understandable questions that we often hear. So let’s take a look at why we believe it’s ok to remove data, and then we'll discuss the right and wrong ways to do it. 


When doing an Electrical Resistivity survey, each measurement you take is an integrated response from the entire half-space or full-space. As such, every measurement carries information about that entire space. Data redundancy, which comes from over-sampling, ensures that removing some data is not fatal to the model's resolution.


The right way to remove data:

There’s a right way and a wrong way to remove data. The acceptable way is to do it procedurally: filter under strict guidelines and with a clear goal (more on that below). You can achieve this by using inversion software with a filtering tool rather than picking points by hand. The majority of our customers filter their data using EarthImager™ 2D or EarthImager™ 3D. Both versions of the software come with a Data Misfit Histogram tool for data cleanup (seen below).


You can try the Data Misfit Histogram tool for yourself using our free demo of EarthImager™ 2D or EarthImager™ 3D.

Above: The AGI Data Misfit Histogram Tool in EarthImager™ 2D
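
To make the idea of a misfit histogram concrete, here is a minimal Python sketch. It is not EarthImager's implementation: the function names, the 10% bin width, and the definition of relative misfit (absolute difference between predicted and measured values, as a percentage of the measured value) are all assumptions chosen for illustration.

```python
import math


def relative_misfit_percent(measured, predicted):
    """Per-datum relative misfit in percent (assumed definition).

    Assumes no measured value is zero.
    """
    return [abs(p - m) * 100.0 / abs(m) for m, p in zip(measured, predicted)]


def misfit_histogram(misfits, bin_width=10.0):
    """Count data points per misfit bin; keys are the lower bin edges."""
    counts = {}
    for value in misfits:
        edge = math.floor(value / bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


def indices_above(misfits, threshold):
    """Indices of data points whose misfit exceeds the chosen threshold."""
    return [i for i, value in enumerate(misfits) if value > threshold]


# Hypothetical values, purely for illustration.
measured = [100.0, 100.0, 100.0, 100.0]
predicted = [101.0, 95.0, 150.0, 100.0]
misfits = relative_misfit_percent(measured, predicted)
histogram = misfit_histogram(misfits)
candidates = indices_above(misfits, threshold=20.0)
```

The point of the histogram is that the removal decision comes from a single threshold applied to every data point, not from picking individual points by eye.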


As long as you follow these 3 guidelines, filtering is quite easy:

  1. For most data sets, you’ll be able to remove up to 20% of the total and still have acceptable inversions. Keep this limit in mind while filtering.

  2. Do not remove all of your noisy data at once. Instead, work in iterations.

  3. After each inversion, remove no more than 5% of the data. Keep filtering and inverting until the RMS error drops to ≤10% or you reach the 20% data removal limit.
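
The three guidelines above amount to a simple loop, sketched below in Python. This is an illustration under stated assumptions, not how EarthImager works internally: `invert` is a hypothetical callable standing in for a real inversion, `toy_invert` is a deliberately crude stand-in (it predicts the median everywhere), RMS is assumed to be the root-mean-square of relative misfits in percent, and the 5% step is taken relative to the original data count.

```python
import math
import statistics


def rms_percent(measured, predicted):
    """RMS of relative misfits, in percent (assumed definition)."""
    squares = [((p - m) / m) ** 2 for m, p in zip(measured, predicted)]
    return 100.0 * math.sqrt(sum(squares) / len(squares))


def iterative_filter(measured, invert, target_rms=10.0,
                     max_total_frac=0.20, max_step_frac=0.05):
    """Remove the worst-fitting data a little at a time.

    Stops when RMS <= target_rms or the total removal limit is reached.
    Returns (indices of retained data, final RMS in percent).
    """
    total = len(measured)
    keep = list(range(total))
    while True:
        data = [measured[i] for i in keep]
        predicted = invert(data)  # hypothetical inversion + forward model
        rms = rms_percent(data, predicted)

        # How many points may still be removed overall, and in this pass.
        budget = math.floor(max_total_frac * total + 1e-9) - (total - len(keep))
        step = min(budget, math.floor(max_step_frac * total + 1e-9))
        if rms <= target_rms or step < 1:
            return keep, rms

        # Rank retained data by relative misfit, worst first.
        misfit = [abs(p - m) / abs(m) for m, p in zip(data, predicted)]
        worst = sorted(range(len(data)), key=lambda j: misfit[j], reverse=True)
        drop = {keep[j] for j in worst[:step]}
        keep = [i for i in keep if i not in drop]


def toy_invert(data):
    """Crude stand-in for an inversion: predicts the median everywhere."""
    return [statistics.median(data)] * len(data)


# Hypothetical data set: 40 readings, 4 of them noisy.
readings = [100.0] * 40
for noisy_index in (3, 11, 20, 33):
    readings[noisy_index] = 500.0
kept, final_rms = iterative_filter(readings, toy_invert)
```

Re-running the inversion after each small removal matters because the misfit of every remaining point changes once the worst outliers stop pulling on the model.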


The wrong way to remove data:

As mentioned above, hand-picking data points is not a defensible practice, regardless of whether the data is good or noisy. Another bad practice is to remove a large number of data points, even if done procedurally. Massive data removal will cause poor resolution in any area of your model left without data coverage.



Remember, you should have some wiggle room thanks to the redundant data that comes from oversampling. In fact, we often recommend that our customers run at least a Dipole-Dipole with a Strong Gradient array so that they have more than enough data.


Let’s take a look at a data set before filtering:
AGI Blog - Data Example Before Filtering


And now, after filtering:

AGI Blog - Data Example After Filtering

In these examples (taken from a previous webinar), we used the 3 guidelines above to bring the RMS error down from 47% to about 5%. As you can see, the model still has good resolution and would be acceptable for peer review. Filtering the right way is simple to accomplish, and we encourage you to download a demo of EarthImager™ and try it for yourself!

