There is much chatter about “AI” nowadays with the emergence of “ChatGPT” and other “AI” flavors built into browsers. Microsoft and Google have stepped into the arena, as well as others. Each has its own version of these “AI” engines (aka models) that ingest questions and spit out answers based on all the data available to the engine.
Humans tend to want a quicker and easier path to what they need, so this news about the use of “AI” as effectively modernized “CliffsNotes” has especially shaken the learning and creative professional communities.
How do these mechanisms work, and what will this mean to the world of geospatial data and tech?
It’s all about the models
Artificial intelligence (AI) is essentially driven by invoking “models” created by humans that contain algorithms enabling them to “learn” and become “smarter”. These AI models can help humans understand whatever it is they want to know about more rapidly. The recent “AI” in the news features “all-consuming” capabilities, not unlike going to some sort of oracle and asking it to give you wisdom.
Our experience is that these “AI” tools are comprehensive and produce concise sets of results in human-readable paragraphs, which is different from web searching, which yields lists of leads to investigate.
ChatGPT, for instance, gives you answers in a list of paragraphs related to the search, and you will not know the sources it used. What we know from recent news is that it supposedly uses “everything on the internet” that is available to it. It does include warnings about information accuracy alongside its results.
Being data nerds, we are always skeptical about anything that depends on data to do its work. Why?
Because we know data are not perfect, are sometimes nefariously bad, and something that gives you answers based on poor or mixed-quality data can…well, it can give you poor answers, and it could give you risky or even downright dangerous ones.
If you are using these “AI” tools as shortcuts, be wary. Check the information by validating it through other means.
Just know that there is a lot of crazy data, and intentionally wrong data, being leveraged by these models, and we are not talking about small amounts of data! Further, there is no way to truly estimate the huge amount of data accessible to these “AI” tools (or “engines”, as they are also called).
What about geospatial data? Will these “AI” tools help improve the quality of location-based data?
We can certainly leverage “AI” models as tools to ingest geospatial data in order to validate it, and spit out the poor-quality data for further investigation. This is part of the forensic data analysis that can be done to validate or quality-assure location-based information.
Absolutely, with the correct modeling, “AI” can be used to sift through large amounts of data more efficiently than humans can manually. In fact, in the world of geospatial data use, we have been doing such work for many years.
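As a rough sketch of what that kind of sifting can look like in practice (the field names, thresholds, and rules below are hypothetical, not any particular product’s logic), a simple rule-based pass can split point records into a clean set and a suspect set held for further investigation:

```python
# Minimal sketch of rule-based quality screening for point records.
# Field names and validity rules are hypothetical, for illustration only.

def flag_suspect_points(records):
    """Split records into (clean, suspect) lists using simple validity rules."""
    clean, suspect = [], []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        problems = []
        if lat is None or lon is None:
            problems.append("missing coordinate")
        else:
            if not -90.0 <= lat <= 90.0:
                problems.append("latitude out of range")
            if not -180.0 <= lon <= 180.0:
                problems.append("longitude out of range")
            if lat == 0.0 and lon == 0.0:
                problems.append("null island (0,0) placeholder")
        if problems:
            suspect.append({**rec, "problems": problems})
        else:
            clean.append(rec)
    return clean, suspect

points = [
    {"id": 1, "lat": 29.76, "lon": -95.36},
    {"id": 2, "lat": 0.0, "lon": 0.0},
    {"id": 3, "lat": 123.4, "lon": -95.36},
]
good, bad = flag_suspect_points(points)
print(len(good), "clean;", len(bad), "held for further investigation")
```

A trained model can apply far subtler patterns than these hand-written rules, but the shape of the job, ingest, screen, and set aside the questionable records, is the same.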
Geophysical data, which are location-based, three-dimensional data, have been processed in various ways by geophysical software in a serious way since the 1970s, and likely earlier in some research and development areas.
We cut up the thick data, slicing and dicing it concisely, and make it into an edible meal.
We often hear the phrase “there is nothing new under the sun”, and at a high level, what today’s “AI” does essentially follows the design patterns programmers have long used to evaluate information by executing code. A coded algorithm, based on certain models, tells the computer what to look for using logic (usually Boolean, potentially with other additions), and munches through the data, yielding results that are easier for scientists to evaluate.
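In code, that “Boolean logic” is often nothing more exotic than a handful of predicates combined with and/or/not. A small illustrative sketch (the criteria themselves are made up for this example):

```python
# Sketch of Boolean predicates combined into one screening rule.
# The study area, year cutoff, and "flagged" field are invented for illustration.

def in_study_area(rec):
    return 25.0 <= rec["lat"] <= 35.0 and -100.0 <= rec["lon"] <= -90.0

def recent_enough(rec, min_year=2015):
    return rec["year"] >= min_year

def keep(rec):
    # Boolean combination: inside the area AND recent, but NOT flagged upstream.
    return in_study_area(rec) and recent_enough(rec) and not rec.get("flagged", False)

records = [
    {"lat": 29.8, "lon": -95.4, "year": 2021},
    {"lat": 29.8, "lon": -95.4, "year": 2010},
    {"lat": 48.9, "lon": 2.3, "year": 2021, "flagged": True},
]
print([keep(r) for r in records])  # [True, False, False]
```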
It is and is not rocket science.
In geospatial technology practices, we have been using models for evaluating data, analyzing it, and yielding results for many years now; decades, actually.
We’ve taken up training our models to recognize patterns and flag them, or push them into separate data sets for further work. We have begun to do this with very large amounts of spatial data, and these amounts keep increasing, restricted only by computing power: we throw more computing resources in, and we get faster results.
We haven’t typically called this “AI”, but essentially, any time we set up a series of processes that ingest data, evaluate or compute on the data, and yield a new result, we are “training” the algorithms to do this without our manual intervention. “Modeling a workflow” is typically what we call this activity.
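A modeled workflow in that sense can be as plain as a few chained steps that run end to end with no human in the loop. A hypothetical sketch (the in-memory CSV and column names stand in for whatever real source and schema a project uses):

```python
# Hypothetical sketch of a "modeled workflow": ingest -> evaluate -> yield,
# runnable end to end with no manual intervention between steps.
import csv
import io

RAW = """id,lat,lon,value
1,29.76,-95.36,4.2
2,,-95.36,3.1
3,29.80,-95.40,9.9
"""

def ingest(text):
    """Read raw rows (an in-memory CSV standing in for a real data source)."""
    return list(csv.DictReader(io.StringIO(text)))

def evaluate(rows):
    """Compute a simple derived attribute and drop rows that fail basic checks."""
    out = []
    for row in rows:
        if not row["lat"] or not row["lon"]:
            continue  # incomplete location: exclude from results
        row["value_scaled"] = float(row["value"]) * 10
        out.append(row)
    return out

def yield_result(rows):
    """Emit the new data set the rest of the workflow can consume."""
    return {"count": len(rows), "rows": rows}

result = yield_result(evaluate(ingest(RAW)))
print(result["count"], "rows passed through the workflow")
```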
Manipulating data, engineering data, forming and reforming data: it’s all under the roof of data science, where we use mathematics and information technology to engineer solutions that help humans get work done. Adding the location aspect to the data just gives a different dimension to the meaning and context. “AI” can potentially help refine these activities and make them more efficient.
Taking the advice of the “AI” provider to be wary of the results and understand that they can be inaccurate is a wise course. Keep in mind “garbage in/garbage out”, and that disinformation can look extremely assured and factual. Go forth with awe and caution.