New methodologies developed by Ben-Gurion University of the Negev researchers will enable search engines to overcome the complexities of name identification, resulting in significantly more accurate online people searches.
“Today, searching for a person by name is a routine online activity, yet the search results are in many cases incomplete or even misleading due to variations in name spelling,” says Dr. Michael Fire, a member of the BGU Department of Software and Information Systems Engineering (SISE) and the Data Science for Social Good Lab. “This is a significant problem both for companies that might be conducting searches for job applicants and for individuals who might want to search for a distant relative.”
Unlike a standard word such as 'ball', which is spelled only one way, names and shortened names (diminutives) can be spelled several ways, e.g. John/Jon or Debbie/Debby. Search engines try to identify such name similarities using string similarity algorithms, which in many cases perform poorly.
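To illustrate why plain string similarity can struggle with names, here is a minimal sketch using Levenshtein edit distance, one common string similarity measure. It is an assumption chosen for illustration; the article does not say which algorithms search engines actually use, and the function name `edit_distance` is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Spelling variants of the same name (examples from the article)
print(edit_distance("John", "Jon"))      # 1
print(edit_distance("Debbie", "Debby"))  # 2
# A different name that is just as "close" by this measure
print(edit_distance("John", "Joan"))     # 1
```

By this measure, John/Joan scores the same as John/Jon, and both score closer than Debbie/Debby, so a distance threshold alone cannot separate true variants from different names.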
To identify name variants more accurately, the researchers harnessed a dataset of 17 million people, with over 700,000 individual names and 500,000 unique surnames. The methods were tested on three cataloged datasets of first and last names, including tens of thousands of first and verified last names.
As part of their work, the researchers proposed a new representation of names that considers the way humans pronounce a name in a particular language and accent. “This innovative representation is very dynamic and allows you to identify names that sound similar, but are not necessarily written in the same way,” Dr. Fire says.
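The general idea of matching names by sound rather than spelling can be sketched with American Soundex, a long-established phonetic code. This is not the researchers' representation, which the article does not describe in detail. It is a much simpler, English-oriented stand-in, and the function name `soundex` is chosen here for illustration.

```python
# Consonant groups that share a Soundex digit
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()        # the first letter is kept as-is
    prev = CODES.get(letters[0], "")   # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue                   # h and w do not separate equal codes
        code = CODES.get(ch, "")       # vowels and y have no code
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets prev to ""
    return (result + "000")[:4]        # pad or truncate to four characters


print(soundex("John"), soundex("Jon"))      # J500 J500
print(soundex("Debbie"), soundex("Debby"))  # D100 D100
```

Names with different spellings but similar pronunciation collapse to the same key, so a search can compare keys instead of raw strings. Soundex is tuned to English and ignores accent, which is the kind of limitation a pronunciation-aware representation aims to address.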
“The impressive data obtained highlights the breakthrough and the huge potential in the methods proposed to make it easier to find people based on name variants,” says BGU researcher Dr. Rami Puzis. “We are creating a website that will be accessible to everyone and will allow people to be found using the algorithms we have developed.”