IBM to Open Source Conceptual Search
Posted by: Susan B. Shor August 8, 2005 01:22 PM
IBM Research will turn over its data search technology to the open source community, the company said today. The Unstructured Information Management Architecture searches stored data not through keywords, but by analyzing the data within documents to see whether they fit the concepts and facts the user is researching. It will be made available through SourceForge, a repository for open-source code, by the end of the year, IBM said.
Thanks for the interesting article. Once again IBM is giving us a great vision about the future and how unstructured information can be searched. InfoCodex already does all this today with the help of a linguistic database and synonym and/or similarity search across five languages (German, French, Italian, English and Spanish). With InfoCodex you can search for a block of text in one language and it will find all the similar documents in the other languages as well. All of this is done without a single minute of training, because the linguistic database contains 2.9 million words and terms (e.g. "European Court of Justice" or "The President of the United States" are terms and are recognized as such).
See the following links:
http://www.ywesee.com/pmwiki.php/Ywesee/InfoCodexProcedure
http://www.ywesee.com/uploads/Ywesee/archimag-e.pdf
http://www.ywesee.com/uploads/Ywesee/Evaluationsentscheid-e.pdf
http://www.ywesee.com/uploads/Main/USP_e.pdf