Cognitive Tools Archives

Image Search – Find Similar Photos and Images with Noggle Photo Intelligence

January 31, 2018/in Cognitive Tools, Searching /by LvT

Our photo and image search features align with our efforts to use computer vision for knowledge retrieval. Users can find similar images in the blink of an eye via our integrated recommender engine. Just select an image shown in the results section, and Noggle will automatically pull up all related and similar images in near real time.

Making your own photo and image library searchable with Noggle is now as easy as pulling up images on public search engines.

Image search options

You can search for similar or related images in the following ways:

Image search via external image drag and drop
Simply drag and drop an image file (.jpg/.png) from your file explorer or email inbox onto the Noggle app search bar. The Noggle app will look for similar photos and images and show them in the Noggle search results.
Image search via file explorer context menu
Use the Windows File Explorer context menu for image and photo search. Right-click the image to open the context menu, select “Open with…” and choose “Desktop & Cloud search” to activate Noggle similarity photo intelligence. The Noggle app will open and show similar and related images.
Image search via Noggle client
Do a text search within the Noggle client application: just enter your search term and Noggle will check whether there are images related to it. Once an image is found and shown in the search results, related and similar images will automatically be shown in the “Related” info section of the search window.

Safe and secure

Images are indexed locally, and the index is stored on your client.

Location Tree Map Clustering

November 26, 2017/in Cognitive Tools, KnowledgeBase /by LvT

The “Location” tab shows folder-based treemaps for space-constrained visualization of relevant file hierarchies.

The size of a cluster shows the relative count of files in that folder that match the search term. The color indicates the relevance of the found documents to the search term. Clicking or selecting a cluster on the map automatically filters the overall search results list.
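
The sizing and coloring rule can be sketched in a few lines of Python. This is only an illustration of the idea, not Noggle's implementation: the `hits` list, its `(path, relevance)` shape, the relevance range of 0 to 1 and the function name `folder_clusters` are all assumptions made for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def folder_clusters(hits):
    """Group search hits by parent folder.

    hits: iterable of (file_path, relevance) pairs, relevance in [0, 1].
    Returns {folder: (share_of_hits, mean_relevance)}. The share would
    drive the tile size, the mean relevance the tile color.
    """
    scores = defaultdict(list)
    for path, relevance in hits:
        scores[str(PurePosixPath(path).parent)].append(relevance)
    total = sum(len(values) for values in scores.values())
    return {
        folder: (len(values) / total, sum(values) / len(values))
        for folder, values in scores.items()
    }

# Hypothetical search hits for illustration.
hits = [
    ("/docs/security/policy.pdf", 0.9),
    ("/docs/security/audit.docx", 0.7),
    ("/docs/hr/handbook.pdf", 0.2),
    ("/archive/2015/notes.txt", 0.4),
]
clusters = folder_clusters(hits)
```

Here `/docs/security` holds half of all hits, so it would get the largest tile, and its higher mean relevance would give it the strongest color.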

Internet of Things – Trend Research 2017

February 3, 2017/in Cognitive Tools, KnowledgeBase /by LvT

Internet of Things – Cognitive Clusters

Technology moves fast, and when predicting the future, it can be hard to keep up. Here at Noggle, we believe in analyzing what’s happening right now to get a more accurate gauge of what will realistically come into being over the next few months and years.

To do this, where better to look for the ideas of the future than in the worldwide patents database? Examining the concepts that have been submitted and protected now gives a strong indication of where technology is heading and what innovations are taking place. Of course, not all inventions are created equal, and many patents won’t last the course and make it into our collective future consciousness and culture. This is why we have produced a broad overview of recent patents and picked up on recurring and common aspects and topics. By detecting clusters and averages of prevalent and frequently appearing themes, our findings represent a more likely look at the ideas that may be entering and shaping our lives in the not too distant future.

What’s next for IoT?

We used this approach to analyse upcoming trends in the field of the Internet of Things (IoT) for this year, and beyond. Below, you’ll see a cognitive map which depicts popular phrases, ideas and subjects for IoT in the current patents database.

IoT Trends Map Internet of Things – Bibliometrics

As you can see, proposals centre around “machine to machine” usage – which is the largest cluster in the centre of the map. So IoT is not for humans – IoT will be used by machines and will bring a new generation of robots to life. These robots may be communicating with lots of machines to make decisions that humans are not able to. That’s the core of IoT – IoT is the ‘brain’ for the age of machines. Compare that to our nervous system where synapses pass chemical signals between neurons. IoT are the synapses of machines. This isn’t science fiction anymore; the patent clusters prove it today.

Move forward in the cognitive map and, not far away, you find the cluster “controlling a vehicle”. So the most expected use of IoT, as anticipated, is vehicles (e.g. cars, ships, airplanes) controlled by machines communicating with each other. Think of a “traffic robot”: instead of a police officer managing the flow of cars on the road, the cars communicate with each other and feed a central machine that decides on speeds and stops. Expect this kind of smart traffic management to show up in all areas during the next 1-3 years.

We can then group these items together further. In the following visual, you’ll see how these clusters relate to each other with color highlighting.

IoT Trend Clusters Grouped

By connecting these subjects, we can clearly see that immediate technological growth is also focused on powering and monitoring our living spaces using intelligent switches and sensors for lights and water flows.

Discover trends at a glance

This graphic visualisation of submission patterns allows us to identify the following key trends for IoT:

1. Machine-To-Machine – Next generation robots:
Named clusters for ‘M2M’ with communicating ‘nodes’, including intelligent and connecting ‘chips’. Watch out for new robot-like assistants, e.g. Alexa and Siri communicating with lots of devices in the years to come.

2. Vehicles – Say “Hello” to Smart Traffic:
Named clusters include ‘monitored vehicle’, ‘vehicle terminal’, ‘controlling a vehicle’ and ‘detects a vehicle’. So you won’t drive alone anymore.

3. Lighting & Switch management – The Smart Home becoming the new normal:
Intelligent devices used to control lighting, with clusters such as ‘light sensor’, ‘lighting modules’, ‘lighting circuit’, ‘intelligent lighting’, ‘lamp light’ and phrases such as ‘smart home’, ‘smart home services’ and ‘intelligent switches.’

4. Water management:
Clusters show interest in areas of ‘intelligent water’, ‘water pipe’, ‘controls water’ and ‘equipped with the water.’

5. Data explosion in production processes:
Automation looks set to gain momentum in 2017, with clusters outlining ‘production data’, ‘production process’ and ‘production costs.’

Finally, we can expect to see massive growth in machine-to-machine data communication, with new cloud-based management and integration platforms for IoT coming into real-world application in 2017 and an associated influx of new software for managing IoT devices in production processes.

Creating cognitive clusters

Already, our smart search and cognitive clustering has provided us with a valuable insight into the ideas that are being developed and registered in the world of IoT, some of which will eventually come to fruition and affect our daily lives. But how did we do it?

Any Noggle user can create similar intelligently curated trend diagrams. On this occasion, we simply issued a search request to the worldwide patent database on our chosen subject, the Internet of Things. (A search could also be run on one of Noggle’s other third-party databases, such as our comprehensive listing of TED Talks or open-access science articles and research journals.)

We used a very generic and broad request in this example; you could also start with more specific terms to get more specific clusters. Our request produced over 6,000 patent results, on which we then ran the ‘clustering’ algorithm: our artificial intelligence text processing automatically scans all of these search responses for similarities, in a manner which is not biased towards any specific clustering output. Within 20 seconds, we could see common ideas organised together in a way that is both manageable and interesting to evaluate. From a conceptual view, it is a way of bringing the approach of scientometrics to real life on every desktop. Scientometrics is no longer a theoretical topic for experts with huge machine power; everyone can start doing trend research today.

But the map is more than just an image – we can delve deeper into each cluster to examine the individual patents that have been grouped together. To explore the collection for yourself, Noggle clients can browse the IoT cognitive cluster map discussed within this article at: https://www.noggle.online/knowledgemap/internet-of-things-trends-to-watch-2017/

Increase your knowledge with cognitive features from Noggle.

Noggle knows that you want to spend your precious time learning and creating value – not searching for files. Our system creates a secure peer network that syncs and connects disparate locations, so that locating and retrieving files is now the work of moments, not minutes. Search and share new depths of content, and easily collaborate on inspiring ways of working. To see how Noggle could make a difference to the way that you manage the files that matter, install our free trial.

Watch the making-of in this 100-second screen recording:

Or check out the final infographic based on this research:

IoT Trends Infographic

Download material

Pictures, references and .pdf versions for download, sharing and further research

Infographic – IoT

February 3, 2017/in Cognitive Tools /by LvT

Article and making of available here: https://www.noggle.online/IoT

IoT Trends Infographic

Download material

Pictures, references and .pdf versions for download, sharing and further research


How-To: Maps Of The World’s Digital Knowledge

January 11, 2017/in Cognitive Tools, Intro, KnowledgeBase /by LvT

Digital Knowledge

If we could join the dots between all the research articles that have been published digitally, what would happen?

Academics have already suggested that if we could only make the right connections between all the pieces of digital knowledge already available, we could tackle the most pressing questions facing society.

It is not about generating more and more content. It is about connecting the dots between pieces of material already in existence. The problem is not having digital knowledge “somewhere”. Our problem is retrieving knowledge when we need it.

Examples are already available of experts identifying key information about diseases like Alzheimer’s by data mining relevant literature. [1]

Stop talking! Let’s start putting Big Data and Text Mining into practice

You’ve already come across the buzzwords ‘big data’ and ‘text mining,’ right? But do you have this technology on your desktop ready to use? I bet not. At least, only a few of us will.

We can change that. I’ve started to produce an application which brings text mining, indexing and cognitive clustering right to your fingertips. I’ve kept it simple. Its “Google-like” interface with cognitive technologies can analyze private and deep-web content sources.

The technology is just one side of the story. The other side is the ability to analyze personal, private knowledge sources as well as external content libraries.

We need access to an entire body of knowledge, across all content sources

Nowadays, we want easy access to an entire body of domain knowledge stored digitally. Unfortunately, external publishers’ collections represent knowledge silos, because nobody wants to blend together different publishers.

So let’s move forward and join the dots.

The goal is to solve the following two problems:

A. Unify the search experience to different content sources

The good news is that many publishers nowadays offer open access to their content via APIs. The bad news is that they all look different.

I want to unify these different technical API access points. This will bring all content into one simplified search front-end to enable a search for term-based knowledge domains. Be it Patents, Open Access Science, IEEE or TEDTalks or …
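
One way to picture this unification is a small adapter per source that maps its response into a common hit format. The sketch below is purely illustrative: the `Hit` fields, the adapter names, the response shapes and the example.org URLs are invented for the example and do not describe any real publisher API or the Noggle code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Hit:
    """Common result format shared by all sources (assumed fields)."""
    source: str
    title: str
    url: str

# Each adapter turns one source's raw response into Hit objects.
# The raw shapes below are invented for illustration only.
def from_patents(raw: dict) -> List[Hit]:
    return [Hit("patents", r["invention_title"], r["link"]) for r in raw["results"]]

def from_talks(raw: list) -> List[Hit]:
    return [Hit("talks", r["name"], r["href"]) for r in raw]

def unified_search(term: str,
                   fetchers: Dict[str, Callable[[str], object]],
                   adapters: Dict[str, Callable[[object], List[Hit]]]) -> List[Hit]:
    """Query every source and merge the normalised hits into one list."""
    hits: List[Hit] = []
    for name, fetch in fetchers.items():
        hits.extend(adapters[name](fetch(term)))
    return hits

# Stub fetchers stand in for real HTTP calls.
fetchers = {
    "patents": lambda term: {"results": [
        {"invention_title": f"{term} sensor", "link": "https://example.org/p/1"}]},
    "talks": lambda term: [
        {"name": f"The future of {term}", "href": "https://example.org/t/1"}],
}
adapters = {"patents": from_patents, "talks": from_talks}
results = unified_search("drone", fetchers, adapters)
```

Adding another content source then means writing one more fetcher and one more adapter; the search front-end only ever sees `Hit` objects.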

Together with private knowledge sources from different storage locations, this would create a cool search experience.

B. Provide cognitive guided visual maps to explore the results

Often, so much content is returned in response to generic search terms that we end up having to browse endless listings. But browsing linear listings is not a way of learning that our brain can manage efficiently. Confronting hit after hit in a never-ending list is not how our brain works best. Our brain works more associatively. We need different ways to visualize search results.

Forget about boring search result listings – use KnowledgeMaps

What would you think of a search result visualization tool that provides essential information about the structure of topics within the search results? Let’s call it a “KnowledgeMap” of similar terms in the documents from your initial search.

And it does so in a visual way like how our brain works – not with pure “listings.”

It looks like this: A clustering algorithm scans internal relations and linguistic patterns among documents according to how similar they are to the initial search request. Then it presents you with a visual map of these clusters and documents. You can now unearth new groups or cross-document relationships, which might guide you to new, interesting areas that build upon the initial search request.
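
To make the idea concrete, here is a deliberately tiny stand-in for such a clustering step. It is not the Noggle algorithm: it simply treats every term shared by at least `min_docs` hits as a cluster label and lets a document appear in several clusters. The stopword list, the function names and the sample documents are assumptions for illustration.

```python
import re
from collections import Counter

# Small assumed stopword list for the example.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "the", "to", "with"}

def tokens(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def label_clusters(docs, query, min_docs=2):
    """Return {label: [doc indices]} for terms shared by at least min_docs documents.

    Query terms are skipped because every hit contains them anyway.
    A document may land in several clusters.
    """
    skip = set(tokens(query))
    doc_terms = [set(tokens(d)) - skip for d in docs]
    doc_freq = Counter(t for terms in doc_terms for t in terms)
    return {
        term: [i for i, terms in enumerate(doc_terms) if term in terms]
        for term, n in doc_freq.items()
        if n >= min_docs
    }

# Hypothetical search hits for the query "drone".
docs = [
    "Drone delivery of parcels",
    "Parcel delivery by drone with a winch",
    "Drone camera stabilisation",
    "Camera gimbal for a drone",
]
clusters = label_clusters(docs, query="drone")
```

With these four sample hits, the documents fall into a “delivery” group and a “camera” group, which is the kind of structure a map would then draw.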

Example 1: TED Talks – Predictions and future projections

The following infographic-like knowledge map was created by searching TED Talks for future projections and predictions. This allows you to browse 500 TED Talk predictions in clusters like “Future Energy,” “Social Change,” “Education,” or “Medical Research”:

TED Talks

Picture 1: KnowledgeMap of Clusters for TED Talks Predictions [2]

The Technology cluster includes a talk by Nicholas Negroponte, “5 predictions, from 1984.”
The Social Change cluster ranks the talk “The future of money,” about how cryptocurrencies will change the banking landscape, as #1.
The Politics cluster contains the top ranked talk “A prediction for the future of Iran,” which is based on a mathematical analysis for predicting human events.

These clusters have been auto-generated based on cognitive analysis and each talk is listed in the Noggle KnowledgeMap browser with a link to the original TED website.

Isn’t it a beautiful map of 500 predictions from the world’s most inspiring leaders?

Example 2: How drones are changing our lives

While there is a lot of “social noise” around the “airborne fulfillment centers” patent from Amazon [3], it is not the only patent in that area. But it will be hard to find the others by just scrolling through large listings in patent databases.

The following infographic-like knowledge map was created by searching the US patent library for “UAV and drone” with additional cognitive clustering based on the search result, which contained over 350 patents.

The KnowledgeMap spotted a cluster with 20 patents on the subject of “delivery” via drones!

Picture 2: KnowledgeMap of Clusters of US Patents on UAVs and Drones [4]

You can now unearth new groups or cross-document relationships, which may guide you to new, interesting areas that build upon the initial search request.

In milliseconds, thousands of documents located for the initial request are analyzed to build cognitive guided clusters. In addition, a new visual search experience is created by using KnowledgeMaps to present and browse the retrieved documents.

Generate stunning knowledge maps on your own

Now the fast and final part of the story: Let’s connect the unified search to access different content sources with the stunning KnowledgeMap feature: There it is, right at your fingertips… Generate stunning maps of the world’s digital knowledge by yourself.

Whether it’s patents, inspiring TED Talks, or open-access science articles—it is all now in front of you. Now you can discover links that could help us tackle whatever question or issue you can think of, in whatever area you choose.

Final 4 How-To steps to do it on your own

1. Browse available KnowledgeMaps at public.knowledgemaps.online

2. Download and install the free application via noggle.online

3. Create or select libraries to execute a research request

4. Generate cognitive clusters and maps on your own

The creative potential of this technology offers new ways of research and digital knowledge discovery. The maps you produce are open to share, so you can publish them and share them with your teams.

Now, share the news and let’s start to make the right connections between all the pieces of information to tackle the most pressing questions facing society!

Happy Knowledge Mapping!

Yours, Lars von Thienen

Document Recommendation – Cognitive-guided Knowledge Retrieval

June 18, 2016/in Cognitive Tools, KnowledgeBase /by LvT

Document Recommendation

The task of document recommendation to knowledge workers differs from the task of recommending products to consumers.

Collaborative approaches, as applied to books or videos, attempt to communicate patterns of shared interest to augment conventional search results. However, these approaches have well-known problems: subtle variations in search context can undermine the effectiveness of collaborative filtering.

For information seeking, what seems to be required is a recommendation system that takes into account both the user’s query and certain cognitive features from the context. Being able to leverage existing taxonomies and inter-document and inter-library relationships helps to recommend related and similar documents.

The Noggle recommendation engine can detect all related documents for a given document. If a document is selected from the search results, the engine pulls up all related or similar documents, regardless of the filename or file type. The recommendation intelligence is based on full-text/content-similarity deep-search algorithms. It can even pull up new versions of existing documents that have been edited by your colleagues and saved in completely different locations. You can’t locate these documents with simple search queries on your own. Imagine that you find an old PowerPoint document and you want to see the latest version of the document and its Excel calculation sheet. They might be anywhere on the network, but our recommendation engine detects them instantly.
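
A content-similarity recommendation can be sketched with plain term counts and cosine similarity. This is a generic textbook technique used here as a stand-in; the actual Noggle algorithms are not described in this article. The function names, the sample paths and the pre-extracted texts are assumptions for illustration.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for a piece of extracted text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(selected_text, library, top_n=3):
    """Rank library documents by content similarity to the selected text.

    library: {path: extracted_text}. File names and file types play no role.
    """
    query = term_vector(selected_text)
    scored = [(cosine(query, term_vector(text)), path)
              for path, text in library.items()]
    return [path for score, path in sorted(scored, reverse=True)[:top_n] if score > 0]

# Hypothetical library with already extracted text.
library = {
    "//share/finance/q3_budget_v2.pptx": "budget forecast third quarter revenue plan updated",
    "C:/calc/q3_numbers.xlsx": "revenue forecast third quarter figures",
    "C:/misc/holiday_list.docx": "office holiday calendar",
}
ranked = recommend("budget forecast third quarter revenue plan", library)
```

Because only the text content is compared, the newer presentation and the spreadsheet rank first even though their names and locations have nothing in common with the selected file.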

Please review the following example:


OnlineHelp Settings Map Tab

June 6, 2016/in Cognitive Tools, KnowledgeBase /by LvT

How to fine-tune the KnowledgeMap cognitive AI clustering algorithm

You can specify how the cluster labels should be generated and which labels should be excluded by the algorithm. Please go to the Settings -> Map section.

noggle_settings_map

Strong Cluster Label [Enabled/Disabled]:

This attribute may be useful when certain words appear in most of the input documents (e.g. a company name from the header or footer) and such words dominate the cluster labels. In such cases, enabling strong cluster labels may improve the clusters.

Another useful application of this attribute is when there is a need to generate only very specific clusters, i.e. clusters containing small numbers of documents. This can be achieved by enabling strong cluster labels.

Stopwords:

A stopword is a word that has little meaning by itself. For example, “the”, “a”, “then”, and “towards” are stopwords for all English documents. A stopword can never appear by itself as a cluster label, although it might be used within a label, depending on the stoplabel settings.

Stoplabels:

If a KnowledgeMap label includes one of the stop labels, the label will not appear on the map of clusters produced by Noggle KnowledgeMap.
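
The two rules above can be expressed as a small filter over candidate labels. This is a hedged sketch rather than the setting's real implementation: the function name `filter_labels` is made up, and “includes” is interpreted here as a case-insensitive substring match, which is an assumption.

```python
def filter_labels(candidates, stopwords, stoplabels):
    """Apply the stopword and stoplabel rules to candidate cluster labels.

    - A label that consists of a single stopword is dropped.
    - A label that contains any stoplabel (case-insensitive substring,
      an assumed interpretation) is dropped.
    - Stopwords inside a longer label are allowed.
    """
    stopwords = {w.lower() for w in stopwords}
    stoplabels = [s.lower() for s in stoplabels]
    kept = []
    for label in candidates:
        lowered = label.lower()
        words = lowered.split()
        if len(words) == 1 and words[0] in stopwords:
            continue
        if any(s in lowered for s in stoplabels):
            continue
        kept.append(label)
    return kept

# Hypothetical candidate labels and settings.
candidates = ["the", "Towards the Cloud", "Acme Corp Security", "Access Control"]
kept = filter_labels(
    candidates,
    stopwords={"the", "a", "then", "towards"},
    stoplabels={"Acme Corp"},
)
```

In this example “the” is dropped as a bare stopword, “Acme Corp Security” is dropped because it contains a stoplabel, and “Towards the Cloud” survives because stopwords are only forbidden as standalone labels.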

Cognitive Search Engine: How To Overcome The Knowledge Disconnect

May 30, 2016/in Cognitive Tools, KnowledgeBase /by LvT

How To Overcome The Big Knowledge Disconnect With Cognitive AI: Cognitive Search Engine

Our cognitive search engine with cognitive document retrieval features knocks down barriers between you and your documents. Use our natural and contextual search features that augment users’ experiences via the power of machine-based AI. Plug them in and stop searching – start knowing.

Document Recommendations

The document recommendation engine can detect all related documents for a given document. If a document is selected from the search results, the engine pulls up all related or similar documents from available libraries regardless of the filename or file type. Our recommendation intelligence is based on full-text/content-similarity deep-search algorithms. It can even pull up new versions of existing documents that have been edited by your colleagues and saved in completely different locations. You can’t locate these documents with simple search queries on your own. For example, imagine that you find an old PowerPoint document and you want to see the latest version of the document and its Excel calculation sheet. They might be anywhere on the network, but our recommendation engine detects them instantly.

Cross-Library Search

The managed library-sharing feature enables organizations to make their documents retrievable by approved people through distributed-search functions. With this feature, users can easily and quickly retrieve useful, relevant documents stored elsewhere on the network or on local computers. The cross-library search saves time and helps avoid the high cost of reinventing the wheel when a document exists somewhere else but cannot be located locally. The embedded “request document” function makes knowledge sharing as simple and secure as sending emails. Cross-library searches speed up the retrieval process and make document retrieval a collaborative activity via our cognitive search engine.

Topic Detection and KnowledgeMap Clustering

One search aid that helps information workers to retrieve relevant content from large content libraries is clustered cross-document relationship information. This cognitive search service returns visually enriched content topics for all documents in the current search results. It helps to overcome information overload by organizing collections of documents into clearly labeled, hierarchical, thematic clusters in real time, fully automatically, and without external knowledge bases. Instead of browsing linear search results, the KnowledgeMap is a cognitive, non-supervised search-result visualization tool that presents essential information about the structure of topics within search results. The clustering algorithm scans internal relationships and linguistic patterns among all the documents found. In doing so, it unearths new groups or cross-document relationships that might guide you to new, interesting topic areas that enhance the initial search request. The amount of time users spend trying to make sense of long lists of search results is shortened dramatically. With clearly labelled folders, users can navigate straight to the documents they need and easily skip irrelevant ones.

Topic Exploration Service

With the KnowledgeMap topic clustering engine, query refinement is just a mouse click away. Topic clusters generated by our cognitive KnowledgeMap can help users refine their initial queries and drill down to a specific subject. This cognitive feature allows users to automatically rephrase search queries to pull relevant documents out of the selected topic clusters.

Intelligent Duplicate Filtering

As documents get copied, shared, and reorganized over time, more copies of the same file become available in different folders. These duplicates generate “noise” in your search results and make the results list inconvenient to read and browse. Our duplicate filter is intelligent enough to keep only the version to which you have direct access. For example, a file might be available three times in your libraries: in the library on your local computer, in a network library (e.g., archive), and in a shared library from your colleague. Our intelligent duplicate filter shows you the file to which you have direct access and filters out the duplicate network file and the duplicate file from your colleague. You can always be sure you’re finding the smartest way to access the document without noisy search result listings.
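
The preference for the most directly accessible copy can be sketched as follows. The sketch detects duplicates with an exact content hash, which only catches byte-identical files; how Noggle actually detects duplicates is not stated here. The `ACCESS_RANK` ordering, the tuple layout and the sample paths are assumptions derived from the example above.

```python
import hashlib

# Lower number = more direct access. The ordering is an assumption
# based on the local / network / colleague example.
ACCESS_RANK = {"local": 0, "network": 1, "shared": 2}

def filter_duplicates(hits):
    """Keep one hit per distinct content, preferring the most direct copy.

    hits: list of (path, library_kind, content_bytes) tuples.
    Returns the surviving paths in first-seen order of their content.
    """
    best = {}
    for path, kind, content in hits:
        digest = hashlib.sha256(content).hexdigest()
        current = best.get(digest)
        if current is None or ACCESS_RANK[kind] < ACCESS_RANK[current[1]]:
            best[digest] = (path, kind)
    return [path for path, _ in best.values()]

# Hypothetical hits: the same plan exists in three libraries.
hits = [
    ("//archive/reports/plan.docx", "network", b"project plan v1"),
    ("C:/work/plan.docx", "local", b"project plan v1"),
    ("peer:anna/plan_copy.docx", "shared", b"project plan v1"),
    ("C:/work/budget.xlsx", "local", b"budget 2016"),
]
unique = filter_duplicates(hits)
```

Only the local copy of the plan and the budget file remain in the results list; the network and colleague copies are filtered out.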

Recent Work Linking

This feature extends the recommendation engine to drill down into your recent work. Our recent-work-linking algorithm scans your recently used files (e.g., Word documents and presentations you have recently worked on) and scans all available libraries for similar and related documents. Noggle presents a recommendation list for relevant files in your libraries that are related to your current work. In the blink of an eye, this cognitive search engine feature presents all documents from your libraries that might help you during your work activities.

Intelligent Open Engine

Noggle is not built on absolute storage paths. The proprietary Noggle document fingerprint holds all the content/full-text-based information needed to retrieve a document regardless of file-naming conventions and storage locations. You can move files during their lifecycle, and the “intelligent open” engine for the document fingerprint will always try to locate the document and open it. This cognitive feature attempts to locate files via different mechanisms. First, the absolute file path is tested. Then, a similarity search is performed to locate a duplicate or similar version in your libraries. Finally, if the file cannot be located in your environment, a document request is sent to the file owner. You can always get to the document, whether you have it, it has been moved, or it is part of a shared library. With just one click, the cognitive intelligent open feature guides you to the physical document.
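
The three-step lookup reads naturally as a fallback chain. The sketch below mirrors that order only; the fingerprint fields (`last_known_path`, `content_signature`, `owner`) and the two callbacks are hypothetical names, not the real fingerprint format or API.

```python
import os
import tempfile

def intelligent_open(fingerprint, find_similar, request_from_owner):
    """Resolve a document fingerprint, trying three mechanisms in order.

    Returns (how, location) where how is "path", "similar" or "requested".
    find_similar and request_from_owner are caller-supplied callbacks.
    """
    # 1. Test the stored absolute path.
    path = fingerprint["last_known_path"]
    if os.path.exists(path):
        return ("path", path)
    # 2. Look for a duplicate or similar version in the libraries.
    match = find_similar(fingerprint["content_signature"])
    if match is not None:
        return ("similar", match)
    # 3. Ask the file owner for the document.
    request_from_owner(fingerprint["owner"], fingerprint["content_signature"])
    return ("requested", None)

# Hypothetical fingerprint pointing at a path that no longer exists.
fingerprint = {
    "last_known_path": os.path.join(tempfile.gettempdir(), "noggle-demo-missing", "plan.docx"),
    "content_signature": "sig-123",
    "owner": "anna",
}
requests = []
moved = intelligent_open(fingerprint, lambda sig: "D:/moved/plan.docx",
                         lambda owner, sig: requests.append((owner, sig)))
missing = intelligent_open(fingerprint, lambda sig: None,
                           lambda owner, sig: requests.append((owner, sig)))
```

In the first call the similarity lookup finds the moved file; in the second nothing is found locally, so a request to the owner is recorded.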

Image Text Recognition

The text recognition engine reads text from image files. Optical character recognition (OCR) detects text in an image and extracts the recognized words into an indexed character stream. This feature analyzes images to detect embedded text, generate text, and enable searching. This allows you to scan or take photos of important printed documents and save them in an indexed folder (e.g., simple TIF scans from printed “paper” documents). If these scanned or photographed files are included in a special library, our text recognition makes them retrievable via simple text searches.

Encyclopedia Document Trails

This service allows users to generate topic-specific document trails just by dragging and dropping a document from a library or search result into a Nogglepedia topic. Once you drag and drop a relevant document out of your Noggle library and into a specific Nogglepedia, a proprietary document “fingerprint” is generated. This service isn’t based on moving or sharing the document itself; the document fingerprint holds all the relevant information in an enriched, compressed format. These digital document encyclopedias can be privately shared in the managed Noggle network to empower swarm intelligence, such as research groups collecting fingerprints from private or corporate documents. These fingerprints bundle the available knowledge on special subjects. From each fingerprint in a Nogglepedia, all our cognitive search engine and retrieval services, such as recommendations, can be executed with just a mouse click.

Drop-In Document Linking

This service allows users to drag and drop any available document, such as an email attachment or a local file, into the Noggle application to retrieve related documents. This file might not be part of any indexed library, but Noggle instantly scans the document and performs a concept-based full-text search within your document libraries. Therefore, you can drop any file into the Noggle client application, and this cognitive search engine and document retrieval service will perform full-text concept matching.

Additional License Information:
You need a professional license for the following services:
Shared Cross-Library Search, Intelligent Open/Document Request, Collaborative Encyclopedia Document Trails


What is the NoggleMap or KnowledgeMap?

November 14, 2015/in Cognitive Tools, KnowledgeBase /by LvT

The Noggle “KnowledgeMap,” a search result visualization tool, provides users with essential information about the structure of topics that appear within the search results. The Noggle clustering algorithm scans internal relations and linguistic patterns among all the documents according to how similar they are to the initial search request. This tool can unearth new groups or cross-document relationships, which might guide users to new, interesting areas that build upon their initial search request. Clustering is one of many methods that can be used to make searching collections of documents easier.

We have often heard users demand such clustered cross-document relationship information, likely because they become frustrated with the constantly growing document volume and fragmented data storage solutions they encounter in the cloud and other big data services.

Please review the following video tutorial:

http://www.youtube.com/watch?v=vZdNdJZrpn4

A detailed knowledge base article on our NoggleMap search feature can be found here:

Document Clustering with KnowledgeMaps

November 14, 2015/in Cognitive Tools, KnowledgeBase /by LvT


Cognitive-guided, non-supervised document clustering

NoggleMap Search Document Clustering

One of the most common problems people used to encounter when searching for information is that they could not find documents specifically related to what they were looking for. Nowadays, this task is quite successfully handled by standard search applications.

Thanks to these sorts of search engines, pulling up results has become easy. However, when it comes to explaining the search results or displaying specific details on what sort of results have been returned, users’ options are much more limited. Usually, a search application displays a ranked list of documents and a snippet of their contents. These ranked lists are helpful for document retrieval, but they are a long way from knowledge management. Information about the internal relationships among the documents in the search results is often not provided by standard search algorithms.

Search Document Clustering

“Search result clustering” is defined as an automatic, non-supervised grouping of similar documents in a search hits list returned from a search engine. Clustering is one of many methods that can be used to make searching collections of documents easier.

So, the Noggle “KnowledgeMap,” a search result visualization tool, provides users with essential information about the structure of topics that appear within the search results. Furthermore, the Noggle clustering algorithm scans internal relations and linguistic patterns among all the documents according to how similar they are to the initial search request. This tool can unearth new groups or cross-document relationships, which might guide users to new, interesting areas that build upon their initial search request.

We have often heard users demand such clustered cross-document relationship information, likely because they become frustrated with the constantly growing document volume and fragmented data storage solutions they encounter in the cloud and other big data services.

Problem with ranked search lists

To illustrate the problems with conventionally ranked search result lists, let’s imagine a user who wants to find information about “security” and starts with the simple search term “security.”

First, the user selects peer libraries that might be relevant. In this example, the user has libraries from three different peers. In addition, the user selects six of their own libraries for the search request.

Figure 1: Search results for search term “security” on nine libraries from four different owners

Figure 1 shows that the search covered 27,616 documents and returned the 1,500 top-ranked documents related to “security.” Obviously, this is a very general query that leads to a large number of hits. Because the results draw on a library for “Information Technology,” the majority will be about information security, system security, or security policies.

A determined user who is patient enough to read past the first 100 results should be able to find some hits on topics like “access control” or “service continuity.” However, the problem with ranked lists is that users often have to wade through many irrelevant documents to reach the ones they want.

Grouping results into semantic clusters via document clustering

But what about an interface that groups search results into separate semantic topics, such as network security, data security, access control, service continuity, and so on? And what if these groups were derived automatically from the documents’ own content, not by biased methods where someone defines in advance what might be important?

By generating groups like this, the user will immediately get an overview of what the results contain and should be able to pick out relevant documents with much less effort.

The following figure shows how the NoggleMap feature automatically detects cross-document relations based on linguistic patterns. The left part of the screen shows the clusters and the number of documents related to that cluster. The right panel shows a visual representation of that information.

Figure 2: Clustered search results for “security” via the Noggle KnowledgeMap document clustering service

All 1,500 documents are linked to one or more of these clusters. This way, users don’t need to browse through a ranked list from the top down—they can narrow down the major cluster they are looking for and go from there.

To be helpful, search result clustering must organize similar results into one group. This is the primary requirement for all document clustering algorithms. But in search result clustering, the cluster labels are also extremely important. Each label must accurately and concisely describe the cluster’s contents so that users can decide whether the information is relevant.
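One way to get both readable labels and documents that belong to more than one cluster is to pick the labels first and then attach every matching document to each label. The sketch below does this with two-word phrases shared by at least two hits. It is a simplified illustration under stated assumptions, not the Noggle implementation: the stopword list, the function name `phrase_clusters`, the `min_docs` parameter, and the sample hits are all hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a complete one).
STOPWORDS = {"a", "and", "for", "of", "the", "to", "in", "on"}


def phrase_clusters(hits, min_docs=2):
    """Map each candidate label to the indices of the hits containing it.

    A candidate label is a pair of adjacent words, neither of which is
    a stopword. Only phrases found in at least min_docs hits become
    clusters, and a hit may appear under several labels.
    """
    docs_by_phrase = defaultdict(set)
    for i, text in enumerate(hits):
        tokens = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                docs_by_phrase[f"{first} {second}"].add(i)
    return {
        phrase: sorted(ids)
        for phrase, ids in docs_by_phrase.items()
        if len(ids) >= min_docs
    }


# Made-up hit snippets, for illustration only.
hits = [
    "network security firewall rules",
    "access control for network security",
    "security policy and access control",
    "firewall rules audit",
]
for label, members in phrase_clusters(hits).items():
    print(label, members)
```

In this example, hit 0 lands in both the “network security” and the “firewall rules” cluster, and hit 1 in both “network security” and “access control”. Because the label is a phrase taken directly from the documents, it describes exactly what the cluster members have in common.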

Start with generic search terms first

Since users are often unaware of all their choices in a search, they do not always know the exact phrase they should search for. Thus, starting with a more generic search makes sense. Let the artificial intelligence of the Noggle search engine detect knowledge clusters based on the cross-document linguistic patterns. The visual guide then allows the user to quickly focus on the results of interest by visually selecting the relevant clusters.

This kind of interface for search results is implemented by applying a variety of document clustering techniques to the results returned. This is something that we call the Noggle “KnowledgeMap” and “ClusterSearch” technique.

The user can now select the cluster “Access Control,” browse the relevant documents from the initial request on “security,” and later focus in on the associated documents.

Figure 3: Document list in the security cluster “Access Control” from the overall search results
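Selecting a cluster amounts to filtering the ranked hit list by cluster membership while keeping the original ranking. A minimal sketch, assuming hits are dictionaries with an `id` field and clusters map a label to a list of hit ids (both data shapes are hypothetical and chosen only for this example):

```python
def filter_by_cluster(ranked_hits, clusters, label):
    """Return the hits that belong to the selected cluster, in rank order."""
    members = set(clusters.get(label, []))
    return [hit for hit in ranked_hits if hit["id"] in members]


# Made-up ranked results and cluster assignments, for illustration only.
ranked_hits = [
    {"id": 0, "title": "Network security baseline"},
    {"id": 1, "title": "Access control for network devices"},
    {"id": 2, "title": "Access control policy"},
]
clusters = {"Network Security": [0, 1], "Access Control": [1, 2]}

for hit in filter_by_cluster(ranked_hits, clusters, "Access Control"):
    print(hit["title"])
```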

This makes document retrieval over different libraries and document search spaces much more efficient. When users start with “generic” search terms, Noggle builds clusters for them, and they can then narrow down their area of interest and check the relevant documents there. Using Noggle this way is not just about searching for documents. It is a full, unsupervised knowledge management approach to retrieving the knowledge that matters, without the need to know exact phrases or exactly which documents they appear in.

Video Example

The following live presentation showcases document clustering for the included TED Talk digital library. All maps are built by the Noggle client based on the standard application (2 min.):

http://www.youtube.com/watch?v=YMHxWGLddjE

The NoggleMap feature combines the latest technologies based on text parsing, Microsoft Azure, Apache Lucene, the Carrot2 Project, Noggle pre- and post-processing algorithms, and the Noggle network. Patent pending.