Location Tree Map Clustering

The “Location” tab shows folder-based treemaps for space-constrained visualization of relevant file hierarchies.

The size of the cluster shows the relative count of files in that folder based on the search term. The color indicates the relevance of the found documents according to the search term specified. A clicked or selected cluster on the map will filter the overall search results list automatically.
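The aggregation behind such a treemap can be sketched in a few lines of Python. This is only a minimal illustration, not Noggle's implementation: the `hits` list, the `folder_stats` function and the 0-to-1 relevance score are assumptions made for this example. The per-folder count would drive the size of a cluster and the mean relevance would drive its color.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def folder_stats(hits):
    """Group search hits by parent folder.

    `hits` is a list of (file_path, relevance) pairs; both the structure
    and the 0..1 relevance scale are assumptions for this sketch.
    Returns {folder: (file_count, mean_relevance)}.
    """
    grouped = defaultdict(list)
    for path, relevance in hits:
        folder = str(PurePosixPath(path).parent)
        grouped[folder].append(relevance)
    return {
        folder: (len(scores), sum(scores) / len(scores))
        for folder, scores in grouped.items()
    }

# Made-up search hits for illustration.
hits = [
    ("/docs/iot/sensors.pdf", 0.9),
    ("/docs/iot/gateways.pdf", 0.7),
    ("/docs/legal/contract.pdf", 0.2),
]
stats = folder_stats(hits)
# Cluster size comes from the count, color from the mean relevance.
for folder, (count, mean_rel) in sorted(stats.items()):
    print(f"{folder}: {count} file(s), mean relevance {mean_rel:.2f}")
```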

Internet of Things – Trend Research 2017

Internet of Things – Cognitive Clusters

Technology moves fast, and when predicting the future, it can be hard to keep up. Here at Noggle, we believe in analyzing what’s happening right now in order to gain a more accurate gauge of what’s realistically going to come into being over the coming months and years.

To do this, where better to look for the ideas of the future than in the worldwide patents database? Examining the concepts that have been submitted and protected now gives a strong indication of where technology is heading and what innovations are taking place. Of course, not all inventions are created equal, and many patents won’t last the course and make it into our collective future consciousness and culture – this is why we have produced a broad overview of recent patents and picked up on recurring and common aspects and topics. By detecting clusters and averages of prevalent and frequently appearing themes, our findings represent a more likely look at the ideas that may be entering and shaping our lives in the not too distant future.

What’s next for IoT?

We used this approach to analyse upcoming trends in the field of the Internet of Things (IoT) for this year, and beyond. Below, you’ll see a cognitive map which depicts popular phrases, ideas and subjects for IoT in the current patents database.

IoT Trends Map Internet of Things – Bibliometrics

As you can see, proposals centre around “machine to machine” usage, which is the largest cluster in the centre of the map. So IoT is not for humans – IoT will be used by machines and will bring a new generation of robots to life. These robots may communicate with lots of machines to make decisions that humans are not able to make. That’s the core of IoT – IoT is the ‘brain’ for the age of machines. Compare that to our nervous system, where synapses pass chemical signals between neurons. IoT connections are the synapses of machines. This isn’t science fiction anymore; the patent clusters prove it today.

Move forward in the cognitive map and not far away is the cluster “controlling a vehicle”. So the most expected use of IoT is, as anticipated, vehicles (e.g. cars, ships, airplanes) controlled by machines communicating with each other. Think of a “traffic robot”: instead of a police officer managing the flow of cars on the road, the cars communicate with each other and feed a central machine that decides on speeds and stops. Expect this kind of smart traffic management to show up in all areas during the next 1-3 years.

We can then group these items together further. In the following visual, you’ll see how these clusters relate to each other with color highlighting.

IoT Trend Clusters Grouped

By connecting these subjects, we can clearly see that immediate technological growth is also focused on powering and monitoring our living spaces, using intelligent switches and sensors for lights and water flows.

Discover trends at a glance

This graphic visualisation of submission patterns allows us to identify the following key trends for IoT:

1. Machine-To-Machine – Next generation robots:
Named clusters for ‘M2M’ with communicating ‘nodes’, including intelligent and connecting ‘chips’. Watch out for new robot-like assistants, e.g. Alexa and Siri communicating with lots of devices in the years to come.

2. Vehicles – Say “Hello” to Smart Traffic:
Named clusters include ‘monitored vehicle’, ‘vehicle terminal’, ‘controlling a vehicle’ and ‘detects a vehicle’. So you won’t drive alone anymore.

3. Lighting & Switch management – The Smart Home becoming the new normal:
Intelligent devices used to control lighting, with clusters such as ‘light sensor’, ‘lighting modules’, ‘lighting circuit’, ‘intelligent lighting’, ‘lamp light’ and phrases such as ‘smart home’, ‘smart home services’ and ‘intelligent switches.’

4. Water management:
Clusters show interest in areas of ‘intelligent water’, ‘water pipe’, ‘controls water’ and ‘equipped with the water.’

5. Data explosion in production processes:
Automation looks set to gain momentum in 2017, with clusters outlining ‘production data’, ‘production process’ and ‘production costs.’

Finally, we can expect to see massive growth in machine-to-machine data communication, with new cloud-based management and integration platforms for IoT coming into real-world application in 2017 and an associated influx of new software for managing IoT devices in production processes.

Creating cognitive clusters

Already, our smart search and cognitive clustering have provided us with valuable insight into the ideas that are being developed and registered in the world of IoT, some of which will eventually come to fruition and affect our daily lives. But how did we do it?

Any Noggle user can create similar intelligently curated trend diagrams – on this occasion, we simply issued a search request to the worldwide patent database on our chosen subject of Internet of Things. (A search could also be run on one of Noggle’s other third-party databases, such as our comprehensive listing of TED Talks, or open access science articles and research journals.)

We used a very generic and broad request in this example; you could also start with more specific terms to get more specific clusters. Our request produced over 6,000 patent results, allowing us to then run the ‘clustering’ algorithm – our artificial intelligence text processing automatically scans all of these search responses for similarities, in a manner which is not biased towards any specific clustering output. Within 20 seconds, we could see common ideas organised together in a way that is both manageable and interesting to evaluate. From a conceptual view, it is a way of bringing the approach of scientometrics to real life on every desktop. Scientometrics is no longer a theoretical topic for experts with huge machine power – everyone can start doing trend research today.
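To give a feel for what unbiased grouping means, here is a much-simplified stand-in in Python. It is not the Noggle algorithm: it merely groups texts by the two-word phrases they share, so the cluster labels emerge from the documents themselves rather than from a predefined list. The function name `phrase_clusters`, the `min_docs` threshold and the sample abstracts are all assumptions made for this sketch.

```python
import re
from collections import defaultdict

def phrase_clusters(docs, min_docs=2):
    """Group documents by shared two-word phrases (bigrams).

    A document may land in several clusters. No labels are defined up
    front; they emerge from the text itself. Illustrative only.
    """
    clusters = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(words, words[1:]):
            clusters[f"{first} {second}"].add(doc_id)
    # Keep only phrases shared by at least `min_docs` documents.
    return {
        phrase: sorted(ids)
        for phrase, ids in clusters.items()
        if len(ids) >= min_docs
    }

# Invented sample abstracts, not real patent texts.
abstracts = [
    "Method for controlling a vehicle via machine to machine messages",
    "Intelligent lighting module with light sensor",
    "Apparatus for controlling a vehicle terminal",
    "Light sensor for intelligent lighting in a smart home",
]
clusters = phrase_clusters(abstracts)
for phrase, ids in sorted(clusters.items(), key=lambda kv: -len(kv[1])):
    print(phrase, ids)
```

Note that this toy version also produces weak labels such as “controlling a”; a real system would need additional filtering and far more linguistic processing.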

But the map is more than just an image – we can delve deeper into each cluster to examine the individual patents that have been grouped together. To explore the collection for yourself, Noggle clients can browse the IoT cognitive cluster map discussed within this article at:

Increase your knowledge with cognitive features from Noggle.

Noggle knows that you want to spend your precious time learning and creating value – not searching for files. Our system creates a secure peer network that syncs and connects disparate locations, so that locating and retrieving files is now the work of moments, not minutes. Search and share new depths of content, and easily collaborate on inspiring ways of working. To see how Noggle could make a difference to the way that you manage the files that matter, install our free trial.

Watch the making-of in this 100-second screen recording:

Or check out the final infographic based on this research:

IoT Trends Infographic

Download material

Pictures, references and .pdf versions for download, sharing and further research

How-To: Maps Of The World’s Digital Knowledge

Digital Knowledge

If we could join the dots between all the research articles that have been published digitally, what would happen?

Academics have already suggested that if we could only make the right connections between all the pieces of digital knowledge already available, we could tackle the most pressing questions facing society.

It is not about generating more and more content. It is about connecting the dots between pieces of material already in existence. The problem is not having digital knowledge “somewhere”. Our problem is retrieving knowledge when we need it.

Examples are already available of experts identifying key information about diseases like Alzheimer’s by data mining relevant literature. [1]

Stop talking! Let’s start putting Big Data and Text Mining into practice

You’ve already come across the buzzwords ‘big data’ and ‘text mining,’ right? But do you have this technology on your desktop ready to use? I bet not. At least, only a few of us will.

We can change that. I’ve started to produce an application which brings text mining, indexing and cognitive clustering right to your fingertips. I’ve kept it simple. Its “Google-like” interface with cognitive technologies can analyze private and deep-web content sources.

The technology is just one side of the story. The other side is the ability to analyze personal, private knowledge sources as well as external content libraries.

We need access to an entire body of knowledge, across all content sources

Nowadays, we want easy access to an entire body of domain knowledge stored digitally. Unfortunately, external publishers’ collections represent knowledge silos, because nobody wants to blend together different publishers.

So let’s move forward and join the dots.

The goal is to solve the following two problems:

A. Unify the search experience across different content sources

The good news is that many publishers nowadays offer open access to their content via APIs. The bad news is that they all look different.

I want to unify these different technical API access points. This will bring all content into one simplified search front-end to enable a search for term-based knowledge domains. Be it Patents, Open Access Science, IEEE or TEDTalks or …

Together with private knowledge sources from different storage locations, this would create a cool search experience.
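One way to picture this unification is a thin adapter per content source that translates each source's response into a common shape. The Python sketch below is purely illustrative: `Hit`, `unified_search` and the two fake adapters are hypothetical names, no real publisher API is called, and the assumption that every adapter delivers a score normalized to 0..1 is mine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Hit:
    source: str
    title: str
    score: float  # assumed to be normalized to 0..1 by each adapter

# An adapter takes a query string and returns hits in the common shape.
Adapter = Callable[[str], List[Hit]]

def unified_search(query: str, adapters: Dict[str, Adapter]) -> List[Hit]:
    """Query every registered source and merge the hits into one ranked list."""
    merged: List[Hit] = []
    for adapter in adapters.values():
        merged.extend(adapter(query))
    return sorted(merged, key=lambda hit: hit.score, reverse=True)

# Stand-in adapters; real ones would call each publisher's API and
# translate its response format into `Hit` objects.
def fake_patents(query: str) -> List[Hit]:
    return [Hit("patents", f"Patent about {query}", 0.8)]

def fake_talks(query: str) -> List[Hit]:
    return [Hit("talks", f"Talk about {query}", 0.9)]

results = unified_search("drones", {"patents": fake_patents, "talks": fake_talks})
for hit in results:
    print(hit.source, hit.title, hit.score)
```

The search front-end then only ever deals with `Hit` objects, so adding another source means writing one more adapter rather than changing the front-end.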

B. Provide cognitive guided visual maps to explore the results

Often, so much content is returned in response to generic search terms that we end up having to browse endless listings. But browsing linear listings is not a way of learning our brain can manage efficiently. Confronting hit after hit in a list that never ends is not how our brain works best. Our brain works more associatively. We need different forms to visualize the search result listings.

Forget about boring search result listings – use KnowledgeMaps

What would you think of a search result visualization tool that provides essential information about the structure of topics within the search results? Let’s call it a “KnowledgeMap” of similar terms in the documents from your initial search.

And it does so in a visual way like how our brain works – not with pure “listings.”

It works like this: A clustering algorithm scans internal relations and linguistic patterns among documents according to how similar they are to the initial search request. Then it presents you with a visual map of these clusters and documents. You can now unearth new groups or cross-document relationships, which might guide you to new, interesting areas that build upon the initial search request.

Example 1: TED Talks – Predictions and future projections

The following infographic-like knowledge map was created by searching TED Talks for future projections and predictions. This allows you to browse 500 TED Talk predictions in clusters like “Future Energy,” “Social Change,” “Education,” or “Medical Research”:

TED Talks

Picture 1: KnowledgeMap of Clusters for TED Talks Predictions [2]

  • The Technology cluster includes a talk by Nicholas Negroponte, “5 predictions, from 1984.”
  • The Social Change cluster ranks the talk “The future of money” about how cryptocurrencies will change the banking landscape as #1.
  • The Politics cluster contains the top ranked talk “A prediction for the future of Iran,” which is based on a mathematical analysis for predicting human events.

These clusters have been auto-generated based on cognitive analysis and each talk is listed in the Noggle KnowledgeMap browser with a link to the original TED website.

Isn’t it a beautiful map of 500 predictions from the world’s most inspiring leaders?

Example 2: How drones are changing our lives

While there is a lot of “social noise” about the “airborne fulfillment centers” patent from Amazon [3], it is not the only patent in that area. But it will be hard to find the others by just scrolling through large listings in patent databases.

The following infographic-like knowledge map was created by searching the US patent library for “UAV and drone” with additional cognitive clustering based on the search result, which contained over 350 patents.

The KnowledgeMap spotted a cluster with 20 patents on the subject of “delivery” via drones!

Picture 2: KnowledgeMap of Clusters of US Patents on UAVs and Drones [4]

You can now unearth new groups or cross-document relationships, which may guide you to new, interesting areas that build upon the initial search request.

In milliseconds, thousands of documents located for the initial request are analyzed to build cognitive guided clusters. In addition, a new visual search experience is created by using KnowledgeMaps to present and browse the retrieved documents.

Generate stunning knowledge maps on your own

Now the fast and final part of the story: let’s connect the unified search across different content sources with the stunning KnowledgeMap feature. There it is, right at your fingertips… Generate stunning maps of the world’s digital knowledge by yourself.

Whether it’s patents, inspiring TED Talks, or open-access science articles, it is all now in front of you. Now you can discover links that could help you tackle whatever question or issue you can think of, in whatever area you choose.

Final 4 How-To steps to do it on your own

1.    Browse available KnowledgeMaps at

2.    Download and install the free application via

3.    Create or select libraries to execute a research request

4.    Generate cognitive clusters and maps on your own

The creative potential of this technology offers new ways of research and digital knowledge discovery. The produced maps are open to share, so you can publish the retrieved KnowledgeMaps and share them with your teams.

Now, share the news and let’s start to make the right connections between all the pieces of information to tackle the most pressing questions facing society!

Happy Knowledge Mapping!

Yours, Lars von Thienen

OnlineHelp Settings Map Tab

How to fine-tune the KnowledgeMap cognitive AI clustering algorithm

You can specify how the cluster labels should be generated and which labels should be excluded by the algorithm. Please go to the Settings -> Map section.


Strong Cluster Label [Enabled/Disabled]:

This attribute may be useful when certain words appear in most of the input documents (e.g. a company name from a header or footer) and such words dominate the cluster labels. In such a case, enabling strong cluster labels may improve the clusters.

Another useful application of this attribute is when there is a need to generate only very specific clusters, i.e. clusters containing small numbers of documents. This can be achieved by enabling strong cluster labels.


A stopword is a word that has little meaning by itself. For example, “the,” “a,” “then,” and “towards” are stopwords for all English documents. A stopword can never appear by itself as a cluster label, although it might be used within a label, depending on the stop label settings.


If a KnowledgeMap label includes one of the stop labels, the label will not appear on the map of clusters produced by Noggle KnowledgeMap.
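The two rules above can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the actual Noggle logic: the function name `visible_labels`, the sample stop label “acme corp” and the use of plain case-insensitive substring matching are all made up for this example.

```python
STOPWORDS = {"the", "a", "then", "towards"}   # examples from the text above
STOP_LABELS = {"acme corp"}                   # hypothetical stop label

def visible_labels(labels, stopwords=STOPWORDS, stop_labels=STOP_LABELS):
    """Return the labels that would still be shown on the map."""
    kept = []
    for label in labels:
        lowered = label.lower()
        # Rule 1: a stopword may not stand alone as a label.
        if lowered in stopwords:
            continue
        # Rule 2: a label containing a stop label is hidden.
        # Plain substring matching is an assumption of this sketch.
        if any(stop in lowered for stop in stop_labels):
            continue
        kept.append(label)
    return kept

candidates = ["The", "Towards the Cloud", "Acme Corp Security", "Access Control"]
print(visible_labels(candidates))
```

In this sketch “Towards the Cloud” survives because stopwords are only rejected when they stand alone, while “Acme Corp Security” is hidden because it contains a stop label.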

What is the NoggleMap or KnowledgeMap?

The Noggle “KnowledgeMap,” a search result visualization tool, provides users with essential information about the structure of topics that appear within the search results. The Noggle clustering algorithm scans internal relations and linguistic patterns among all the documents according to how similar they are to the initial search request. This tool can unearth new groups or cross-document relationships, which might guide users to new, interesting areas that build upon their initial search request. Clustering is one of many methods that can be used to make searching collections of documents easier.

We have often heard users demand such clustered cross-document relationship information, likely because they become frustrated with the constantly growing document volume and fragmented data storage solutions they encounter in the cloud and other big data services.

Please review the following video tutorial:


A detailed knowledge base article on our NoggleMap search feature can be found here:

Document Clustering with KnowledgeMaps


Cognitive-guided, non-supervised document clustering

NoggleMap Search Document Clustering

One of the most common problems people used to encounter when searching for information is that they could not find documents specifically related to what they were looking for. Nowadays, this task is quite successfully handled by standard search applications.

Thanks to these sorts of search engines, pulling up results has become easy. However, when it comes to explaining the search results or displaying specific details on what sort of results have been returned, users’ options are much more limited. Usually, a search application displays a ranked list of documents and a snippet of their contents. These ranked lists are helpful for document retrieval, but a long way from knowledge management. Information about the internal relationships among the documents in the search results is often not provided by standard search algorithms.

Search Document Clustering

“Search result clustering” is defined as an automatic, non-supervised grouping of similar documents in a search hits list returned from a search engine. Clustering is one of many methods that can be used to make searching collections of documents easier.

So, the Noggle “KnowledgeMap,” a search result visualization tool, provides users with essential information about the structure of topics that appear within the search results. Furthermore, the Noggle clustering algorithm scans internal relations and linguistic patterns among all the documents according to how similar they are to the initial search request. This tool can unearth new groups or cross-document relationships, which might guide users to new, interesting areas that build upon their initial search request.

We have often heard users demand such clustered cross-document relationship information, likely because they become frustrated with the constantly growing document volume and fragmented data storage solutions they encounter in the cloud and other big data services.

Problem with ranked search lists

To illustrate the problems with conventionally ranked search result lists, let’s imagine a user wants to find information about “security.” Therefore, he or she starts with the simple search term “security.”

First, the user selects peer libraries that might be relevant. In this example, the user has libraries from three different peers. In addition, the user selects six of his or her own libraries to perform the search request.


Figure 1: Search results for search term “security” on nine libraries from four different owners

Figure 1 shows that the search included 27,616 documents and returned 1,500 top-ranked documents related to “security.” Obviously, this is a very general query that leads to a large number of hits. Because the results draw on a library for “Information Technology,” the majority will be about information security, system security, or security policies.

A determined user patient enough to sort through results ranking 100 or lower should be able to find some hits on topics like “access control” or “service continuity.” However, one problem with ranked lists is that sometimes users need to wade through irrelevant documents to get to the ones they want.

Grouping results into semantic clusters via document clustering

But what about an interface that groups search results into separate semantic topics? Like network security, data security, access control, service continuity, and so on? And what if these groups were decided automatically from their own internal content—not by biased methods where someone defines what might be important?

By generating groups like this, the user will immediately get an overview of what the results contain and should be able to pick out relevant documents with much less effort.

The following figure shows how the NoggleMap feature automatically detects cross-document relations based on linguistic patterns. The left part of the screen shows the clusters and the number of documents related to that cluster. The right panel shows a visual representation of that information.

Document Clustering Search Results Security

Figure 2: Clustered search results for “security” via the Noggle KnowledgeMap document clustering service

All 1,500 documents are linked to one or more of these clusters. This way, users don’t need to browse through a ranked list from the top down—they can narrow down the major cluster they are looking for and go from there.

In order to be helpful, search result clustering must organize similar results into one group. This is the primary requirement for all document clustering algorithms. But in search result clustering, the cluster labels are also extremely important. The program must accurately and concisely describe the cluster’s contents so that users can decide if the information is relevant.

Start with generic search terms first

Since users are often unaware of all their choices in a search, they do not always know the exact phrase they should search for. Thus, starting with a more generic search makes sense. Let the artificial intelligence of the Noggle search engine detect knowledge clusters based on the cross-document linguistic patterns. The visual guide then allows the user to quickly focus on the results of interest by visually selecting the relevant clusters.

This kind of interface for search results is implemented by applying a variety of document clustering techniques to the results returned. This is something that we call the Noggle “KnowledgeMap” and “ClusterSearch” technique.

The user can now select the cluster “Access Control”, browse the relevant documents from the initial request on “security”, and then focus in on the associated documents.

Document Clustering
Figure 3: Document list in the security cluster “Access Control” from the overall search results
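Conceptually, selecting a cluster amounts to filtering the ranked result list by cluster membership while keeping the original ranking. The Python sketch below illustrates that idea only; the document names, scores, cluster contents and the `documents_in` helper are invented for this example and do not reflect Noggle internals.

```python
# Ranked search results: (document, score). Names and scores are made up.
ranked = [
    ("firewall-policy.docx", 0.95),
    ("badge-access-rules.pdf", 0.80),
    ("backup-plan.pdf", 0.60),
    ("vpn-login-guide.pdf", 0.55),
]

# A document may belong to more than one cluster.
clusters = {
    "Network Security": {"firewall-policy.docx", "vpn-login-guide.pdf"},
    "Access Control": {"badge-access-rules.pdf", "vpn-login-guide.pdf"},
    "Service Continuity": {"backup-plan.pdf"},
}

def documents_in(cluster_label, ranked_hits, cluster_map):
    """Keep only hits in the selected cluster, preserving the ranking."""
    members = cluster_map[cluster_label]
    return [(doc, score) for doc, score in ranked_hits if doc in members]

selected = documents_in("Access Control", ranked, clusters)
for doc, score in selected:
    print(doc, score)
```

Because membership is stored per cluster, a document such as the VPN guide can show up under both “Network Security” and “Access Control”, in line with documents being linked to one or more clusters.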

This makes document retrieval over different libraries and document search spaces much more efficient. By using “generic” search terms first, Noggle builds clusters for users, who can then narrow down their area of interest and check relevant documents there. Using Noggle this way is not just about searching for documents. Ultimately, it is a full, non-supervised knowledge management approach to retrieving knowledge that matters, without the need to know exact phrases and exactly which documents they appear in.

Video Example

The following live presentation showcases document clustering for the included TED Talks digital library. All maps are built by the Noggle client based on the standard application (2 min.):


The NoggleMap feature combines the latest technologies based on text parsing, Microsoft Azure, Apache Lucene, the Carrot2 project, Noggle pre- and post-processing algorithms and the Noggle network. Patent pending.

Further Reading: