
Looking Dark Web Crawling System


9. Focused Dark Web Crawling System

Based on the design, UA’s AI Lab implemented a focused crawler for Dark Web forums. The system comprises four major components:

i. Forum Identification: categorizes and lists the extremist forums to spider.
ii. Forum Preprocessing: handles accessibility to the listed forums, crawl space traversal issues, and wrapper generation.
iii. Forum Spidering: consists of an incremental crawler and a recall improvement mechanism.
iv. Forum Storage and Analysis: archives the collected data and analyzes it.

Fig. 1 Dark Web forum crawling system design

10. Dark Web Analysis and Visualization

Dark web analysis and visualization …

Fig. 2 Diagrammatic representation of block model clustering

10.1.2. Spring Embedding

Graphs can be used to represent relational information, which is why it is often useful to draw a graph in order to visualize that information. Spring embedders are force-directed layout algorithms in which all edges are drawn as straight lines. A force-directed layout algorithm models the input graph as a system of forces and tries to find a minimum-energy configuration of this system. Essentially, the aim is to produce an aesthetically pleasing drawing of the graph for easy visualization.

10.2. Content Analysis

Content analysis is based on a particular set of keywords and/or targets. For example, to analyze the contents of terrorist and extremist websites, the content categories include recruiting, training, sharing ideology, communication, propaganda, etc. Specialized computer programs are developed to help identify the selected content categories automatically. UA’s AI Lab has built a systematic procedure for collecting and monitoring Dark Web contents, using a Dark Web Attribute System to enable quantitative analysis.

Fig. 3 Dark Web collection and content analysis framework

10.3. Web Metrics Analysis

Web metrics analysis scrutinizes the technical sophistication, media richness, and web interactivity of targeted websites. The end goal is to determine the level of “web-savviness” of the targeted individuals by examining their technical …
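
To make the spring embedding idea from section 10.1.2 more concrete, here is a minimal sketch of a force-directed layout in Python. The essay does not say which spring embedder is used, so the force formulas below (repulsion k²/d between every pair of nodes, attraction d²/k along each edge, with a linearly cooling step limit) are an assumption borrowed from the Fruchterman-Reingold style of algorithm. The function name `spring_layout` and its parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, k=1.0, t0=0.1, seed=0):
    """Toy force-directed layout (illustrative assumption, not the essay's method).

    nodes: list of hashable node ids; edges: list of (u, v) pairs.
    Returns a dict mapping each node to an (x, y) position.
    """
    rng = random.Random(seed)
    # Start from random positions in the square [-1, 1] x [-1, 1].
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}

    for it in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f

        # Attraction: each edge pulls its endpoints together with force d^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f

        # Cooling: the maximum move per iteration shrinks linearly toward zero,
        # so the layout settles near a low-energy configuration.
        t = t0 * (1.0 - it / iterations)
        for n in nodes:
            dx, dy = disp[n]
            d = math.hypot(dx, dy) or 1e-9
            move = min(d, t)
            pos[n][0] += dx / d * move
            pos[n][1] += dy / d * move

    return {n: (p[0], p[1]) for n, p in pos.items()}


if __name__ == "__main__":
    # Hypothetical three-node path graph; positions are printed, not asserted.
    layout = spring_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
    for node, (x, y) in layout.items():
        print(node, round(x, 3), round(y, 3))
```

Under these assumed forces, two nodes joined by an edge settle at a distance of roughly k, where attraction and repulsion balance, while unconnected nodes keep drifting apart. Drawing each edge as a straight line between the returned positions gives the kind of picture the section describes.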
