Title
Towards a formal concept analysis approach to exploring communities on the World Wide Web
Abstract
An interesting problem associated with the World Wide Web (Web) is the definition and delineation of so-called Web communities. The Web can be characterized as a directed graph whose nodes represent Web pages and whose edges represent hyperlinks. An authority is a page that is linked to by high-quality hubs, while a hub is a page that links to high-quality authorities. A Web community is a highly interconnected aggregate of hubs and authorities. We define a community core to be a maximally connected bipartite subgraph of the Web graph. We observe that the Web subgraph can be viewed as a formal context and that Web communities can be modeled by formal concepts. Additionally, the notions of hub and authority are captured by the extent and intent, respectively, of a concept. Though Formal Concept Analysis (FCA) has previously been applied to the Web, none of the FCA-based approaches that we are aware of considers the link structure of the Web pages. We utilize notions from FCA to explore the community structure of the Web graph. We discuss the problem of utilizing this structure to locate and organize communities in the form of a knowledge base built from the resulting concept lattice, and we discuss methods to reduce the complexity of the knowledge base by coalescing similar Web communities. We present preliminary experimental results obtained from real Web data that demonstrate the usefulness of FCA for improving Web search.
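The correspondence described in the abstract (link graph as formal context, hubs as extent, authorities as intent) can be illustrated with a minimal sketch. The toy graph, the page names `h1`..`h3` and `a1`..`a3`, and the helper functions below are hypothetical and are not taken from the paper. The enumeration is a brute-force closure over all subsets of linking pages, which is only practical for very small graphs and is not the authors' algorithm.

```python
from itertools import combinations

# Hypothetical toy Web graph: page -> set of pages it links to.
# Linking pages play the role of FCA objects, linked-to pages the attributes.
links = {
    "h1": {"a1", "a2"},
    "h2": {"a1", "a2"},
    "h3": {"a2", "a3"},
}

def common_targets(pages, links, all_targets):
    """Intent: the targets linked to by every page in `pages`."""
    result = set(all_targets)
    for page in pages:
        result &= links[page]
    return result

def common_sources(targets, links):
    """Extent: the pages that link to every target in `targets`."""
    return {page for page, out in links.items() if targets <= out}

def formal_concepts(links):
    """Enumerate all (extent, intent) pairs by closing every subset of pages.

    Exponential in the number of linking pages; for illustration only.
    """
    all_targets = set().union(*links.values())
    pages = sorted(links)
    concepts = set()
    for size in range(len(pages) + 1):
        for subset in combinations(pages, size):
            intent = common_targets(subset, links, all_targets)
            extent = common_sources(intent, links)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

concepts = formal_concepts(links)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "->", sorted(intent))
```

Each concept with a non-empty extent and a non-empty intent is a complete bipartite subgraph that cannot be enlarged on either side, which is one reading of the community core the abstract defines: the extent lists candidate hubs and the intent lists the authorities they all link to.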
Year: 2005
DOI: 10.1007/978-3-540-32262-7_3
Venue: ICFCA
Keywords: formal concept analysis approach, community core, web graph, real web data, web search, web community, world wide web, knowledge base, community structure, similar web community, web page, directed graph, web pages, formal concept analysis
Field: World Wide Web, Web page, Computer science, Web standards, Data Web, Semantic Web, Web modeling, Web navigation, Social Semantic Web, Web service
DocType: Conference
Volume: 3403
ISSN: 0302-9743
ISBN: 3-540-24525-1
Citations: 17
PageRank: 0.89
References: 28
Authors: 2
Name
Order
Citations
PageRank
Jayson E. Rome1211.39
Robert M. Haralick2102622605.93