Notes:
- How to use link analysis to determine relevant pages?
- In-link = endorsement
- Similar to citation analysis
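
A minimal sketch of the "in-link = endorsement" idea: count how many pages link to each page, the way citation analysis counts citations. The edge list and page names are made up for illustration.

```python
from collections import Counter

def in_degree(edges):
    """Count in-links per page from (source, target) link pairs."""
    return Counter(target for _source, target in edges)

# Hypothetical toy link graph: (linking page, linked page).
edges = [("a", "c"), ("b", "c"), ("c", "a")]
counts = in_degree(edges)

# Pages ordered by number of endorsements, most endorsed first.
ranking = [page for page, _count in counts.most_common()]
```

Pages that nobody links to (here "b") simply do not appear in the ranking; a Counter reports 0 for them.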
Notes:
- Anchor text describes target page
- "computer manufacturer" → lenovo.com
- Also text surrounding links
- "Find funny cat pics here"
- May also be abused
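
One possible way to use anchor text, sketched under simple assumptions: index each target page under the words of the links that point to it, so a query can match a page through what others say about it. The function name, the whitespace tokenization, and the target `cats.example` are illustrative assumptions, not from the notes.

```python
from collections import defaultdict

def build_anchor_index(links):
    """Map each term to the set of pages described by that term.

    links: iterable of (link_text, target_url) pairs, where link_text is the
    anchor text, optionally together with the text surrounding the link.
    """
    index = defaultdict(set)
    for link_text, target in links:
        for term in link_text.lower().split():
            index[term].add(target)
    return index

links = [
    ("computer manufacturer", "lenovo.com"),
    ("Find funny cat pics here", "cats.example"),  # hypothetical target
]
index = build_anchor_index(links)
```

Because the text is written by the linking page and not by the target, anyone can attach arbitrary words to a page, which is where the abuse comes from.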
Notes:
- Rank pages with many in-links higher
- PageRank ≈ in-degree
- But harder to manipulate, since rank propagates along links
- One in-link from a high-PageRank site is worth more than many in-links from low-PageRank sites
- PageRank = probability that a random surfer ends up on the page
- Assume random surfer
- Randomly walk web graph
- No out-links → teleport to a random page
- At each step, 15% chance that the surfer opens a random page instead of following a link
- Count how many times a page is visited
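
The random surfer above can be sketched as a direct simulation. This is a rough illustration, not a production PageRank: the graph, the step count, and the seed are assumptions, and real systems usually compute the same quantity by matrix iteration, not by sampling.

```python
import random

def random_surfer(graph, steps=100_000, teleport=0.15, seed=0):
    """Estimate PageRank as the fraction of steps spent on each page.

    graph: dict mapping every page to the list of pages it links to.
    """
    rng = random.Random(seed)
    pages = sorted(graph)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        out_links = graph[current]
        # Teleport on a dead end, or with the 15% chance from the notes.
        if not out_links or rng.random() < teleport:
            current = rng.choice(pages)
        else:
            current = rng.choice(out_links)
    return {page: count / steps for page, count in visits.items()}

# Hypothetical toy graph: page -> pages it links to.
toy_graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
    "e": [],  # no out-links: the surfer always teleports from here
}
ranks = random_surfer(toy_graph)
```

In this toy graph "c" collects three in-links and should come out well ahead of "d", which has none and is only ever reached by teleporting.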
Notes:
- A real user clicks some links more often than others
- Main content vs sidebar / footer
- Anchor text related to query / user intent
- Avoid ads
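
A sketch of how these observations could change the surfer: links are no longer followed uniformly at random but in proportion to a weight. The positions and weight values below are illustrative assumptions; the notes do not give numbers, and a query-dependent anchor text weight is left out for brevity.

```python
# Assumed weight per link position; the values are illustrative only.
POSITION_WEIGHT = {"main": 1.0, "sidebar": 0.3, "footer": 0.1, "ad": 0.0}

def click_probabilities(links):
    """Turn a page's out-links into click probabilities.

    links: list of (target, position) pairs for one page.
    Returns {target: probability}; empty if no link has positive weight.
    """
    weights = {}
    for target, position in links:
        weights[target] = weights.get(target, 0.0) + POSITION_WEIGHT[position]
    total = sum(weights.values())
    if total == 0:
        return {}
    return {target: weight / total for target, weight in weights.items()}

# Hypothetical page with four out-links.
page_links = [
    ("article", "main"),
    ("related", "sidebar"),
    ("about", "footer"),
    ("promo", "ad"),
]
probs = click_probabilities(page_links)
```

A surfer using these probabilities would pass most of a page's rank to its main-content links and none to ads.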
Notes:
- How can the random surfer be improved to provide more realistic results?