
Populus:DezInV

From Populus DE
Revision as of 17 December 2023, 11:43, by Thk

DezInV is a project to create, search and share a decentralized index and web archive of a focused set of internet sites.

Source code: https://gitlab.com/thkoch/dezinv

Pages in this area:

Vision

T. wants to retrieve an article they read some time ago. Their DezInV instance indexes the websites they read regularly, together with the pages those sites link to, so the article is quickly found again.

T. wants to read an article that was found through DezInV but has since been censored. The article can still be recovered from DezInV's archive.

T.'s DezInV instance is paired with a few instances run by trusted peers. T. can therefore search a considerably larger corpus of sites than their own instance holds.

Requirements

  • DezInV creates and updates a search index over a few hundred domains with around 100,000 pages per domain.
  • DezInV archives crawled pages for later display.
  • DezInV can peer with other instances to mutually share index and archive.
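
To make the three requirements concrete, here is a minimal in-memory sketch in Python. It is not taken from the DezInV source code: the class `Instance`, its methods, and the whole-word tokenizer are hypothetical names and simplifications chosen for illustration, and a real instance would need persistent storage and a proper peering protocol.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercase word tokens (deliberately naive)."""
    return set(re.findall(r"\w+", text.lower()))


class Instance:
    """Hypothetical stand-in for one DezInV instance, held fully in memory."""

    def __init__(self):
        self.archive = {}              # url -> archived page text
        self.index = defaultdict(set)  # term -> set of urls containing it

    def add_page(self, url, text):
        """Archive a crawled page and create or update its index entries."""
        old = self.archive.get(url)
        if old is not None:
            # Drop postings of the previous version before re-indexing.
            for term in tokenize(old):
                self.index[term].discard(url)
        self.archive[url] = text
        for term in tokenize(text):
            self.index[term].add(url)

    def search(self, query):
        """Return the urls whose text contains every term of the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        return set.intersection(*(self.index.get(t, set()) for t in terms))

    def merge_from(self, peer):
        """Copy pages from a trusted peer that this instance lacks."""
        for url, text in peer.archive.items():
            if url not in self.archive:
                self.add_page(url, text)
```

After `a.merge_from(b)`, a search on `a` also covers the pages that `b` archived, and those pages can still be displayed from `a.archive` if the original site later removes them.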

Out of scope:

  • JavaScript execution - it is expected that pages contain their interesting content in plain HTML
  • A dedicated reputation system - reputation relies mainly on the fact that domains have been added manually, so it can be based on the distance from manually added domains.
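
The distance-based idea in the last point can be sketched as a breadth-first search. The page does not say how distance is measured, so this sketch assumes it means the number of link hops between domains; the function name `domain_distances`, the `outlinks` mapping, and the `max_distance` cut-off are hypothetical and not part of DezInV.

```python
from collections import deque


def domain_distances(seed_domains, outlinks, max_distance=3):
    """Breadth-first search over a domain link graph.

    seed_domains: manually added domains, which get distance 0.
    outlinks: dict mapping a domain to the domains it links to.
    max_distance: domains further away than this are not reported.
    Returns a dict mapping each reachable domain to its hop distance.
    """
    distances = {domain: 0 for domain in seed_domains}
    queue = deque(seed_domains)
    while queue:
        current = queue.popleft()
        if distances[current] >= max_distance:
            continue  # do not expand beyond the cut-off
        for target in outlinks.get(current, ()):
            if target not in distances:
                distances[target] = distances[current] + 1
                queue.append(target)
    return distances
```

A lower distance would then stand for higher trust: a domain linked directly from a manually added site gets distance 1, and anything beyond `max_distance` is ignored.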