Populus:DezInV
DezInV is a project to create, search and share a decentralized index and web archive of a focused set of internet sites.
Source code: ~~https://gitlab.com/thkoch/dezinv~~ https://github.com/thkoch2001/lara
Pages in this area:
Vision
T. wants to retrieve an article they read some time ago. Their DezInV instance indexes the websites they regularly read, along with those sites' outlinks, so the article is quickly found again.
T. wants to read an article that they found in DezInV but that has since been censored. The article can still be recovered from DezInV's archive.
T.'s DezInV instance is paired with a few other instances run by trusted peers, so T. can search a considerably large corpus of sites.
Requirements
- DezInV creates and updates a search index over a few hundred domains with around 100,000 pages per domain.
- DezInV archives crawled pages for later display.
- DezInV can peer with other instances so that they share their indexes and archives with each other.
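
The three requirements above can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions, not the project's implementation: the `Instance` class, its method names and the example URLs are all hypothetical, the "archive" is a plain dictionary, and peering is reduced to copying pages that the local instance does not yet have.

```python
import re
from collections import defaultdict


def terms_of(text):
    """Split text into a set of lowercase word terms."""
    return set(re.findall(r"\w+", text.lower()))


class Instance:
    """Hypothetical stand-in for a DezInV instance (illustration only)."""

    def __init__(self):
        self.archive = {}              # url -> archived page text
        self.index = defaultdict(set)  # term -> urls containing the term

    def add_page(self, url, text):
        """Archive a crawled page and (re)index it."""
        self.remove_page(url)          # an update replaces the old entry
        self.archive[url] = text
        for term in terms_of(text):
            self.index[term].add(url)

    def remove_page(self, url):
        """Drop a page from the archive and the index, if present."""
        old = self.archive.pop(url, None)
        if old is None:
            return
        for term in terms_of(old):
            self.index[term].discard(url)

    def search(self, query):
        """Return the urls whose text contains every query term."""
        terms = terms_of(query)
        if not terms:
            return set()
        hits = [self.index.get(term, set()) for term in terms]
        return set.intersection(*hits)

    def merge_from(self, peer):
        """Copy pages from a trusted peer that this instance lacks."""
        for url, text in peer.archive.items():
            if url not in self.archive:
                self.add_page(url, text)


# Usage with made-up pages: two instances, then one merges from the other.
mine = Instance()
mine.add_page("https://a.example/1", "Decentralized index")
peer = Instance()
peer.add_page("https://b.example/2", "Web archive index")
mine.merge_from(peer)
print(sorted(mine.search("index")))
```

A real instance would need persistent storage and a way to resolve conflicting versions of the same page between peers; both are left out here.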
Out of scope:
- JavaScript execution - pages are expected to contain their interesting content in plain HTML.
- Reputation scoring - reputation relies mainly on the fact that domains have been provided manually, so it can be based on the distance from the manually added domains.
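
One possible reading of "distance from manually added domains" is the smallest number of link hops from any manually added domain. The sketch below computes that with a breadth-first search. It assumes the hop-count interpretation, and the function name `link_distances`, the `outlinks` mapping and the example domains are all hypothetical.

```python
from collections import deque


def link_distances(seeds, outlinks):
    """Fewest link hops from any seed domain to each reachable domain.

    seeds:    manually added domains (distance 0)
    outlinks: mapping of domain -> domains it links to
    """
    distance = {domain: 0 for domain in seeds}
    queue = deque(distance)
    while queue:
        domain = queue.popleft()
        for target in outlinks.get(domain, ()):
            if target not in distance:  # first visit is the shortest path
                distance[target] = distance[domain] + 1
                queue.append(target)
    return distance


# Made-up link graph: a.example was added manually.
outlinks = {
    "a.example": ["b.example"],
    "b.example": ["c.example", "a.example"],
}
distances = link_distances(["a.example"], outlinks)
print(distances)
```

Domains that cannot be reached from any manually added domain do not appear in the result, which a caller could treat as "no reputation information".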