• 0 Posts
  • 8 Comments
Joined 11 days ago
Cake day: June 5th, 2025

  • Of course, it depends on how you configure the crawler, e.g. how deep it goes into subdomains and how far it follows links to other domains.

    I crawl 71 sites: my index currently holds 4,576,319 documents (I crawled sites like GitHub too) and occupies just under 14 GB.

    The results depend on several factors, for example whether you run it only locally or in p2p mode. It also has a number of settings, and you can explicitly control what shapes the results down to the smallest detail. I have to be honest, though: I haven't dealt with that at all (especially since it's a bit complex in places), because I first want to expand my own list of pages to crawl, and I only use it locally. I still regularly use DuckDuckGo for searching. However, if you take the time for it, you will get the quality of results you want.

    Ah well, depending on how you set up the crawler, it consumes system resources accordingly. However, you can cap its use of RAM and storage space. The same goes for network utilization, which is pretty important, because otherwise no other connections would be possible besides the crawling xD
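
    The crawl scoping described above (depth limit, subdomains, links to other domains) can be sketched generically in Python. The software isn't named here, so the function and parameter names below are illustrative, not taken from any particular crawler:

    ```python
    from urllib.parse import urlparse

    def in_scope(url, seed_host, allow_subdomains=True, follow_external=False):
        """Decide whether a discovered link belongs to this crawl.

        seed_host: the host the crawl started from, e.g. "example.org".
        allow_subdomains: also accept hosts like "docs.example.org".
        follow_external: accept links to entirely different domains.
        (Hypothetical knobs for illustration only.)
        """
        host = urlparse(url).netloc
        if host == seed_host:
            return True
        if allow_subdomains and host.endswith("." + seed_host):
            return True
        return follow_external

    def next_frontier(links, depth, max_depth, seed_host, **scope):
        """Keep only in-scope links, and stop once the depth limit is hit."""
        if depth >= max_depth:
            return []
        return [u for u in links if in_scope(u, seed_host, **scope)]
    ```

    With the defaults, `next_frontier(links, depth=1, max_depth=3, seed_host="example.org")` would keep links on example.org and its subdomains but drop links to other domains, which is the kind of per-crawl control the comment alludes to.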
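
    The network cap mentioned above amounts to simple bandwidth throttling. A minimal sketch, assuming a single-threaded fetch loop (the class and its `max_bytes_per_sec` knob are hypothetical, not a real setting of any crawler):

    ```python
    import time

    class CrawlThrottle:
        """Cap the crawler's average download bandwidth so other
        connections on the machine stay usable (illustrative only)."""

        def __init__(self, max_bytes_per_sec):
            self.rate = max_bytes_per_sec
            self.start = time.monotonic()
            self.total_bytes = 0

        def account(self, nbytes):
            """Record nbytes just downloaded; sleep if ahead of the cap."""
            self.total_bytes += nbytes
            elapsed = time.monotonic() - self.start
            # Time this much data *should* have taken at the allowed rate.
            expected = self.total_bytes / self.rate
            if expected > elapsed:
                time.sleep(expected - elapsed)
    ```

    Calling `throttle.account(len(chunk))` after each downloaded chunk keeps the long-run rate at or below the cap, the same idea behind the RAM/storage/network limits the comment describes.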




  • Germany simply wants to be the pioneer of surveillance and would prefer to leave even China behind (there are already plans for an alternative to the Chinese social credit system). So I was all the more surprised that a "no" to chat control came, or is coming, from Germany. But surveillance and censorship keep getting worse while people try to make money from the public's personal data… Through lobbying, we also "sovereignly" screwed up the EU cloud so that it is operated only by American companies.
    Well, DNS blocks can be bypassed pretty quickly, but you can already see where all this is heading.