ShawiniganHandshake@sh.itjust.works to Selfhosted@lemmy.world • Experiences with zfs deduplication?
5 days ago
I worked with dedupe products at a previous job. Media files generally deduplicate poorly.
I haven’t read the article but I work with Bloom filters at work sometimes.
Bloom filters basically tell you “this thing might be present” or “this thing is definitely not present”.
If you’re looking for a piece of data in a set of large files, being able to say “this data is definitely not in this file” saves you a bunch of time because you can skip over the file instead of searching through the whole thing just to figure out what you’re looking for isn’t there.
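To make the "might be present" versus "definitely not present" behaviour concrete, here is a minimal sketch of a Bloom filter in Python. The class name, the bit-array size, the number of hashes, and the use of salted SHA-256 are all assumptions chosen for illustration; real implementations usually pick faster hashes and size the filter from the expected item count.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # If any bit is unset, the item was definitely never added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add(b"needle")
print(bf.might_contain(b"needle"))  # True: added items are always reported
```

In the file-search scenario you would keep one filter per file, and only open a file when its filter returns `True` for the data you are looking for.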
I’ve worked in bash. I’ve written tools in bash that ended up having a significant lifetime.
Personally, you lost me at
Database drivers exist for a reason. Shelling out to a database cli interface is full of potential pitfalls that don’t exist in any language with a programmatic interface to the database. Dealing with query parameterization in bash sounds un-fun and that’s table stakes, security-wise.
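For comparison, this is roughly what query parameterization looks like with a database driver. The sketch uses Python's standard `sqlite3` module with an in-memory database; the `users` table and the `find_user` helper are made up for the example.

```python
import sqlite3


def find_user(conn, name):
    # The "?" placeholder makes the driver pass `name` as data,
    # never as part of the SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

hostile = "alice' OR '1'='1"
print(find_user(conn, "alice"))  # [(1, 'alice')]
print(find_user(conn, hostile))  # []: the injection attempt matches no row
```

Getting the same guarantee when building a query string for a CLI client in bash means doing the quoting and escaping yourself.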
Same with making web API calls. Error handling in particular is going to require a lot of boilerplate code that you would get mostly for free in languages like Python or Ruby or Go, especially if there’s an existing library that wraps the API you want to use in native language constructs.
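As a rough illustration of the error handling you get from a language with exceptions, here is a sketch using only Python's standard library. `ApiError` and `fetch_json` are hypothetical names for this example, not part of any particular API wrapper.

```python
import json
import urllib.error
import urllib.request


class ApiError(Exception):
    """Single error type for callers, whatever went wrong underneath."""


def fetch_json(url, timeout=10):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        raise ApiError(f"HTTP {exc.code} from {url}") from exc
    except urllib.error.URLError as exc:
        # DNS failure, refused connection, unsupported scheme, and so on.
        raise ApiError(f"could not reach {url}: {exc.reason}") from exc
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise ApiError(f"invalid JSON from {url}") from exc
```

The bash equivalent has to check the exit code of the HTTP client, the HTTP status, and the JSON parser's exit code separately, and carry the failure reason along by hand.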