Just an explorer in the threadiverse.

  • 8 Posts
  • 47 Comments
Joined 1 year ago
cake
Cake day: June 4th, 2023

help-circle


  • Like helping to find a bug, discussing about how to setup an application for a certain use case or anything like that? Answering questions on Stack overflow is an example but is that the best way?

    Generally the best way to help out is to do a thing that’s needed and that you can figure out how to do. Your list includes a bunch of good options, and I’ve been thanked for doing all those things at one point or another. Some common growth paths include:

    1. Using the software
    2. Encountering bugs, problems, or small opportunities for improvement.
    3. Discussing those informally in forums and helping people find workarounds.
    4. Identifying some of those issues as common things other things experience as well, so filing bugs for them with clear explanations and links to related forum discussions.
    5. Reading source code to better understand bugs.
    6. Discussing potential fixes in developer bug threads (or in GitHub or whatever).
    7. Submitting small fixes for simple bugs as pull requests.

    Another path might be:

    1. Using the software and reading forums/docs for help.
    2. Answering basic questions on forums, looking to old threads and relevant docs.
    3. Learning about common questions.
    4. Writing blogs or forum posts about common questions.
    5. Submitting improvements to official docs to clarify common areas of confusion.

    There are other paths as well, the main thing is to use a thing so you learn about it and then use that knowledge to make it a little easier for the next person. Good luck!








  • I don’t know what’s up on your case, but I would not jump to the conclusion that it’s impossible to use tailscale with any other VPN in any circumstance.

    Rather, tailscale and Mullvad will now work easily and out of the box. For other VPNs, you may need to do understand the topology and routing of virtual devices and have the technical ability and system permissions to make deep networking changes.

    So I’d expect one can probably find a way for most things to coexist on a Linux server. On a non-rootrr android phone? I’m less confident.



  • I replied to the parent comment here to say that governments HAVE set up CSAM detection services. I linked a review of them in my original comment.

    • They’ve set them up through commercial partnerships with technology companies… but that’s no accident. CSAM fighting orgs don’t have the tech reach of a major tech company so they ask for help there.
    • Those partnerships are limited to major/successful orgs… which makes it hard to participate as an OSS dev. But again, that’s on-purpose as the same access that would empower OSS devs to improve detection would enable CSAM producers to improve evasion. Secrecy is useful in this race, even if it has a high cost.

    Plus with the flurry of hugely privacy-invading or anti-encryption legislation that shows up every few months under the guise of “protecting the children online”, it seems like that should be a top priority for them, right?! Right…?

    This seems like inflammatory bait but I’ll bite once.

    • Improving CSAM detection is absolutely a top priority of these orgs, and in the last 10y the scope and reach of the detection tools they’ve created with partners has expanded in reach from scanning zero images to scanning hundreds of millions or billions of images annually. It’s a fairly massive success story even if it’s nowhere near perfect.
    • Building global internet infrastructure to scan all/most images posted to the internet is itself hugely privacy invading even if it’s for a good cause. Nothing prevents law-makers from coopting such infrastructure for less noble goals once it’s been created. Lemmy is in desperate need of help here, and CSAM detection tools are necessary in some form, but they are also very much scary scary privacy invading tools that are subject to “think of the children” abuse.

  • I’m not sure I follow the suggestion.

    • NCMEC, the US-based organization tasked with fighting CSAM, has already partnered with a list of groups to develop CSAM detection tools. I’ve already linked to an overview of the resulting toolsets in my original comment.
    • The datasets used to develop these tools are private, but that’s not an oversight. The datasets are… well… full of CSAM. Distributing them openly and without restriction would be contrary to NCMEC’s mission and to US law, so they limit the downside by partnering only with serious/capable partners who are able to commit to investing significant resources to developing and long-term maintaining detection tools, and who can sign onerous legal paperwork promising to handle appropriately the access they must be given to otherwise illegal material to do so.
    • CSAM detection tools are necessarily a cat and mouse game of CSAM producers attempting to evade detection vs detection experts trying to improve detection. In such a race, secrecy is a useful… if costly… tool. But as a result, NCMEC requires a certain amount of secrecy from their partners about how the detection tools work and who can run them in what circumstances. The goal of this secrecy is to prevent CSAM producers from developing test suites that allow them to repeatedly test image manipulation strategies that retain visual fidelity but thwart detection techniques.

    All of which is to say…

    … seems like law enforcement would have such a data set and seems they should of course allow tools to be trained on it. seems but who knows? might be worth finding out.)

    Law enforcement DOES have datasets, and DO allow tools to be trained on them… I’ve linked the resulting tools. They do NOT allow randos direct access to the data or tools, which is a necessary precaution to prevent attackers from winning the circumvention race. A Red Hat or Mozilla scale organization might be able to partner with NCMEC or another organization to become a detection tooling partner, but db0, sunaurus, or the Lemmy devs likely cannot without the support of a large technology org with a proven track record or delivering and maintaining successful/impactful technology products. This has the big downside of making a true open-source detection tool more or less impossible… but that’s a well-understood tradeoff that CSAM-fighting orgs are not likely to change as the same access that would empower OSS devs would empower CSAM producers. I’m not sure there’s anything more to find out in this regard.


  • It’s worth considering some commercially developed options as well: https://prostasia.org/blog/csam-filtering-options-compared/

    The Cloudflare tool in particular is freely and widely available: https://blog.cloudflare.com/the-csam-scanning-tool/

    I am no expert, but I’m quite skeptical of db0’s tool:

    • It repurposes a library designed for preventing the creation of synthetic CSAM using stable diffusion. This library is typically used in conjunction with prompt scanning and other inputs into the generation process. When run outside it’s normal context on non-ai images, it will lack all this input context which I speculate reduces its effectiveness relative to the conditions under which it’s tested and developed.
    • AI techniques live and die by the quality of the dataset used to train them. There is not and cannot be an open-source test dataset of CSAM upon which to train such a tool. One can attempt workarounds like extracting features classified and extracted separately like trying to detect coexisting features related to youth (trained from dataset A using non sexualized images including children) and sexuality (trained separately from dataset B using images containing only adult performers)… but the efficacy of open source solutions is going to be hamstrung by the inability to train, test, and assess effectiveness of the open tools. Developers of major commercial CSAM scanners are better able to partner with NCMEC and other groups fighting CSAM to assess the effectiveness of their tools.

    I’m no expert, but my belief is that open tools are likely to be hamstrung permanently compared to the tools developed by big companies and the most effective solutions for Lemmy must integrate big company tools (or gov/nonprofit tools if they exist).

    PS: Really impressed by your response plan. I hope the Lemmy world admins are watching this post, I know you all communicate and collaborate. Disabling image uploads is I think I very effective temporary response until detection and response tooling can be improved.



    1. …create a sidebar with some contents… At least some of these communities have empty sidebars.
    2. Every community needs enough moderators. A single-mod community is not “enough” for a healthy community because things can blow up when you’re asleep or away, even in a community that was previously inactive. If a community member reaches out to offer to join a single-mod team… that contact warrants a response from the existing mod. Not necessarily to immediately accept the offer, but at least to discuss the possibility of extra mod coverage.
    3. It’s just not at all true that if others aren’t posting there’s no moderation work that could be done. Mods of inactive communities can jumpstart them by soliciting feedback on proposed rules, advertising them elsewhere, making scheduled discussion posts, and more. Some of these things can be done by a “regular” community member as well, but if community members try to include mods in discussions about how best to promote the community and the mods ignore them… that’s a sign that the community is abandoned.
    4. If a mod is notified that they’re their community is about to get reassigned and they don’t respond… the community is definitely abandoned.

    All of which is to say, there are lots of way to detect abandoned communities when post volume is low, and the process I highlighted is the standard way to request a takeover.


  • I use k8s at work and have built a k8s cluster in my homelab… but I did not like it. I tore it down, and currently using podman, and don’t think I would go back to k8s (though I would definitely use docker as an alternative to podman and would probably even recommend it over podman for beginners even though I’ve settled on podman for myself).

    1. K8s itself is quite resource-consuming, especially on ram. My homelab is built on old/junk hardware from retired workstations. I don’t want the kubelet itself sucking up half my ram. Things like k3s help with this considerably, but that’s not quite precisely k8s either. If I’m going to start trimming off the parts of k8s I don’t need, I end up going all the way to single-node podman/docker… not the halfway point that is k3s.
    2. If you don’t use hostNetworking, the k8s model of traffic routes only with the cluster except for egress is all pure overhead. It’s totally necessary with you have a thousand engineers slinging services around your cluster, but there’s no benefit to this level fo rigor in service management in a homelab. Here again, the networking in podman/docker is more straightforward and maps better to the stuff I want to do in my homelab.
    3. Podman accepts a subset of k8s resource-yaml as a docker-compose-like config interface. This lets me use my familiarity with k8s configs iny podman setup.

    Overall, the simplicity and lightweight resource consumption of podman/docker are are what I value at home. The extra layers of abstraction and constraints k8s employs are valuable at work, where we have a lot of machines and alot of people that must coordinate effectively… but I don’t have those problems at home and the overhead (compute overhead, conceptual overhead, and config-overhesd) of k8s’ solutions to them is annoying there.


  • The more normal transfer path is to offer to take over a specific community or communities by:

    1. Reaching out to the existing mod and asking to be added to the mod team.
    2. Documenting their lack of response after a few days or a week.
    3. Documenting the failure to abide by Lemmy world moderation guidelines: https://lemmy.world/post/424735 by linking spam or off-topic posts and to communities that lack rules/useful-sidebar-content, etc.
    4. Posting this info in [email protected] and offering to takeover moderation.

    This is better than mass deletion because it keeps whatever small list of existing subscribers and post content intact across the transition. For moderation, Lemmy world admins will get notified of reports and can address anything that violates instance rules.


    1. If a service supports sqlite, I often will use that option. It provides everything a self-hoster needs from a DB with basically no operational overhead.
    2. If I do need a proper RDBMS (because the software I’m using doesn’t support sqlite), I’m going to use…
      1. A single Postgres container.
      2. Configured with multiple logical “databases” (the container for schemas and tables), one DB for each app connecting.

    I do this because I’m always memory constrained and the rdbms is generally the most memory-hungry part of any software stack. By sharing one db-process across all the apps that need it I get the most out of my db cache memory, etc. And by using multiple logical db’s, I get good separation between my apps, and they’re straightforward to migrate to a truly isolated physical DB if needed… but that’s never been needed.


  • … advertisement and push they did on sites like reddit…

    The lemmy world admins advertised on Reddit? Can you link an example?

    … their listing on join-lemmy.org

    Until recently EVERY lemmy instance was listed on join-lemmy.

    And with the name Lemmy.world they did nothing to dissuade anyone from thinking that.

    They run a family of servers under the world tld, including at least mastodon, lemmy, and calckey. They’re all named similarly.

    I also saw nothing from .world not claiming to be the bigger instance(super lemmy)

    They ARE the biggest instance, but that happened organically. It’s not based on any marketing claims from the admin team about being a flagship/super/mega/whatever instance. People just joined, and the admins didn’t stop them (nor should they). It’s not a conspiracy to take over lemmy. It’s just an instance that… until recently… happened to work pretty well when some were struggling.


  • I think the issue is that .world has put itself forward as some sort of super lemmy.

    Citation needed. All the admins of lemmy world ever purported to do was host a well-run general-purpose (aka not topic-oriented) lemmy instance. It was and remains that, and part of being a well-run general purpose instance is managing legal risk when a small subset of the community generates an outsized portion of it.

    Being well run meant that they scaled up and remained operational during the first reddit migration wave. People appreciated that, but continuing to function does not amount to a declaration of being a super lemmy.

    World also has kept signups open through good times, and more recently bad. Other instances at various times shut down signups or put irritating steps and purity tests along the way. Keeping signups open is a pretty bare-minimum bar for running a service though, it is again not a declaration of being a super-lemmy.

    Essentially lemmy world just… kept working (until recently when it has done a pretty poor job of that). I dunno where you found a declaration that lemmy world is a super-lemmy, but it’s not coming from the lemmy world admins, it’s likely randos spouting off.


  • I think a couple things are in play:

    • Very few people consumed these comics as we are… reading each one in sequence. You’d more likely sporadically encounter them in the funnies section of a physical newspaper. Which was a pretty hit/miss proposition to begin with. No one expected every one to be a winner, and people would routinely skip over stuff that didn’t interest them without thinking about it too hard. You’re operating under the assumption that Far Side is a classic, but at the time people would just cruise by and think “that comic is stupid, just like 60% of the other stupid comics on this page”. And folks were pretty happy to have 40% of comics be a bit funny.
    • What made Far Side a classic was not its consistency. Rather, there were a few strips that became cultural phenomena. Basically a handful of hits that were breakout memes of the 80s and 90s. Colleges used to sell t-shirts of the school for the gifted strip with the kid pushing on the door that says pull, which is pretty accessible and one of those breakout hits.
    • Because of those breakout hit strips, some folks got into Larson’s style of humor enough that fewer of his strips were inscrutable to them and he had a lasting market.
    • Other comments point about topical references and those are also a big deal. If someone sees a beans meme with no context 30y from now, it ain’t gonna be funny. But a few weeks ago on lemmy, it was part of a contextual zeitgeist that was more or less about “these idiots will upvote anything, I’m one of the idiots… I’ll upvote this!” and it kind of captured the exuberant excitement of not knowing what lemmy was but wanting it to be something. Similarly, these strips often weren’t intended to last multiple generations. They assumed you were reading the newspaper RIGHT NOW… and so could reference current events very obliquely and still be accessible.

    TLDR: Like a stupid meme, many Larson comics require shared transient context we’re missing now. Some are also just fukin weird, like cow tools. But some were very accessible and became hugely popular. These mega-star strips cemented Far Side’s popularity, and which gave Larson the autonomy to stay weird when he chose. Now we waste time trying to figure out what they meant.